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Systems  of  Units.  Some  Important  Conversion  Factors 


The  most  important  systems  of  units  are  shown  in  the  table  below.  The  mks  system  is  also  known  as 
the  International  System  of  Units  (abbreviated  57),  and  the  abbreviations  sec  (instead  of  s), 
gm  (instead  of  g),  and  nt  (instead  of  N)  are  also  used. 


System  of  units 

Length 

Mass 

Time 

Force 

cgs  system 

centimeter  (cm) 

gram  (g) 

second  (s) 

dyne 

mks  system 

meter  (m) 

kilogram  (kg) 

second  (s) 

newton  (nt) 

Engineering  system 

foot  (ft) 

slug 

second  (s) 

pound  (lb) 

1 inch  (in.)  = 2.540000  cm  1 foot  (ft)  = 12  in.  = 30.480000  cm 

1 yard  (yd)  = 3 ft  = 91.440000  cm  1 statute  mile  (mi)  = 5280  ft  = 1.609344  km 

1 nautical  mile  = 6080  ft  = 1.853184  km 

1 acre  = 4840  yd2  = 4046.8564  m2  1 mi2  = 640  acres  = 2.5899881  km2 

1 fluid  ounce  = 1/128  U.S.  gallon  = 231/128  in.3  = 29.573730  cm3 
1 U.S.  gallon  = 4 quarts  (liq)  = 8 pints  (liq)  = 128  fl  oz  = 3785.4118  cm3 
1 British  Imperial  and  Canadian  gallon  = 1.200949  U.S.  gallons  = 4546.087  cm3 
1 slug  = 14.59390  kg 

1 pound  (lb)  = 4.448444  nt  1 newton  (nt)  = 105  dynes 

1 British  thermal  unit  (Btu)  = 1054.35  joules  1 joule  = 107  ergs 

1 calorie  (cal)  = 4.1840  joules 

1 kilowatt-hour  (kWh)  = 3414.4  Btu  = 3.6  ■ 106  joules 
1 horsepower  (hp)  = 2542.48  Btu/h  = 178.298  cal/sec  = 0.74570  kW 
1 kilowatt  (kW)  = 1000  watts  = 3414.43  Btu/h  = 238.662  cal/s 

°F  = °C  • 1.8  + 32  1°  = 60'  = 3600"  = 0.017453293  radian 


For  further  details  see,  for  example,  D.  Halliday,  R.  Resnick,  and  J.  Walker,  Fundamentals  of  Physics.  9th  ed.,  Hoboken, 
N.  J:  Wiley,  2011.  See  also  AN  American  National  Standard,  ASTM/IEEE  Standard  Metric  Practice,  Institute  of  Electrical  and 
Electronics  Engineers,  Inc.  (IEEE),  445  Hoes  Lane,  Piscataway,  N.  J.  08854,  website  at  www.ieee.org. 


Differentiation 

integration 

(cm/  = cu'  (c  constant) 

J uv'  dx  = uv  — J u'vdx  (by  parts) 

(m  + v)'  = it'  + v' 

r xn+1 

1 xn  dx  = + c (n  ¥=  1) 

J n + 1 

(uv)'  = u'v  + uv' 

f — dx  = In  Ixl  + c 
J x 11 

( u\  U V — uv 

w ~ y2 

f eax  dx  = -eax  + c 

J a 

du  du  dy 

— = — ■ — (Chain  rule) 

dx  dy  dx 

J sin  x dx  = —cos  x + c 
J cos  x dx  = sin  x + c 

i— 1 
1 

II 

J tanxdh  = —In  |cosx|  + c 
J cot  x dx  = In  |sin  x|  + c 

a 

II 

J sec  x dx  = In  |sec  x + tan  x|  + c 

(e  ) = aeax 

esc  x dx  = In  |csc  x — cot  x|  + c 

(i ax ) = ax\na 

r dx  1 x 

(sin  x)'  = cos  x 

„ 9 — arctan  + c 

J x + a a a 

(cosx/  = — sinx 

r dx  x 

/-* , - arcsin  + c 

J Va2  - x2  a 

(tanx)  = sec  x 

(cotx/  = — csc2x 

r dx  x 

r-. , = — arcsinh  + c 

J Vx2  + a2  a 

(sinhx),  = coshx 

r dx  x 

r-, 5 k — arccosh  + c 

J Vx2  - a2  a 

(coshx)  — sinhx 

(In  x/  = — 

X 

J sin2  x dx  = \x  — \ sin  2x  + c 
J cos2  x dx  = |x  + \ sin  2x  + c 

/i  ",  > l°8a  e 

(logax)  = 

X 

J tan2  x dx  = tan  x — x + c 
J cot2  x dx  = —cot  x — x + c 

(arcsinx)'  = , 1 

Vl  - x2 

J In  x dx  = x In  x — x + c 

(arccosx/  = ' — = 

Vl  - x2 

J eax  sin  bx  dx 

gax 

= ( a sin  bx  b cos  bx)  + c 

a2  + b2 

(arctanx)  = ^ ^ 2 

J e™  cos  bx  dx 

(arccotx)'  = - y^J 

ax 

e 

= (a  cos  bx  + b sin  bx)  + c 

a2  + b2 
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See  also  http://www.wiley.com/college/kreyszig 


Purpose  and  Structure  of  the  Book 

This  book  provides  a comprehensive,  thorough,  and  up-to-date  treatment  of  engineering 
mathematics . It  is  intended  to  introduce  students  of  engineering,  physics,  mathematics, 
computer  science,  and  related  fields  to  those  areas  of  applied  mathematics  that  are  most 
relevant  for  solving  practical  problems.  A course  in  elementary  calculus  is  the  sole 
prerequisite . (However,  a concise  refresher  of  basic  calculus  for  the  student  is  included 
on  the  inside  cover  and  in  Appendix  3.) 

The  subject  matter  is  arranged  into  seven  parts  as  follows: 

A.  Ordinary  Differential  Equations  (ODEs)  in  Chapters  1-6 

B.  Linear  Algebra.  Vector  Calculus.  See  Chapters  7-10 

C.  Fourier  Analysis.  Partial  Differential  Equations  (PDEs).  See  Chapters  11  and  12 

D.  Complex  Analysis  in  Chapters  13-18 

E.  Numeric  Analysis  in  Chapters  19-21 
Optimization,  Graphs  in  Chapters  22  and  23 

G.  Probability,  Statistics  in  Chapters  24  and  25. 

These  are  followed  by  five  appendices:  1 References,  2.  Answers  to  Odd-Numbered 
Problems,  3.  Auxiliary  Materials  (see  also  inside  covers  of  book),  4.  Additional  Proofs, 
5 Table  of  Functions.  This  is  shown  in  a block  diagram  on  the  next  page. 

The  parts  of  the  book  are  kept  independent.  In  addition,  individual  chapters  are  kept  as 
independent  as  possible.  (If  so  needed,  any  prerequisites — to  the  level  of  individual 
sections  of  prior  chapters — are  clearly  stated  at  the  opening  of  each  chapter.)  We  give  the 
instructor  maximum  flexibility  in  selecting  the  material  and  tailoring  it  to  his  or  her 
need.  The  book  has  helped  to  pave  the  way  for  the  present  development  of  engineering 
mathematics.  This  new  edition  will  prepare  the  student  for  the  current  tasks  and  the  future 
by  a modern  approach  to  the  areas  listed  above.  We  provide  the  material  and  learning 
tools  for  the  students  to  get  a good  foundation  of  engineering  mathematics  that  will  help 
them  in  their  careers  and  in  further  studies. 

General  Features  of  the  Book  Include: 

Simplicity  of  examples  to  make  the  book  teachable — why  choose  complicated 
examples  when  simple  ones  are  as  instructive  or  even  better? 

Independence  of  parts  and  blocks  of  chapters  to  provide  flexibility  in  tailoring 
courses  to  specific  needs. 

Self-contained  presentation,  except  for  a few  clearly  marked  places  where  a proof 
would  exceed  the  level  of  the  book  and  a reference  is  given  instead. 

Gradual  increase  in  difficulty  of  material  with  no  jumps  or  gaps  to  ensure  an 
enjoyable  teaching  and  learning  experience. 

Modern  standard  notation  to  help  students  with  other  courses,  modern  books,  and 
journals  in  mathematics,  engineering,  statistics,  physics,  computer  science,  and  others. 

Furthermore,  we  designed  the  book  to  be  a single,  self-contained,  authoritative,  and 
convenient  source  for  studying  and  teaching  applied  mathematics,  eliminating  the  need 
for  time-consuming  searches  on  the  Internet  or  time-consuming  trips  to  the  library  to  get 
a particular  reference  book. 
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Four  Underlying  Themes  of  the  Book 

The  driving  force  in  engineering  mathematics  is  the  rapid  growth  of  technology  and  the 
sciences.  New  areas — often  drawing  from  several  disciplines — come  into  existence. 
Electric  cars,  solar  energy,  wind  energy,  green  manufacturing,  nanotechnology,  risk 
management,  biotechnology,  biomedical  engineering,  computer  vision,  robotics,  space 
travel,  communication  systems,  green  logistics,  transportation  systems,  financial 
engineering,  economics,  and  many  other  areas  are  advancing  rapidly.  What  does  this  mean 
for  engineering  mathematics?  The  engineer  has  to  take  a problem  from  any  diverse  area 
and  be  able  to  model  it.  This  leads  to  the  first  of  four  underlying  themes  of  the  book. 

1.  Modeling  is  the  process  in  engineering,  physics,  computer  science,  biology, 
chemistry,  environmental  science,  economics,  and  other  fields  whereby  a physical  situation 
or  some  other  observation  is  translated  into  a mathematical  model.  This  mathematical 
model  could  be  a system  of  differential  equations,  such  as  in  population  control  (Sec.  4.5), 
a probabilistic  model  (Chap.  24),  such  as  in  risk  management,  a linear  programming 
problem  (Secs.  22.2-22.4)  in  minimizing  environmental  damage  due  to  pollutants,  a 
financial  problem  of  valuing  a bond  leading  to  an  algebraic  equation  that  has  to  be  solved 
by  Newton’s  method  (Sec.  19.2),  and  many  others. 

The  next  step  is  solving  the  mathematical  problem  obtained  by  one  of  the  many 
techniques  covered  in  Advanced  Engineering  Mathematics. 

The  third  step  is  interpreting  the  mathematical  result  in  physical  or  other  terms  to 
see  what  it  means  in  practice  and  any  implications. 

Finally,  we  may  have  to  make  a decision  that  may  be  of  an  industrial  nature  or 
recommend  a public  policy.  For  example,  the  population  control  model  may  imply 
the  policy  to  stop  fishing  for  3 years.  Or  the  valuation  of  the  bond  may  lead  to  a 
recommendation  to  buy.  The  variety  is  endless,  but  the  underlying  mathematics  is 
surprisingly  powerful  and  able  to  provide  advice  leading  to  the  achievement  of  goals 
toward  the  betterment  of  society,  for  example,  by  recommending  wise  policies 
concerning  global  warming,  better  allocation  of  resources  in  a manufacturing  process, 
or  making  statistical  decisions  (such  as  in  Sec.  25.4  whether  a drug  is  effective  in  treating 
a disease). 

While  we  cannot  predict  what  the  future  holds,  we  do  know  that  the  student  has  to 
practice  modeling  by  being  given  problems  from  many  different  applications  as  is  done 
in  this  book.  We  teach  modeling  from  scratch,  right  in  Sec.  1.1,  and  give  many  examples 
in  Sec.  1.3,  and  continue  to  reinforce  the  modeling  process  throughout  the  book. 

2.  Judicious  use  of  powerful  software  for  numerics  (listed  in  the  beginning  of  Part  E) 
and  statistics  (Part  G)  is  of  growing  importance.  Projects  in  engineering  and  industrial 
companies  may  involve  large  problems  of  modeling  very  complex  systems  with  hundreds 
of  thousands  of  equations  or  even  more.  They  require  the  use  of  such  software.  However, 
our  policy  has  always  been  to  leave  it  up  to  the  instructor  to  determine  the  degree  of  use  of 
computers,  from  none  or  little  use  to  extensive  use.  More  on  this  below. 

3.  The  beauty  of  engineering  mathematics.  Engineering  mathematics  relies  on 
relatively  few  basic  concepts  and  involves  powerful  unifying  principles.  We  point  them 
out  whenever  they  are  clearly  visible,  such  as  in  Sec.  4.1  where  we  “grow”  a mixing 
problem  from  one  tank  to  two  tanks  and  a circuit  problem  from  one  circuit  to  two  circuits, 
thereby  also  increasing  the  number  of  ODEs  from  one  ODE  to  two  ODEs.  This  is  an 
example  of  an  attractive  mathematical  model  because  the  “growth”  in  the  problem  is 
reflected  by  an  “increase”  in  ODEs. 
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4.  To  clearly  identify  the  conceptual  structure  of  subject  matters.  For  example, 
complex  analysis  (in  Part  D)  is  a field  that  is  not  monolithic  in  structure  but  was  formed 
by  three  distinct  schools  of  mathematics.  Each  gave  a different  approach,  which  we  clearly 
mark.  The  first  approach  is  solving  complex  integrals  by  Cauchy’s  integral  formula  (Chaps. 
13  and  14),  the  second  approach  is  to  use  the  Laurent  series  and  solve  complex  integrals 
by  residue  integration  (Chaps.  15  and  16),  and  finally  we  use  a geometric  approach  of 
conformal  mapping  to  solve  boundary  value  problems  (Chaps.  17  and  18).  Learning  the 
conceptual  structure  and  terminology  of  the  different  areas  of  engineering  mathematics  is 
very  important  for  three  reasons: 

a.  It  allows  the  student  to  identify  a new  problem  and  put  it  into  the  right  group  of 
problems.  The  areas  of  engineering  mathematics  are  growing  but  most  often  retain  their 
conceptual  structure. 

b.  The  student  can  absorb  new  information  more  rapidly  by  being  able  to  fit  it  into  the 
conceptual  structure. 

c.  Knowledge  of  the  conceptual  structure  and  terminology  is  also  important  when  using 
the  Internet  to  search  for  mathematical  information.  Since  the  search  proceeds  by  putting 
in  key  words  (i.e.,  terms)  into  the  search  engine,  the  student  has  to  remember  the  important 
concepts  (or  be  able  to  look  them  up  in  the  book)  that  identify  the  application  and  area 
of  engineering  mathematics. 

Big  Changes  in  This  Edition 

(D  Problem  Sets  Changed 

The  problem  sets  have  been  revised  and  rebalanced  with  some  problem  sets  having  more 
problems  and  some  less,  reflecting  changes  in  engineering  mathematics.  There  is  a greater 
emphasis  on  modeling.  Now  there  are  also  problems  on  the  discrete  Lourier  transform 
(in  Sec.  11.9). 

e Series  Solutions  of  ODEs,  Special  Functions  and  Fourier  Analysis  Reorganized 

Chap.  5,  on  series  solutions  of  ODEs  and  special  functions,  has  been  shortened.  Chap.  1 1 
on  Lourier  Analysis  now  contains  Sturm-Liouville  problems,  orthogonal  functions,  and 
orthogonal  eigenfunction  expansions  (Secs.  1 1.5,  1 1.6),  where  they  fit  better  conceptually 
(rather  than  in  Chap.  5),  being  extensions  of  Lourier’ s idea  of  using  orthogonal  functions. 

€>  Openings  of  Parts  and  Chapters  Rewritten  As  Well  As  Parts  of  Sections 

In  order  to  give  the  student  a better  idea  of  the  structure  of  the  material  (see  Underlying 
Theme  4 above),  we  have  entirely  rewritten  the  openings  of  parts  and  chapters. 
Furthermore,  large  parts  or  individual  paragraphs  of  sections  have  been  rewritten  or  new 
sentences  inserted  into  the  text.  This  should  give  the  students  a better  intuitive 
understanding  of  the  material  (see  Theme  3 above),  let  them  draw  conclusions  on  their 
own,  and  be  able  to  tackle  more  advanced  material.  Overall,  we  feel  that  the  book  has 
become  more  detailed  and  leisurely  written. 

f|f|  Student  Solutions  Manual  and  Study  Guide  Enlarged 

Upon  the  explicit  request  of  the  users,  the  answers  provided  are  more  detailed  and 
complete.  More  explanations  are  given  on  how  to  learn  the  material  effectively  by  pointing 
out  what  is  most  important. 

Q More  Historical  Footnotes,  Some  Enlarged 

Historical  footnotes  are  there  to  show  the  student  that  many  people  from  different  countries 
working  in  different  professions,  such  as  surveyors,  researchers  in  industry,  etc.,  contributed 
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to  the  field  of  engineering  mathematics.  It  should  encourage  the  students  to  be  creative  in 
their  own  interests  and  careers  and  perhaps  also  to  make  contributions  to  engineering 
mathematics. 

Further  Changes  and  New  Features 

Parts  of  Chap.  1 on  first-order  ODEs  are  rewritten.  More  emphasis  on  modeling,  also 
new  block  diagram  explaining  this  concept  in  Sec.  1.1.  Early  introduction  of  Euler’s 
method  in  Sec.  1.2  to  familiarize  student  with  basic  numerics.  More  examples  of 
separable  ODEs  in  Sec.  1.3. 

For  Chap.  2,  on  second-order  ODEs,  note  the  following  changes:  For  ease  of  reading, 
the  first  part  of  Sec.  2.4,  which  deals  with  setting  up  the  mass-spring  system,  has 
been  rewritten;  also  some  rewriting  in  Sec.  2.5  on  the  Euler-Cauchy  equation. 

Substantially  shortened  Chap.  5,  Series  Solutions  of  ODEs.  Special  Functions: 
combined  Secs.  5.1  and  5.2  into  one  section  called  “Power  Series  Method,”  shortened 
material  in  Sec.  5.4  Bessel’s  Equation  (of  the  first  kind),  removed  Sec.  5.7 
(Sturm-Liouville  Problems)  and  Sec.  5.8  (Orthogonal  Eigenfunction  Expansions)  and 
moved  material  into  Chap.  1 1 (see  “Major  Changes”  above). 

New  equivalent  definition  of  basis  (Sec.  7.4). 

In  Sec.  7.9,  completely  new  part  on  composition  of  linear  transformations  with 
two  new  examples.  Also,  more  detailed  explanation  of  the  role  of  axioms,  in 
connection  with  the  definition  of  vector  space. 

New  table  of  orientation  (opening  of  Chap.  8 “Linear  Algebra:  Matrix  Eigenvalue 
Problems”)  where  eigenvalue  problems  occur  in  the  book.  More  intuitive  explanation 
of  what  an  eigenvalue  is  at  the  begining  of  Sec.  8.1. 

Better  definition  of  cross  product  (in  vector  differential  calculus)  by  properly 
identifying  the  degenerate  case  (in  Sec.  9.3). 

Chap.  11  on  Fourier  Analysis  extensively  rearranged:  Secs.  11.2  and  11.3 

combined  into  one  section  (Sec.  11.2),  old  Sec.  11.4  on  complex  Fourier  Series 
removed  and  new  Secs.  11.5  (Sturm-Liouville  Problems)  and  11.6  (Orthogonal 
Series)  put  in  (see  “Major  Changes”  above).  New  problems  (new!)  in  problem  set 

11.9  on  discrete  Fourier  transform. 

New  section  12.5  on  modeling  heat  flow  from  a body  in  space  by  setting  up  the  heat 
equation.  Modeling  PDEs  is  more  difficult  so  we  separated  the  modeling  process 
from  the  solving  process  (in  Sec.  12.6). 

Introduction  to  Numerics  rewritten  for  greater  clarity  and  better  presentation;  new 
Example  1 on  how  to  round  a number.  Sec.  19.3  on  interpolation  shortened  by 
removing  the  less  important  central  difference  formula  and  giving  a reference  instead. 

Large  new  footnote  with  historical  details  in  Sec.  22.3,  honoring  George  Dantzig, 
the  inventor  of  the  simplex  method. 

Traveling  salesman  problem  now  described  better  as  a “difficult”  problem,  typical 
of  combinatorial  optimization  (in  Sec.  23.2).  More  careful  explanation  on  how  to 
compute  the  capacity  of  a cut  set  in  Sec.  23.6  (Flows  on  Networks). 

In  Chap.  24,  material  on  data  representation  and  characterization  restructured  in 
terms  of  five  examples  and  enlarged  to  include  empirical  rule  on  distribution  of 
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data,  outliers,  and  the  --score  (Sec.  24.1).  Furthermore,  new  example  on  encription 
(Sec.  24.4). 

Lists  of  software  for  numerics  (Part  E)  and  statistics  (Part  G)  updated. 

References  in  Appendix  1 updated  to  include  new  editions  and  some  references  to 
websites. 


Use  of  Computers 

The  presentation  in  this  book  is  adaptable  to  various  degrees  of  use  of  software, 
Computer  Algebra  Systems  (CAS’s),  or  programmable  graphic  calculators,  ranging 
from  no  use,  very  little  use,  medium  use,  to  intensive  use  of  such  technology.  The  choice 
of  how  much  computer  content  the  course  should  have  is  left  up  to  the  instructor,  thereby 
exhibiting  our  philosophy  of  maximum  flexibility  and  adaptability.  And,  no  matter  what 
the  instructor  decides,  there  will  be  no  gaps  or  jumps  in  the  text  or  problem  set.  Some 
problems  are  clearly  designed  as  routine  and  drill  exercises  and  should  be  solved  by 
hand  (paper  and  pencil,  or  typing  on  your  computer).  Other  problems  require  more 
thinking  and  can  also  be  solved  without  computers.  Then  there  are  problems  where  the 
computer  can  give  the  student  a hand.  And  finally,  the  book  has  CAS  projects,  CAS 
problems  and  CAS  experiments , which  do  require  a computer,  and  show  its  power  in 
solving  problems  that  are  difficult  or  impossible  to  access  otherwise.  Here  our  goal  is 
to  combine  intelligent  computer  use  with  high-quality  mathematics.  The  computer 
invites  visualization,  experimentation,  and  independent  discovery  work.  In  summary, 
the  high  degree  of  flexibility  of  computer  use  for  the  book  is  possible  since  there  are 
plenty  of  problems  to  choose  from  and  the  CAS  problems  can  be  omitted  if  desired. 

Note  that  information  on  software  (what  is  available  and  where  to  order  it)  is  at  the 
beginning  of  Part  E on  Numeric  Analysis  and  Part  G on  Probability  and  Statistics.  Since 
Maple  and  Mathematica  are  popular  Computer  Algebra  Systems,  there  are  two  computer 
guides  available  that  are  specifically  tailored  to  Advanced  Engineering  Mathematics: 
E.  Kreyszig  and  E.J.  Norminton,  Maple  Computer  Guide,  \0th  Edition  and  Mathematica 
Computer  Guide,  10th  Edition.  Their  use  is  completely  optional  as  the  text  in  the  book  is 
written  without  the  guides  in  mind. 


Suggestions  for  Courses:  A Four-Semester  Sequence 

The  material,  when  taken  in  sequence,  is  suitable  for  four  consecutive  semester  courses, 
meeting  3 to  4 hours  a week: 

1st  Semester  ODEs  (Chaps.  1-5  or  1-6) 

2nd  Semester  Linear  Algebra.  Vector  Analysis  (Chaps.  7-10) 

3rd  Semester  Complex  Analysis  (Chaps.  13-18) 

4th  Semester  Numeric  Methods  (Chaps.  19-21) 

Suggestions  for  Independent  One-Semester  Courses 

The  book  is  also  suitable  for  various  independent  one-semester  courses  meeting  3 hours 
a week.  For  instance. 

Introduction  to  ODEs  (Chaps.  1-2,  21.1) 

Laplace  Transforms  (Chap.  6) 

Matrices  and  Linear  Systems  (Chaps.  7-8) 


Vector  Algebra  and  Calculus  (Chaps.  9-10) 

Fourier  Series  and  PDEs  (Chaps.  11-12,  Secs.  21.4—21.7) 

Introduction  to  Complex  Analysis  (Chaps.  13-17) 

Numeric  Analysis  (Chaps.  19,  21) 

Numeric  Linear  Algebra  (Chap.  20) 

Optimization  (Chaps.  22-23) 

Graphs  and  Combinatorial  Optimization  (Chap.  23) 

Probability  and  Statistics  (Chaps.  24-25) 
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Many  physical  laws  and  relations  can  be  expressed  mathematically  in  the  form  of  differential 
equations.  Thus  it  is  natural  that  this  book  opens  with  the  study  of  differential  equations  and 
their  solutions.  Indeed,  many  engineering  problems  appear  as  differential  equations. 


The  main  objectives  of  Part  A are  twofold:  the  study  of  ordinary  differential  equations 
and  their  most  important  methods  for  solving  them  and  the  study  of  modeling. 


Ordinary  differential  equations  (ODEs)  are  differential  equations  that  depend  on  a single 
variable.  The  more  difficult  study  of  partial  differential  equations  (PDEs),  that  is, 
differential  equations  that  depend  on  several  variables,  is  covered  in  Part  C. 


Modeling  is  a crucial  general  process  in  engineering,  physics,  computer  science,  biology, 
medicine,  environmental  science,  chemistry,  economics,  and  other  fields  that  translates  a 
physical  situation  or  some  other  observations  into  a “mathematical  model.”  Numerous 
examples  from  engineering  (e.g.,  mixing  problem),  physics  (e.g.,  Newton’s  law  of  cooling), 
biology  (e.g.,  Gompertz  model),  chemistry  (e.g.,  radiocarbon  dating),  environmental  science 
(e.g.,  population  control),  etc.  shall  be  given,  whereby  this  process  is  explained  in  detail, 
that  is,  how  to  set  up  the  problems  correctly  in  terms  of  differential  equations. 


For  those  interested  in  solving  ODEs  numerically  on  the  computer,  look  at  Secs.  21.1-21.3 
of  Chapter  21  of  Part  F,  that  is,  numeric  methods  for  ODEs.  These  sections  are  kept 
independent  by  design  of  the  other  sections  on  numerics.  This  allows  for  the  study  of 
numerics  for  ODEs  directly  after  Chap.  1 or  2. 
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solving,  interpreting 


CHAPTER 


First-Order  ODEs 


Chapter  1 begins  the  study  of  ordinary  differential  equations  (ODEs)  by  deriving  them  from 
physical  or  other  problems  (modeling),  solving  them  by  standard  mathematical  methods, 
and  interpreting  solutions  and  their  graphs  in  terms  of  a given  problem.  The  simplest  ODEs 
to  be  discussed  are  ODEs  of  the  first  order  because  they  involve  only  the  first  derivative 
of  the  unknown  function  and  no  higher  derivatives.  These  unknown  functions  will  usually 
be  denoted  by  y(x)  or  y(f)  when  the  independent  variable  denotes  time  t.  The  chapter  ends 
with  a study  of  the  existence  and  uniqueness  of  solutions  of  ODEs  in  Sec.  1.7. 

Understanding  the  basics  of  ODEs  requires  solving  problems  by  hand  (paper  and  pencil, 
or  typing  on  your  computer,  but  first  without  the  aid  of  a CAS).  In  doing  so,  you  will 
gain  an  important  conceptual  understanding  and  feel  for  the  basic  terms,  such  as  ODEs, 
direction  field,  and  initial  value  problem.  If  you  wish,  you  can  use  your  Computer  Algebra 
System  (CAS)  for  checking  solutions. 

COMMENT.  Numerics  for  first-order  ODEs  can  be  studied  immediately  after  this 
chapter.  See  Secs.  21.1-21.2,  which  are  independent  of  other  sections  on  numerics. 

Prerequisite:  Integral  calculus. 

Sections  that  may  be  omitted  in  a shorter  course:  1.6,  1.7. 

References  and  Answers  to  Problems:  App.  1 Part  A,  and  App.  2. 


Concepts.  Modeling 

If  we  want  to  solve  an  engineering  problem  (usually  of  a physical  nature),  we  first 
have  to  formulate  the  problem  as  a mathematical  expression  in  terms  of  variables, 
functions,  and  equations.  Such  an  expression  is  known  as  a mathematical  model  of  the 
given  problem.  The  process  of  setting  up  a model,  solving  it  mathematically,  and 
interpreting  the  result  in  physical  or  other  terms  is  called  mathematical  modeling  or, 
briefly,  modeling. 

Modeling  needs  experience,  which  we  shall  gain  by  discussing  various  examples  and 
problems.  (Your  computer  may  often  help  you  in  solving  but  rarely  in  setting  up  models.) 

Now  many  physical  concepts,  such  as  velocity  and  acceleration,  are  derivatives.  Hence 
a model  is  very  often  an  equation  containing  derivatives  of  an  unknown  function.  Such 
a model  is  called  a differential  equation.  Of  course,  we  then  want  to  find  a solution  (a 
function  that  satisfies  the  equation),  explore  its  properties,  graph  it,  find  values  of  it,  and 
interpret  it  in  physical  terms  so  that  we  can  understand  the  behavior  of  the  physical  system 
in  our  given  problem.  However,  before  we  can  turn  to  methods  of  solution,  we  must  first 
define  some  basic  concepts  needed  throughout  this  chapter. 
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An  ordinary  differential  equation  (ODE)  is  an  equation  that  contains  one  or  several 
derivatives  of  an  unknown  function,  which  we  usually  call  y(x ) (or  sometimes  y(t)  if  the 
independent  variable  is  time  t).  The  equation  may  also  contain  y itself,  known  functions 
of  x (or  t ),  and  constants.  For  example, 


(1) 

y = cos  x 

(2) 

y +9  y = e 

(3) 

t "r  3 '2  A 

y y - s y = 0 
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CHAP.  1 First-Order  ODEs 


EXAMPLE  1 


are  ordinary  differential  equations  (ODEs).  Here,  as  in  calculus,  y denotes  dy/dx, 
y"  = dzy/dx2,  etc.  The  term  ordinary  distinguishes  them  from  partial  differential 
equations  (PDEs),  which  involve  partial  derivatives  of  an  unknown  function  of  two 
or  more  variables.  For  instance,  a PDE  with  unknown  function  u of  two  variables  x 
and  y is 


d2u 

dx2 


+ 


i2 

0 u 
dy2 


= 0. 


PDEs  have  important  engineering  applications,  but  they  are  more  complicated  than  ODEs; 
they  will  be  considered  in  Chap.  12. 

An  ODE  is  said  to  be  of  order  n if  the  nth  derivative  of  the  unknown  function  y is  the 
highest  derivative  of  y in  the  equation.  The  concept  of  order  gives  a useful  classification 
into  ODEs  of  first  order,  second  order,  and  so  on.  Thus,  (1)  is  of  first  order,  (2)  of  second 
order,  and  (3)  of  third  order. 

In  this  chapter  we  shall  consider  first-order  ODEs.  Such  equations  contain  only  the 
first  derivative  y'  and  may  contain  y and  any  given  functions  of  x.  Hence  we  can  write 
them  as 


(4) 


F(x,y,y')  = 0 


or  often  in  the  form 


/ =f(x,y). 

This  is  called  the  explicit  form,  in  contrast  to  the  implicit  form  (4).  For  instance,  the  implicit 

O ^ O 9 B f o o 

ODE  x 'y  — 4y  = 0 (where  x A 0)  can  be  written  explicitly  as  y = 4xy  . 


Concept  of  Solution 

A function 


y = h(x) 

is  called  a solution  of  a given  ODE  (4)  on  some  open  interval  a < x < b if  h(x)  is 
defined  and  differentiable  throughout  the  interval  and  is  such  that  the  equation  becomes 
an  identity  if  y and  y are  replaced  with  h and  h , respectively.  The  curve  (the  graph)  of 
h is  called  a solution  curve. 

Here,  open  interval  a < x < b means  that  the  endpoints  a and  b are  not  regarded  as 
points  belonging  to  the  interval.  Also,  a < x < b includes  infinite  intervals  — °°  < x < b, 
a < x < co,  — co<jt<oo  (the  real  line)  as  special  cases. 


Verification  of  Solution 

Verify  that  y = c/x(c  an  arbitrary  constant)  is  a solution  of  the  ODE  xy  = — y for  all  x A 0.  Indeed,  differentiate 
y = c/x  to  get  y'  = — c/x2.  Multiply  this  by  x,  obtaining  xy'  = —c/x;  thus,  xy'  = y,  the  given  ODE. 
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EXAMPLE  2 Solution  by  Calculus.  Solution  Curves 

The  ODE  y = dy/dx  = cos*  can  be  solved  directly  by  integration  on  both  sides.  Indeed,  using  calculus, 
we  obtain  y = f cos  * dx  = sin  * + c,  where  c is  an  arbitrary  constant.  This  is  a family  of  solutions . Each  value 
of  c,  for  instance,  2.75  or  0 or  —8,  gives  one  of  these  curves.  Figure  3 shows  some  of  them,  for  c = —3,  —2, 
-1,0,  1,2,  3,4.  ■ 


EXAMPLE  3 (A)  Exponential  Growth.  (B)  Exponential  Decay 

From  calculus  we  know  that  y = ce°'2t  has  the  derivative 

/ = — = 0.2eO2t  = 0.2 y. 
dt 

Hence  y is  a solution  of  y = 0.2y  (Fig.  4A).  This  ODE  is  of  the  form  y = ky.  With  positive-constant  k it  can 
model  exponential  growth,  for  instance,  of  colonies  of  bacteria  or  populations  of  animals.  It  also  applies  to 
humans  for  small  populations  in  a large  country  (e.g.,  the  United  States  in  early  times)  and  is  then  known  as 
Malthus’s  law.1  We  shall  say  more  about  this  topic  in  Sec.  1.5. 

(B)  Similarly,  y = —0.2  (with  a minus  on  the  right)  has  the  solution  y = ce~°'2t,  (Fig.  4B)  modeling 
exponential  decay,  as,  for  instance,  of  a radioactive  substance  (see  Example  5). 

y 

2.5  - 

2.0  - 

1.5  - \ \ 

1 .0 

0.5  ^ ' -A 


0 2 4 6 8 10  12  14  t 

Fig.  4B.  Solutions  of  y'  = —0.2 y 
in  Example  3 (exponential  decay) 


Fig.  4 A.  Solutions  of  y'  = 0.2y 
in  Example  3 (exponential  growth) 


1Named  after  the  English  pioneer  in  classic  economics,  THOMAS  ROBERT  MALTHUS  (1766-1834). 
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EXAMPLE  4 


We  see  that  each  ODE  in  these  examples  has  a solution  that  contains  an  arbitrary 
constant  c.  Such  a solution  containing  an  arbitrary  constant  c is  called  a general  solution 
of  the  ODE. 

(We  shall  see  that  c is  sometimes  not  completely  arbitrary  but  must  be  restricted  to  some 
interval  to  avoid  complex  expressions  in  the  solution.) 

We  shall  develop  methods  that  will  give  general  solutions  uniquely  (perhaps  except  for 
notation).  Hence  we  shall  say  the  general  solution  of  a given  ODE  (instead  of  a general 
solution). 

Geometrically,  the  general  solution  of  an  ODE  is  a family  of  infinitely  many  solution 
curves,  one  for  each  value  of  the  constant  c.  If  we  choose  a specific  c (e.g.,  c = 6.45  or  0 
or  —2.01)  we  obtain  what  is  called  a particular  solution  of  the  ODE.  A particular  solution 
does  not  contain  any  arbitrary  constants. 

In  most  cases,  general  solutions  exist,  and  every  solution  not  containing  an  arbitrary 
constant  is  obtained  as  a particular  solution  by  assigning  a suitable  value  to  c.  Exceptions 
to  these  rules  occur  but  are  of  minor  interest  in  applications;  see  Prob.  16  in  Problem 
Set  1.1. 

Initial  Value  Problem 

In  most  cases  the  unique  solution  of  a given  problem,  hence  a particular  solution,  is 
obtained  from  a general  solution  by  an  initial  condition  y(xo)  = yo,  with  given  values 
xq  and  yo,  that  is  used  to  determine  a value  of  the  arbitrary  constant  c.  Geometrically 
this  condition  means  that  the  solution  curve  should  pass  through  the  point  (xo,  yo) 
in  the  xy-plane.  An  ODE,  together  with  an  initial  condition,  is  called  an  initial  value 
problem.  Thus,  if  the  ODE  is  explicit,  y = /(x,  v),  the  initial  value  problem  is  of 
the  form 

(5)  y'  = f{x,  y),  y(x0)  = Vo- 

Initial  Value  Problem 

Solve  the  initial  value  problem 

, dy 

y = -T-  = 3y,  y{  0)  = 5.7. 

ax 

Solution.  The  general  solution  is  ;y(x)  = ce3x\  see  Example  3.  From  this  solution  and  the  initial  condition 
we  obtain  y(O)  = ce°  = c = 5.7.  Hence  the  initial  value  problem  has  the  solution  y(x)  = 51edx.  This  is  a 
particular  solution. 

More  on  Modeling 

The  general  importance  of  modeling  to  the  engineer  and  physicist  was  emphasized  at  the 
beginning  of  this  section.  We  shall  now  consider  a basic  physical  problem  that  will  show 
the  details  of  the  typical  steps  of  modeling.  Step  1 : the  transition  from  the  physical  situation 
(the  physical  system)  to  its  mathematical  formulation  (its  mathematical  model);  Step  2: 
the  solution  by  a mathematical  method;  and  Step  3:  the  physical  interpretation  of  the  result. 
This  may  be  the  easiest  way  to  obtain  a first  idea  of  the  nature  and  purpose  of  differential 
equations  and  their  applications.  Realize  at  the  outset  that  your  computer  (your  CAS ) 
may  perhaps  give  you  a hand  in  Step  2,  but  Steps  1 and  3 are  basically  your  work. 
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And  Step  2 requires  a solid  knowledge  and  good  understanding  of  solution  methods 
available  to  you — you  have  to  choose  the  method  for  your  work  by  hand  or  by  the 
computer.  Keep  this  in  mind,  and  always  check  computer  results  for  errors  (which  may 
arise,  for  instance,  from  false  inputs). 

EXAMPLE  5 Radioactivity.  Exponential  Decay 

Given  an  amount  of  a radioactive  substance,  say,  0.5  g (gram),  find  the  amount  present  at  any  later  time. 

Physical  Information.  Experiments  show  that  at  each  instant  a radioactive  substance  decomposes — and  is  thus 
decaying  in  time — proportional  to  the  amount  of  substance  present. 

Step  1.  Setting  up  a mathematical  model  of  the  physical  process.  Denote  by  y(t)  the  amount  of  substance  still 
present  at  any  time  t.  By  the  physical  law,  the  time  rate  of  change  y (t)  = dy/dt  is  proportional  to  y(t).  This 
gives  the  first-order  ODE 


dy 

(6)  -=-ky 

where  the  constant  k is  positive,  so  that,  because  of  the  minus,  we  do  get  decay  (as  in  [B]  of  Example  3). 
The  value  of  k is  known  from  experiments  for  various  radioactive  substances  (e.g.,  k = 1.4  • 10-11  sec-1, 
approximately,  for  radium  gsR3)- 

Now  the  given  initial  amount  is  0.5  g,  and  we  can  call  the  corresponding  instant  t = 0.  Then  we  have  the 
initial  condition  y(0)  = 0.5.  This  is  the  instant  at  which  our  observation  of  the  process  begins.  It  motivates 
the  term  initial  condition  (which,  however,  is  also  used  when  the  independent  variable  is  not  time  or  when 
we  choose  a t other  than  / = 0).  Hence  the  mathematical  model  of  the  physical  process  is  the  initial  value 
problem 


dy 

(7)  — = ~ky,  y(0)  = 0.5. 

dt 

Step  2.  Mathematical  solution.  As  in  (B)  of  Example  3 we  conclude  that  the  ODE  (6)  models  exponential  decay 
and  has  the  general  solution  (with  arbitrary  constant  c but  definite  given  k ) 

(8)  y(t)  = ce~kt. 

We  now  determine  c by  using  the  initial  condition.  Since  y(0)  = c from  (8),  this  gives  y(0)  = c = 0.5.  Hence 
the  particular  solution  governing  our  process  is  (cf.  Fig.  5) 

(9)  y(t)  = 0.5e-fct  (k  > 0). 

Always  check  your  result — it  may  involve  human  or  computer  errors!  Verify  by  differentiation  (chain  rule!) 
that  your  solution  (9)  satisfies  (7)  as  well  as  y(0)  = 0.5: 

f = -0.5 ke~kt  = -k  ■ 0.5e~kt  = -ky,  y(0)  = 0.5e°  = 0.5. 
dt 

Step  3.  Interpretation  of  result.  Formula  (9)  gives  the  amount  of  radioactive  substance  at  time  t.  It  starts  from 
the  correct  initial  amount  and  decreases  with  time  because  k is  positive.  The  limit  of  y as  t — > oo  is  zero. 


Fig.  5.  Radioactivity  (Exponential  decay, 
y = 0.5e  kt,  with  k = 1.5  as  an  example) 
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CALCULUS 


Solve  the  ODE  by  integration  or  by  remembering  a 
differentiation  formula. 

1.  y'  + 2 sin  2ttx  = 0 

2.  y + *e_x2/2  = 0 

3.  y'  = y 

4.  y = — 1.5y 

5.  y = 4e~x  cos  x 

6n 

■ y = ~y 

7.  y = cosh  5. 1 3jc 

8.  y'"  = e~°'2x 
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VERIFICATION.  INITIAL  VALUE 


PROBLEM  (IVP) 


(a)  Verify  that  y is  a solution  of  the  ODE.  (b)  Determine 

from  y the  particular  solution  of  the  IVP.  (c)  Graph  the 

solution  of  the  IVP. 

9.  y + 4y  = 1.4,  y = ce~4x  + 0.35,  y(0)  = 2 

10.  y + 5xy  = 0,  y = ce~25x  , v(0)  = tt 

11.  y = y + ex,  y = (x  + c)ex,  y(0)  = j 

12.  yy  = 4x,  y2  - 4xz  = c(y  > 0),  y(l)  = 4 

13.  y'  — y ~ y2,  y = 1 _x,  y(0)  = 0.25 

1 + ce 

14.  y'  tanjc  = 2y  — 8,  y = c sin2jc  + 4,  yd^r)  = 0 

15.  Find  two  constant  solutions  of  the  ODE  in  Prob.  13  by 
inspection. 

16.  Singular  solution.  An  ODE  may  sometimes  have  an 
additional  solution  that  cannot  be  obtained  from  the 
general  solution  and  is  then  called  a singular  solution. 
The  ODE  y,z  — xy'  + y = 0 is  of  this  kind.  Show 
by  differentiation  and  substitution  that  it  has  the 
general  solution  y = cx  — c 2 and  the  singular  solution 
y = x2/4.  Explain  Fig.  6. 


Fig.  6 
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MODELING,  APPLICATIONS 


These  problems  will  give  you  a first  impression  of  modeling. 
Many  more  problems  on  modeling  follow  throughout  this 
chapter. 


17.  Half-life.  The  half-life  measures  exponential  decay. 
It  is  the  time  in  which  half  of  the  given  amount  of 
radioactive  substance  will  disappear.  What  is  the  half- 
life  of  2lgRa  (in  years)  in  Example  5? 

18.  Half-life.  Radium  2l|Ra  has  a half-life  of  about 
3.6  days. 

(a)  Given  1 gram,  how  much  will  still  be  present  after 
1 day? 

(b)  After  1 year? 


19.  Free  fall.  In  dropping  a stone  or  an  iron  ball,  air 
resistance  is  practically  negligible.  Experiments 
show  that  the  acceleration  of  the  motion  is  constant 
(equal  to  g = 9.80  m/sec2  = 32  ft/sec2, called  the 
acceleration  of  gravity).  Model  this  as  an  ODE  for 
y(t),  the  distance  fallen  as  a function  of  time  t.  If  the 
motion  starts  at  time  t = 0 from  rest  (i.e.,  with  velocity 
v = y'  = 0),  show  that  you  obtain  the  familiar  law  of 
free  fall 


y = \gt 


2 


20.  Exponential  decay.  Subsonic  flight.  The  efficiency 
of  the  engines  of  subsonic  airplanes  depends  on  air 
pressure  and  is  usually  maximum  near  35,000  ft. 
Find  the  air  pressure  y{x)  at  this  height.  Physical 
information.  The  rate  of  change  y (jc)  is  proportional 
to  the  pressure.  At  18,000  ft  it  is  half  its  value 
yo  = y(0)  at  sea  level.  Hint.  Remember  from  calculus 
that  if  y = ekx,  then  y'  = kekx  = ky.  Can  you  see 
without  calculation  that  the  answer  should  be  close 
to  y„/4? 


Particular  solutions  and  singular 
solution  in  Problem  16 


SEC.  1.2  Geometric  Meaning  of  y'  = f(x,  y).  Direction  Fields,  Euler’s  Method 
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U Geometric  Meaning  of  y = f(x,  y). 
Direction  Fields,  Eulers  Method 

A first-order  ODE 


(1)  y = fix,  y) 

has  a simple  geometric  interpretation.  From  calculus  you  know  that  the  derivative  y\x ) of 
y(x)  is  the  slope  of  y(jt).  Hence  a solution  curve  of  (1)  that  passes  through  a point  (jc0,  v0) 
must  have,  at  that  point,  the  slope  y (x0)  equal  to  the  value  of/ at  that  point;  that  is, 


/(*o)  = fUo,  >’o)- 


Using  this  fact,  we  can  develop  graphic  or  numeric  methods  for  obtaining  approximate 
solutions  of  ODEs  (1).  This  will  lead  to  a better  conceptual  understanding  of  an  ODE  (1). 
Moreover,  such  methods  are  of  practical  importance  since  many  ODEs  have  complicated 
solution  formulas  or  no  solution  formulas  at  all,  whereby  numeric  methods  are  needed. 


Graphic  Method  of  Direction  Fields.  Practical  Example  Illustrated  in  Fig.  7.  We 

can  show  directions  of  solution  curves  of  a given  ODE  (1)  by  drawing  short  straight-line 
segments  (lineal  elements)  in  the  xy-plane.  This  gives  a direction  field  (or  slope  field) 
into  which  you  can  then  fit  (approximate)  solution  curves.  This  may  reveal  typical 
properties  of  the  whole  family  of  solutions. 

Figure  7 shows  a direction  field  for  the  ODE 

(2)  y'  = y + x 

obtained  by  a CAS  (Computer  Algebra  System)  and  some  approximate  solution  curves 
fitted  in. 


\\\\\\\\  \ \_2  tv  \ \ \ \ \ v 


Fig.  7.  Direction  field  of  y'  = y + x,  with  three  approximate  solution 
curves  passing  through  (0, 1),  (0,  0),  (0,  —1),  respectively 


10 


CHAP.  1 First-Order  ODEs 


If  you  have  no  CAS,  first  draw  a few  level  curves  f(x,  y ) = const  of  f(x,  y),  then  parallel 
lineal  elements  along  each  such  curve  (which  is  also  called  an  isocline,  meaning  a curve 
of  equal  inclination),  and  finally  draw  approximation  curves  fit  to  the  lineal  elements. 

We  shall  now  illustrate  how  numeric  methods  work  by  applying  the  simplest  numeric 
method,  that  is  Euler’s  method,  to  an  initial  value  problem  involving  ODE  (2).  First  we 
give  a brief  description  of  Euler’s  method. 

Numeric  Method  by  Euler 

Given  an  ODE  (1)  and  an  initial  value  y(xo)  = yo,  Euler’s  method  yields  approximate 
solution  values  at  equidistant  x- values  x0,  x\  = xq  + h,  x2  = x'o  + 2 h,  • • ■ , namely, 

yi  = yo  + hf(x o,  y0)  (Fig.  8) 
yz  = y i + ¥(xi,yi),  etc. 


In  general. 


yn  = yn- 1 + hf(xn-i,yn-d 

where  the  step  h equals,  e.g.,  0.1  or  0.2  (as  in  Table  1.1)  or  a smaller  value  for  greater 
accuracy. 


Fig.  8.  First  Euler  step,  showing  a solution  curve,  its  tangent  at  (x0,  y0), 
step  h and  increment  hf(x0,  y0)  in  the  formula  for 


Table  1.1  shows  the  computation  of  n = 5 steps  with  step  h = 0.2  for  the  ODE  (2)  and 
initial  condition  y(0)  = 0,  corresponding  to  the  middle  curve  in  the  direction  field.  We 
shall  solve  the  ODE  exactly  in  Sec.  1.5.  For  the  time  being,  verify  that  the  initial  value 
problem  has  the  solution  y = ex  — x — 1.  The  solution  curve  and  the  values  in  Table  1.1 
are  shown  in  Fig.  9.  These  values  are  rather  inaccurate.  The  errors  y(xn)  — yn  are  shown 
in  Table  1.1  as  well  as  in  Fig.  9.  Decreasing  h would  improve  the  values,  but  would  soon 
require  an  impractical  amount  of  computation.  Much  better  methods  of  a similar  nature 
will  be  discussed  in  Sec.  21.1. 


SEC.  1.2  Geometric  Meaning  of  y'  = f[x,  y).  Direction  Fields,  Euler’s  Method 
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Table  1."  Euler  method  fory’  = y + x,y(0)  = 0 for 
x = 0,  , 1.0  with  step  h — 0.2 


n 

xn 

yn 

y(Xn) 

Error 

0 

0.0 

0.000 

0.000 

0.000 

1 

0.2 

0.000 

0.021 

0.021 

2 

0.4 

0.04 

0.092 

0.052 

3 

0.6 

0.128 

0.222 

0.094 

4 

0.8 

0.274 

0.426 

0.152 

5 

1.0 

0.488 

0.718 

0.230 

y 


Fig.  9.  Euler  method:  Approximate  values  in  Table  1.1  and  solution  curve 


P^RQBL=E:M=S^T~1TZ 
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DIRECTION  FIELDS,  SOLUTION  CURVES 


Graph  a direction  field  (by  a CAS  or  by  hand).  In  the  field 
graph  several  solution  curves  by  hand,  particularly  those 
passing  through  the  given  points  ( x , y). 

1.  y = 1 + y2,  (j7 r,  1) 

2.  yy'  + 4x  = 0,  (1,  1),  (0,  2) 

3.  y'  = 1 - v2,  (0,  0),  (2,  i) 

4.  y'  =2y  - y2,  (0,  0),  (0,  1 ),  (0,  2),  (0,  3) 

5.  y = x - 1/y,  (1,  |) 

6.  y'  = sin2y,  (0,  -0.4),  (0,  1) 

7.  v'  = ey/x , (2,  2),  (3,3) 

8.  y'  = -2xy,  (0,  |),  (0,  1),  (0,  2) 
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ACCURACY  OF  DIRECTION  FIELDS 


Direction  fields  are  very  useful  because  they  can  give  you 
an  impression  of  all  solutions  without  solving  the  ODE, 
which  may  be  difficult  or  even  impossible.  To  get  a feel  for 
the  accuracy  of  the  method,  graph  a field,  sketch  solution 
curves  in  it,  and  compare  them  with  the  exact  solutions. 

9.  y = cos  ttx 

10.  y = - 5y V2  (Sol.  Vy  + f x = c) 

11.  Autonomous  ODE.  This  means  an  ODE  not  showing 
x (the  independent  variable)  explicitly.  (The  ODEs  in 
Probs.  6 and  10  are  autonomous.)  What  will  the  level 
curves /(jc,  y)  = const  (also  called  isoclines  = curves 


of  equal  inclination)  of  an  autonomous  ODE  look  like? 
Give  reason. 
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MOTIONS 


Model  the  motion  of  a body  B on  a straight  line  with 
velocity  as  given,  y(f)  being  the  distance  of  B from  a point 
y = 0 at  time  t.  Graph  a direction  field  of  the  model  (the 
ODE).  In  the  field  sketch  the  solution  curve  satisfying  the 
given  initial  condition. 


12.  Product  of  velocity  times  distance  constant,  equal  to  2, 
y(0)  = 2. 

13.  Distance  = Velocity  X Time,  y(l)  = 1 

14.  Square  of  the  distance  plus  square  of  the  velocity  equal 
to  1 , initial  distance  1 / V2 

15.  Parachutist.  Two  forces  act  on  a parachutist,  the 
attraction  by  the  earth  mg  (m  = mass  of  person  plus 
equipment,  g = 9.8  m/sec2 the  acceleration  of  gravity) 
and  the  air  resistance,  assumed  to  be  proportional  to  the 
square  of  the  velocity  v(t).  Using  Newton’s  second  law 
of  motion  (mass  X acceleration  = resultant  of  the  forces), 
set  up  a model  (an  ODE  for  v(t )).  Graph  a direction  field 
(choosing  m and  the  constant  of  proportionality  equal  to  1). 
Assume  that  the  parachute  opens  when  v = 10  m/sec. 
Graph  the  corresponding  solution  in  the  field.  What  is  the 
limiting  velocity?  Would  the  parachute  still  be  sufficient 
if  the  air  resistance  were  only  proportional  to  u(r)? 


12 


CHAP.  1 First-Order  ODEs 


16.  CAS  PROJECT.  Direction  Fields.  Discuss  direction 
fields  as  follows. 

(a)  Graph  portions  of  the  direction  field  of  the  ODE  (2) 
(see  Fig.  7),  for  instance,  —5  S x S 2,  — 1 S y = 5. 
Explain  what  you  have  gained  by  this  enlargement  of 
the  portion  of  the  field. 

(b)  Using  implicit  differentiation,  find  an  ODE  with 
the  general  solution  x2  + 9y2  = c (y  > 0).  Graph  its 
direction  field.  Does  the  field  give  the  impression 
that  the  solution  curves  may  be  semi-ellipses?  Can  you 
do  similar  work  for  circles?  Hyperbolas?  Parabolas? 
Other  curves? 

(c)  Make  a conjecture  about  the  solutions  of  y = — x/y 
from  the  direction  field. 

(d)  Graph  the  direction  field  of  y'  = — \y  and  some 
solutions  of  your  choice.  How  do  they  behave?  Why 
do  they  decrease  for  y > 0? 
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EULER’S  METHOD 


This  is  the  simplest  method  to  explain  numerically  solving 
an  ODE,  more  precisely,  an  initial  value  problem  (IVP). 
(More  accurate  methods  based  on  the  same  principle  are 
explained  in  Sec.  21.1.)  Using  the  method,  to  get  a feel  for 
numerics  as  well  as  for  the  nature  of  IVPs,  solve  the  IVP 
numerically  with  a PC  or  a calculator,  10  steps.  Graph  the 
computed  values  and  the  solution  curve  on  the  same 
coordinate  axes. 


17. 

r 

y = 

: y. 

v(0)  = l. 

h 

= 

0.1 

18. 

/ 

y = 

: y. 

v(0)  = l. 

h 

= 

0.01 

19. 

/ 

y = 

= (y  ■ 

- x)2 , y(0) 

= 

0, 

h = 0.1 

Sol. 

y = 

x — tanh  x 

20. 

t 

y = 

= — 5x4y2,  y(0) 

= 

1, 

h = 0.2 

Sol. 

y = 

l/G  + x)5 

1.3  Separable  ODEs.  Modeling 

Many  practically  useful  ODEs  can  be  reduced  to  the  form 

(l)  g(y)y'  = f(x) 


by  purely  algebraic  manipulations.  Then  we  can  integrate  on  both  sides  with  respect  to  x, 
obtaining 


(2) 


g(y)y'dx 


f(x)  dx  + c. 


On  the  left  we  can  switch  to  y as  the  variable  of  integration.  By  calculus,  y dx  = dy,  so  that 


(3) 


g(y)dy 


f(x ) dx  + c. 


If  / and  g are  continuous  functions,  the  integrals  in  (3)  exist,  and  by  evaluating  them  we 
obtain  a general  solution  of  (1).  This  method  of  solving  ODEs  is  called  the  method  of 
separating  variables,  and  (1 ) is  called  a separable  equation,  because  in  (3)  the  variables 
are  now  separated:  x appears  only  on  the  right  and  y only  on  the  left. 

E X A M Separable  ODE 

The  ODE  y = 1 + y2  is  separable  because  it  can  be  written 
dy 

= dx.  By  integration,  arctan  y = x + c or  y = tan  (x  4-  c ). 

1 4- 


It  is  very  important  to  introduce  the  constant  of  integration  immediately  when  the  integration  is  performed. 
If  we  wrote  arctan  y = x,  then  y = tan  x,  and  then  introduced  c,  we  would  have  obtained  y = tan  x + c,  which 
is  not  a solution  (when  c =£  0).  Verify  this. 


SEC.  1.3  Separable  ODEs.  Modeling 


13 


EXAMPLE  2 


EXAMPLE  3 


EXAMPLE  4 


Separable  ODE 

The  ODE  y = (x  + l)e~xy2  is  separable;  we  obtain  y~2 dy  = (x  + 1 )e~x dx. 


By  integration. 


— y 1 = — (x  + 2)e  x + c,  y = 


1 


(x  + 2)e  — c 


Initial  Value  Problem  (IVP).  Bell-Shaped  Curve 

Solve  y'  = —2xy,y(0)  = 1.8. 

Solution.  By  separation  and  integration. 


— = —2x  dx,  In  y = —x2  + c,  y = ce  ^ . 

y 

This  is  the  general  solution.  From  it  and  the  initial  condition,  y(0)  = ce°  = c = 1.8.  Hence  the  IVP  has  the 
solution  y = 1.8e~x  . This  is  a particular  solution,  representing  a bell-shaped  curve  (Fig.  10). 


y 

\ 

/ 

\ 

/ 

i 

- \ 

/ 

\ 

/ 

\ 

i — i 

i ^ i 

-2-1  0 1 2 * 


Fig.  10.  Solution  in  Example  3 (bell-shaped  curve) 


Modeling 

The  importance  of  modeling  was  emphasized  in  Sec.  1.1,  and  separable  equations  yield 
various  useful  models.  Let  us  discuss  this  in  terms  of  some  typical  examples. 

Radiocarbon  Dating2 

In  September  1991  the  famous  Iceman  (Oetzi),  a mummy  from  the  Neolithic  period  of  the  Stone  Age  found  in 
the  ice  of  the  Oetztal  Alps  (hence  the  name  “Oetzi”)  in  Southern  Tyrolia  near  the  Austrian-Italian  border,  caused 
a scientific  sensation.  When  did  Oetzi  approximately  live  and  die  if  the  ratio  of  carbon  to  carbon  1|C  in 
this  mummy  is  52.5%  of  that  of  a living  organism? 

Physical  Information.  In  the  atmosphere  and  in  living  organisms,  the  ratio  of  radioactive  carbon  1|C  (made 
radioactive  by  cosmic  rays)  to  ordinary  carbon  is  constant.  When  an  organism  dies,  its  absorption  of 
by  breathing  and  eating  terminates.  Hence  one  can  estimate  the  age  of  a fossil  by  comparing  the  radioactive 
carbon  ratio  in  the  fossil  with  that  in  the  atmosphere.  To  do  this,  one  needs  to  know  the  half-life  of  1|C,  which 
is  5715  years  ( CRC  Handbook  of  Chemistry  and  Physics,  83rd  ed.,  Boca  Raton:  CRC  Press,  2002,  page  11-52, 
line  9). 

Solution.  Modeling.  Radioactive  decay  is  governed  by  the  ODE  y'  = ky  (see  Sec.  1.1,  Example  5).  By 
separation  and  integration  (where  t is  time  and  yo  is  the  initial  ratio  of  qC  to  gC) 

— = kdt.  In  \y\  = kt  + c,  y = y0 ekt  (y0  = ec). 

y 


2Method  by  WILLARD  FRANK  LIBBY  (1908-1980),  American  chemist,  who  was  awarded  for  this  work 
the  1960  Nobel  Prize  in  chemistry. 
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EXAMPLE  5 


Next  we  use  the  half-life  H = 5715  to  determine  k.  When  t = H,  half  of  the  original  substance  is  still  present.  Thus, 

1.11  J.TT  In  0.5  0.693 

y0ekH  = 0.5^o,  ekH  = 0.5,  k = = -----  = -0.0001213. 

H 5715 

Finally,  we  use  the  ratio  52.5%  for  determining  the  time  t when  Oetzi  died  (actually,  was  killed), 

ekt  = £-0-0001213t  = 0.525,  t = = 5312.  Answer:  About  5300  years  ago. 

-0.0001213  J 6 

Other  methods  show  that  radiocarbon  dating  values  are  usually  too  small.  According  to  recent  research,  this  is 
due  to  a variation  in  that  carbon  ratio  because  of  industrial  pollution  and  other  factors,  such  as  nuclear  testing. 

Mixing  Problem 

Mixing  problems  occur  quite  frequently  in  chemical  industry.  We  explain  here  how  to  solve  the  basic  model 
involving  a single  tank.  The  tank  in  Fig.  1 1 contains  1000  gal  of  water  in  which  initially  100  lb  of  salt  is  dissolved. 
Brine  runs  in  at  a rate  of  10  gal/min,  and  each  gallon  contains  5 lb  of  dissoved  salt.  The  mixture  in  the  tank  is 
kept  uniform  by  stirring.  Brine  runs  out  at  10  gal/min.  Find  the  amount  of  salt  in  the  tank  at  any  time  t. 

Solution.  Step  1.  Setting  up  a model.  Let  y{t)  denote  the  amount  of  salt  in  the  tank  at  time  t.  Its  time  rate 
of  change  is 

y = Salt  inflow  rate  — Salt  outflow  rate  Balance  law. 

5 lb  times  10  gal  gives  an  inflow  of  50  lb  of  salt.  Now,  the  outflow  is  10  gal  of  brine.  This  is  10/1000  = 0.01 
(=  1%)  of  the  total  brine  content  in  the  tank,  hence  0.01  of  the  salt  content  y(r),  that  is,  0.01  y(t).  Thus  the 
model  is  the  ODE 

(4)  y = 50  — O.Oly  = -0.01(y  - 5000). 

Step  2.  Solution  of  the  model.  The  ODE  (4)  is  separable.  Separation,  integration,  and  taking  exponents  on  both 
sides  gives 


= -0.01  dt , In  Iv  - 5000 1 = -0.01?  + c*  y - 5000  = ce~omt. 

y - 5000  } 

Initially  the  tank  contains  100  lb  of  salt.  Hence  y(0)  = 100  is  the  initial  condition  that  will  give  the  unique 
solution.  Substituting  y = 100  and  t = 0 in  the  last  equation  gives  100  — 5000  = ce  = c.  Hence  c = —4900. 
Hence  the  amount  of  salt  in  the  tank  at  time  t is 

(5)  y(t)  = 5000  - 4900e~omt. 

This  function  shows  an  exponential  approach  to  the  limit  5000  lb;  see  Fig.  11.  Can  you  explain  physically  that 
y(r)  should  increase  with  time?  That  its  limit  is  5000  lb?  Can  you  see  the  limit  directly  from  the  ODE? 

The  model  discussed  becomes  more  realistic  in  problems  on  pollutants  in  lakes  (see  Problem  Set  1.5,  Prob.  35) 
or  drugs  in  organs.  These  types  of  problems  are  more  difficult  because  the  mixing  may  be  imperfect  and  the  flow 
rates  (in  and  out)  may  be  different  and  known  only  very  roughly. 


Tank 


Fig.  11.  Mixing  problem  in  Example  5 


SEC.  1.3  Separable  ODEs.  Modeling 
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EXAMPLE  6 


Heating  an  Office  Building  (Newton’s  Law  of  Cooling3) 

Suppose  that  in  winter  the  daytime  temperature  in  a certain  office  building  is  maintained  at  70°F.  The  heating 
is  shut  off  at  10  P.M.  and  turned  on  again  at  6 a.m.  On  a certain  day  the  temperature  inside  the  building  at  2 A.M. 
was  found  to  be  65 °F.  The  outside  temperature  was  50°F  at  10  P.M.  and  had  dropped  to  40°F  by  6 a.m.  What 
was  the  temperature  inside  the  building  when  the  heat  was  turned  on  at  6 A.M.? 

Physical  information.  Experiments  show  that  the  time  rate  of  change  of  the  temperature  T of  a body  B (which 
conducts  heat  well,  for  example,  as  a copper  ball  does)  is  proportional  to  the  difference  between  T and  the 
temperature  of  the  surrounding  medium  (Newton’s  law  of  cooling). 

Solution.  Step  1.  Setting  up  a model.  Let  T{t)  be  the  temperature  inside  the  building  and  TA  the  outside 
temperature  (assumed  to  be  constant  in  Newton’s  law).  Then  by  Newton’s  law, 

dT  , 

(6)  — = Kr  ~ W. 

dt 

Such  experimental  laws  are  derived  under  idealized  assumptions  that  rarely  hold  exactly.  However,  even  if  a 
model  seems  to  fit  the  reality  only  poorly  (as  in  the  present  case),  it  may  still  give  valuable  qualitative  information. 
To  see  how  good  a model  is,  the  engineer  will  collect  experimental  data  and  compare  them  with  calculations 
from  the  model. 

Step  2.  General  solution.  We  cannot  solve  (6)  because  we  do  not  know  TA , just  that  it  varied  between  50°F 
and  40°F,  so  we  follow  the  Golden  Rule:  If  you  cannot  solve  your  problem,  try  to  solve  a simpler  one.  We 
solve  (6)  with  the  unknown  function  TA  replaced  with  the  average  of  the  two  known  values,  or  45 °F.  For  physical 
reasons  we  may  expect  that  this  will  give  us  a reasonable  approximate  value  of  T in  the  building  at  6 a.m. 

For  constant  7^  — 45  (or  any  other  constant  value)  the  ODE  (6)  is  separable.  Separation,  integration,  and 
taking  exponents  gives  the  general  solution 

T dT ^ = k dt,  In  | T — 45 1 = kt  + c*,  T(t)  = 45  + cekt  (c  = ec'). 

Step  3.  Particular  solution.  We  choose  10  P.M.  to  be  t = 0.  Then  the  given  initial  condition  is  7(0)  = 70  and 
yields  a particular  solution,  call  it  Tp.  By  substitution, 

7X0)  = 45  + ce°  = 70,  c = 70  - 45  = 25,  Tp(t)  = 45  + 25ekt. 

Step  4.  Determination  ofk.  We  use  7(4)  = 65,  where  t = 4 is  2 a.m.  Solving  algebraically  for  k and  inserting 
k into  Tp(f)  gives  (Fig.  12) 

7^(4)  = 45  + 25e4fc  = 65,  e4fc  = 0.8,  k = \ In  0.8  = -0.056,  Tv(t)  = 45  + 25e~0056t. 


Fig.  12.  Particular  solution  (temperature)  in  Example  6 


3Sir  ISAAC  NEWTON  (1642-1727),  great  English  physicist  and  mathematician,  became  a professor  at 
Cambridge  in  1669  and  Master  of  the  Mint  in  1699.  He  and  the  German  mathematician  and  philosopher 
GOTTFRIED  WILHELM  LEIBNIZ  (1646-1716)  invented  (independently)  the  differential  and  integral  calculus. 
Newton  discovered  many  basic  physical  laws  and  created  the  method  of  investigating  physical  problems  by 
means  of  calculus.  His  Philosophiae  naturalis  principia  mathematica  (. Mathematical  Principles  of  Natural 
Philosophy,  1687)  contains  the  development  of  classical  mechanics.  His  work  is  of  greatest  importance  to  both 
mathematics  and  physics. 


16 


CHAP.  1 First-Order  ODEs 


EXAMPLE  7 


Step  5.  Answer  and  interpretation.  6 A.M.  is  f = 8 (namely.  8 hours  after  10  P.M.),  and 

Tp( 8)  = 45  + 25e-0'056'8  = 61[°F], 


Hence  the  temperature  in  the  building  dropped  9°F,  a result  that  looks  reasonable. 


Leaking  Tank.  Outflow  of  Water  Through  a Hole  (Torricelli’s  Law) 

This  is  another  prototype  engineering  problem  that  leads  to  an  ODE.  It  concerns  the  outflow  of  water  from  a 
cylindrical  tank  with  a hole  at  the  bottom  (Fig.  13).  You  are  asked  to  find  the  height  of  the  water  in  the  tank  at 
any  time  if  the  tank  has  diameter  2 m,  the  hole  has  diameter  1 cm,  and  the  initial  height  of  the  water  when  the 
hole  is  opened  is  2.25  m.  When  will  the  tank  be  empty? 

Physical  information.  Under  the  influence  of  gravity  the  outflowing  water  has  velocity 

(7)  v(r)  = 0.600 \/2 gh(t)  (Torricelli’s  law4), 

where  h(t)  is  the  height  of  the  water  above  the  hole  at  time  t,  and  g = 980cm/sec2  = 32.17  ft/sec2  is  the 
acceleration  of  gravity  at  the  surface  of  the  earth. 

Solution.  Step  1.  Setting  up  the  model.  To  get  an  equation,  we  relate  the  decrease  in  water  level  h(t)  to  the 
outflow.  The  volume  AV  of  the  outflow  during  a short  time  At  is 

AV  = Av  At  (A  = Area  of  hole). 

AV  must  equal  the  change  AV*  of  the  volume  of  the  water  in  the  tank.  Now 

AV*  = —B  Ah  ( B = Cross-sectional  area  of  tank) 

where  Ah  (>  0)  is  the  decrease  of  the  height  h(t)  of  the  water.  The  minus  sign  appears  because  the  volume  of 
the  water  in  the  tank  decreases.  Equating  AV  and  AV*  gives 

-B  Ah  = Av  At. 


We  now  express  v according  to  Torricelli’s  law  and  then  let  At  (the  length  of  the  time  interval  considered) 
approach  0 — this  is  a standard  way  of  obtaining  an  ODE  as  a model.  That  is,  we  have 


Ah  A 
~At  = ~~BV 


-f  0.600V2 gh(t) 
B 


and  by  letting  At  — > 0 we  obtain  the  ODE 

— = -26.56  -VS. 
dt  B 

where  26.56  = 0.600 V2  • 980.  This  is  our  model,  a first-order  ODE. 

Step  2.  General  solution.  Our  ODE  is  separable.  A/B  is  constant.  Separation  and  integration  gives 

dh  A A 

— — = —26.56  — dt  and  2V/z  = c*  — 26.56  — t. 

Vh  B B 

Dividing  by  2 and  squaring  gives  h = (c  — 1 3. 2SAt/B)2.  Inserting  13.28A/B  = 13.28  • 0.527r/10027T  = 0.000332 
yields  the  general  solution 


h{t)  = (c  - 0.000  332r)2 


4EVANGELISTA  TORRICELLI  (1608-1647),  Italian  physicist,  pupil  and  successor  of  GALILEO  GALILEI 
(1564-1642)  at  Florence.  The  “contraction  factor”  0.600  was  introduced  by  J.  C.  BORDA  in  1766  because  the 
stream  has  a smaller  cross  section  than  the  area  of  the  hole. 


SEC.  1.3  Separable  ODEs.  Modeling 
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Step  3.  Particular  solution.  The  initial  height  (the  initial  condition)  is  h( 0)  = 225  cm.  Substitution  of  t = 0 
and  h = 225  gives  from  the  general  solution  c = 225,  c = 15.00  and  thus  the  particular  solution  (Fig.  13) 


hJt)  = (15.00  - 0.0003320 


Step  4.  Tank  empty.  hv(t ) = 0 if  t = 15.00/0.000332  = 45,181 
Here  you  see  distinctly  the  importance  of  the  choice  of  units 
in  which  time  is  measured  in  seconds!  We  used  g = 980  cm/sec2. 


= 12.6  [hours], 
we  have  been  working  with  the  cgs  system. 


Step  5.  Checking.  Check  the  result. 


2.25  m 


^—2.00 

Water 
at  ti 

. i 

1 

hit) 

^ . 

Outflowing 

water 


h 

250  - 
200  -\ 

150  - 
100  - 
50  - 

Q I I I ! I 

0 10000  30000  50000  t 


Tank  Water  level  hit)  in  tank 

Fig.  13  Example  7.  Outflow  from  a cylindrical  tank  (“leaking  tank"). 
Torricelli’s  law 


Extended  Method:  Reduction  to  Separable  Form 

Certain  nonseparable  ODEs  can  be  made  separable  by  transformations  that  introduce  for 
y a new  unknown  function.  We  discuss  this  technique  for  a class  of  ODEs  of  practical 
importance,  namely,  for  equations 


(8) 


Here, /is  any  (differentiable)  function  of  y/x,  such  as  sin(y/x),  (y/x)4,  and  so  on.  (Such 
an  ODE  is  sometimes  called  a homogeneous  ODE,  a term  we  shall  not  use  but  reserve 
for  a more  important  purpose  in  Sec.  1.5.) 

The  form  of  such  an  ODE  suggests  that  we  set  y/x  = u;  thus, 

(9)  y = ux  and  by  product  differentiation  y'  = u x + u. 

Substitution  into  y'  = fiy/x)  then  gives  u x + u = f{u)  or  u x = f(u)  — u.  We  see  that 
if  f{u)  — u =/=  0,  this  can  be  separated: 


du  dx 

f(u ) — u x 


(10) 
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EXAMPLE  8 Reduction  to  Separable  Form 

Solve 


~ / 2 2 

2 xyy  = y — x . 

Solution.  To  get  the  usual  explicit  form,  divide  the  given  equation  by  2xy, 


2 2 

y -x  y x 


Ixy  2x  2 y 

Now  substitute  y and  y from  (9)  and  then  simplify  by  subtracting  u on  both  sides. 


, u 1 

U X + u = — — — , 
2 2 u 


, _ _U  _ 1 — u - 1 

2 2m  2m 


You  see  that  in  the  last  equation  you  can  now  separate  the  variables, 

2 u du  dx  . n | | 

= . By  integration.  In  (1  + mz)  = — In  |jc|  + c*  = In 


1 + u 


+ c*. 


Take  exponents  on  both  sides  to  get  1 + m2  = c/x  or  1 + (y/x)2  = c/x.  Multiply  the  last  equation  by  x2  to 
obtain  (Fig.  14) 


2 , 2 

x + y = ex 


Thus 


v = 


This  general  solution  represents  a family  of  circles  passing  through  the  origin  with  centers  on  the  x-axis. 


Fig.  14.  General  solution  (family  of  circles)  in  Example  8 


PRQ&LE:M=SET  1T1 


1.  CAUTION!  Constant  of  integration.  Why  is  it 

important  to  introduce  the  constant  of  integration 
immediately  when  you  integrate? 


2-10 


GENERAL  SOLUTION 


Find  a general  solution.  Show  the  steps  of  derivation.  Check 
your  answer  by  substitution. 

2.  y y + x =0 

3.  y = sec  y 

4.  y'  sin  27 tx  = Try  cos  27 tx 

5.  yy'  + 36x  = 0 

6.  y'  = e2x-V 


7.  xy'  = y + 2x3  sin2  — (Set  y/x  = u) 

8.  y = (y  + 4x)2  (Set  y + 4x  = v) 

9.  xy'  = y2  + y (Set  y/x  = u ) 

10.  xy'  = x + y (Sety/.r  = u) 


INITIAL  VALUE  PROBLEMS  (IVPs) 

Solve  the  IVP.  Show  the  steps  of  derivation,  beginning  with 
the  general  solution. 
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11.  xy'  + y = 0,  y(4)  = 6 

12.  y = 1 + 4y2,  y(l)  = 0 

13.  yrcosh2x  = sin2y,  y(0)  = t 

14.  dr/dt  = —2 tr,  r( 0)  = r0 

15.  y'  = -4x/y,  y( 2)  = 3 

16.  y'  = (x  + y - 2)2,  y(0)  = 2 

(Set  v = x + y - 2) 

17.  xy'  = y + 3x4  cos2  (y/x),  y(l)  = 0 
(Set  y/x  = u ) 


18.  Particular  solution.  Introduce  limits  of  integration  in 
(3)  such  that  y obtained  from  (3)  satisfies  the  initial 
condition  y(x0)  = y0. 


SEC.  1.3  Separable  ODEs.  Modeling 
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MODELING,  APPLICATIONS 


19.  Exponential  growth.  If  the  growth  rate  of  the  number 
of  bacteria  at  any  time  t is  proportional  to  the  number 
present  at  t and  doubles  in  1 week,  how  many  bacteria 
can  be  expected  after  2 weeks?  After  4 weeks? 


20.  Another  population  model. 

(a)  If  the  birth  rate  and  death  rate  of  the  number  of 
bacteria  are  proportional  to  the  number  of  bacteria 
present,  what  is  the  population  as  a function  of  time. 


(b)  What  is  the  limiting  situation  for  increasing  time? 
Interpret  it. 

21.  Radiocarbon  dating.  What  should  be  the  1|C  content 
(in  percent  of  y0)  of  a fossilized  tree  that  is  claimed  to 
be  3000  years  old?  (See  Example  4.) 


22.  Linear  accelerators  are  used  in  physics  for 
accelerating  charged  particles.  Suppose  that  an  alpha 
particle  enters  an  accelerator  and  undergoes  a constant 
acceleration  that  increases  the  speed  of  the  particle 
from  103  m/sec  to  104  m/sec  in  10-3  sec.  Find  the 
acceleration  a and  the  distance  traveled  during  that 
period  of  10-3  sec. 

23.  Boyle-Mariotte’s  law  for  ideal  gases.5  Experiments 
show  for  a gas  at  low  pressure  p (and  constant 
temperature)  the  rate  of  change  of  the  volume  V(p) 
equals  ~V/p.  Solve  the  model. 

24.  Mixing  problem.  A tank  contains  400  gal  of  brine 
in  which  100  lb  of  salt  are  dissolved.  Fresh  water  runs 
into  the  tank  at  a rate  of  2 gal/min.The  mixture,  kept 
practically  uniform  by  stirring,  runs  out  at  the  same 
rate.  How  much  salt  will  there  be  in  the  tank  at  the 
end  of  1 hour? 


25.  Newton’s  law  of  cooling.  A thermometer,  reading 
5°C,  is  brought  into  a room  whose  temperature  is  22°C. 
One  minute  later  the  thermometer  reading  is  12°C. 
How  long  does  it  take  until  the  reading  is  practically 
22°C,  say,  21.9°C? 

26.  Gompertz  growth  in  tumors.  The  Gompertz  model 
is  y = —Ay  In  y (A  > 0),  where  y(t)  is  the  mass  of 
tumor  cells  at  time  t.  The  model  agrees  well  with 
clinical  observations.  The  declining  growth  rate  with 
increasing  y > 1 corresponds  to  the  fact  that  cells  in 
the  interior  of  a tumor  may  die  because  of  insufficient 
oxygen  and  nutrients.  Use  the  ODE  to  discuss  the 
growth  and  decline  of  solutions  (tumors)  and  to  find 
constant  solutions.  Then  solve  the  ODE. 

27.  Dryer.  If  a wet  sheet  in  a dryer  loses  its  moisture  at 
a rate  proportional  to  its  moisture  content,  and  if  it 
loses  half  of  its  moisture  during  the  first  10  min  of 


drying,  when  will  it  be  practically  dry,  say,  when  will 
it  have  lost  99%  of  its  moisture?  First  guess,  then 
calculate. 

28.  Estimation.  Could  you  see,  practically  without  calcu- 
lation, that  the  answer  in  Prob.  27  must  lie  between 
60  and  70  min?  Explain. 

29.  Alibi?  Jack,  arrested  when  leaving  a bar,  claims  that 
he  has  been  inside  for  at  least  half  an  hour  (which 
would  provide  him  with  an  alibi).  The  police  check 
the  water  temperature  of  his  car  (parked  near  the 
entrance  of  the  bar)  at  the  instant  of  arrest  and  again 
30  min  later,  obtaining  the  values  190°F  and  110°F, 
respectively.  Do  these  results  give  Jack  an  alibi? 
(Solve  by  inspection.) 

30.  Rocket.  A rocket  is  shot  straight  up  from  the  earth, 
with  a net  acceleration  (=  acceleration  by  the  rocket 
engine  minus  gravitational  pullback)  of  7fm/sec2 
during  the  initial  stage  of  flight  until  the  engine  cut  out 
at  t = 10  sec.  How  high  will  it  go,  air  resistance 
neglected? 

31.  Solution  curves  of  y'  = g(y/x).  Show  that  any 
(nonvertical)  straight  line  through  the  origin  of  the 
xy-plane  intersects  all  these  curves  of  a given  ODE  at 
the  same  angle. 

32.  Friction.  If  a body  slides  on  a surface,  it  experiences 
friction  F (a  force  against  the  direction  of  motion). 
Experiments  show  that  |.F|  = p\N\  (Coulomb’s6  law  of 
kinetic  friction  without  lubrication),  where  N is  the 
normal  force  (force  that  holds  the  two  surfaces  together; 
see  Fig.  15)  and  the  constant  of  proportionality  p is 
called  the  coefficient  of  kinetic  friction.  In  Fig.  15 
assume  that  the  body  weighs  45  nt  (about  10  lb;  see 
front  cover  for  conversion),  p = 0.20  (corresponding 
to  steel  on  steel),  a = 30°,  the  slide  is  10  m long,  the 
initial  velocity  is  zero,  and  air  resistance  is 
negligible.  Find  the  velocity  of  the  body  at  the  end 
of  the  slide. 


5R0BERT  BOYLE  (1627-1691),  English  physicist  and  chemist,  one  of  the  founders  of  the  Royal  Society.  EDME  MARIOTTE  (about 
1620-1684),  French  physicist  and  prior  of  a monastry  near  Dijon.  They  found  the  law  experimentally  in  1662  and  1676,  respectively. 

6CHARLES  AUGUSTIN  DE  COULOMB  (1736-1806).  French  physicist  and  engineer. 
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33.  Rope.  To  tie  a boat  in  a harbor,  how  many  times 
must  a rope  be  wound  around  a bollard  (a  vertical 
rough  cylindrical  post  fixed  on  the  ground)  so  that  a 
man  holding  one  end  of  the  rope  can  resist  a force 
exerted  by  the  boat  1000  times  greater  than  the  man 
can  exert?  First  guess.  Experiments  show  that  the 
change  AS  of  the  force  5 in  a small  portion  of  the 
rope  is  proportional  to  S and  to  the  small  angle  \(f> 
in  Fig.  16.  Take  the  proportionality  constant  0.15. 
The  result  should  surprise  you! 


this  as  the  condition  for  the  two  families  to  be 
orthogonal  (i.e.,  to  intersect  at  right  angles)?  Do  your 
graphs  confirm  this? 

(e)  Sketch  families  of  curves  of  your  own  choice  and 
find  their  ODEs.  Can  every  family  of  curves  be  given 
by  an  ODE? 

35.  CAS  PROJECT.  Graphing  Solutions.  A CAS  can 

usually  graph  solutions,  even  if  they  are  integrals  that 
cannot  be  evaluated  by  the  usual  analytical  methods  of 
calculus. 

(a)  Show  this  for  the  five  initial  value  problems 
y = e~x  , y(0)  = 0,  ±1,  ±2  graphing  all  five  curves 
on  the  same  axes. 

(b)  Graph  approximate  solution  curves,  using  the  first 
few  terms  of  the  Maclaurin  series  (obtained  by  term- 
wise  integration  of  that  of  y)  and  compare  with  the 
exact  curves. 

(c)  Repeat  the  work  in  (a)  for  another  ODE  and  initial 
conditions  of  your  own  choice,  leading  to  an  integral 
that  cannot  be  evaluated  as  indicated. 


34.  TEAM  PROJECT.  Family  of  Curves.  A family  of 
curves  can  often  be  characterized  as  the  general 
solution  of  y'  = f{x,  y). 

(a)  Show  that  for  the  circles  with  center  at  the  origin 
we  get  y = — x/y. 

(b)  Graph  some  of  the  hyperbolas  xy  = c.  Find  an 
ODE  for  them. 

(c)  Find  an  ODE  for  the  straight  lines  through  the 
origin. 

(d)  You  will  see  that  the  product  of  the  right  sides  of 
the  ODEs  in  (a)  and  (c)  equals  — 1 . Do  you  recognize 


36.  TEAM  PROJECT.  Torricelli’s  Law.  Suppose  that 
the  tank  in  Example  7 is  hemispherical,  of  radius  R, 
initially  full  of  water,  and  has  an  outlet  of  5 cm2  cross- 
sectional  area  at  the  bottom.  (Make  a sketch.)  Set 
up  the  model  for  outflow.  Indicate  what  portion  of 
your  work  in  Example  7 you  can  use  (so  that  it  can 
become  part  of  the  general  method  independent  of  the 
shape  of  the  tank).  Find  the  time  t to  empty  the  tank 
(a)  for  any  R,  (b)  for  R = 1 m.  Plot  t as  function  of 
R.  Find  the  time  when  h = R/ 2 (a)  for  any  R,  (b)  for 
R = 1 m. 


Exact  ODEs.  Integrating  Factors 

We  recall  from  calculus  that  if  a function  u(x,  y)  has  continuous  partial  derivatives,  its 
differential  (also  called  its  total  differential ) is 

du  dll 

du  = — dx  H dy. 

dx  dy 

From  this  it  follows  that  if  u(x,  y)  = c = const,  then  du  = 0. 

For  example,  if  u = x + x2y3  = c,  then 

du  = (1  + 2xy3)  dx  + 3 x2y2  dy  = 0 


or 

i dy  1 + 2 xy3 
y dx  3 x2y2 


SEC.  1.4  Exact  ODEs.  Integrating  Factors 
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an  ODE  that  we  can  solve  by  going  backward.  This  idea  leads  to  a powerful  solution 
method  as  follows. 

A first-order  ODE  Mix,  y)  + Nix,  y)y  = 0,  written  as  (use  dy  = y dx  as  in  Sec.  1.3) 

(1)  Mix,  y)  dx  + N(x,  y)  dy  = 0 

is  called  an  exact  differential  equation  if  the  differential  form  Mix,  y)  dx  + N(x,  y)  dy 
is  exact,  that  is,  this  form  is  the  differential 

du  du 

(2)  du  = — dx  H dy 

dx  dy 

of  some  function  u(x,  y).  Then  (1)  can  be  written 


du  = 0. 


By  integration  we  immediately  obtain  the  general  solution  of  (1)  in  the  form 
(3)  uix,  y)  = C. 

This  is  called  an  implicit  solution,  in  contrast  to  a solution  y = h(x)  as  defined  in  Sec. 
1.1,  which  is  also  called  an  explicit  solution,  for  distinction.  Sometimes  an  implicit  solution 
can  be  converted  to  explicit  form.  (Do  this  for  xz  + y2  = 1.)  If  this  is  not  possible,  your 
CAS  may  graph  a figure  of  the  contour  lines  (3)  of  the  function  u(x,  y)  and  help  you  in 
understanding  the  solution. 

Comparing  (1)  and  (2),  we  see  that  (1)  is  an  exact  differential  equation  if  there  is  some 
function  u(x,  y)  such  that 


du  du 

(4)  (a)  — = M,  (b)  — = N. 

dx  dy 

From  this  we  can  derive  a formula  for  checking  whether  (1)  is  exact  or  not,  as  follows. 

Let  M and  N be  continuous  and  have  continuous  first  partial  derivatives  in  a region  in 
the  xy- plane  whose  boundary  is  a closed  curve  without  self-intersections.  Then  by  partial 
differentiation  of  (4)  (see  App.  3.2  for  notation), 

dM  _ d2u 
dy  dy  dx’ 

dN  d2u 

dx  dx  dy 

By  the  assumption  of  continuity  the  two  second  partial  derivaties  are  equal.  Thus 

dM  _ dN 
dy  dx 


(5) 
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This  condition  is  not  only  necessary  but  also  sufficient  for  (1)  to  be  an  exact  differential 
equation.  (We  shall  prove  this  in  Sec.  10.2  in  another  context.  Some  calculus  books,  for 
instance,  [GenRef  12],  also  contain  a proof.) 

If  (1)  is  exact,  the  function  u{x,y)  can  be  found  by  inspection  or  in  the  following 
systematic  way.  From  (4a)  we  have  by  integration  with  respect  to  x 


(6) 


u 


M dx  + k(y ); 


in  this  integration,  y is  to  be  regarded  as  a constant,  and  k(y)  plays  the  role  of  a “constant” 
of  integration.  To  determine  k(y),  we  derive  du/dy  from  (6),  use  (4b)  to  get  dk/dy,  and 
integrate  dk/dy  to  get  k.  (See  Example  1 , below.) 

Formula  (6)  was  obtained  from  (4a).  Instead  of  (4a)  we  may  equally  well  use  (4b). 
Then,  instead  of  (6),  we  first  have  by  integration  with  respect  to  y 


(6*) 


u 


N dy  + l(x). 


To  determine  l(x),  we  derive  du/dx  from  (6*),  use  (4a)  to  get  dl/dx,  and  integrate.  We 
illustrate  all  this  by  the  following  typical  examples. 


An  Exact  ODE 

Solve 

(7)  cos  ( x + y)  dx  + (3y2  + 2 y + cos  (x  + y))  dy  = 0. 

Solution.  Step  1.  Test  for  exactness.  Our  equation  is  of  the  form  (1)  with 

M = cos  (x  + y), 

N = 3 y2  + 2y  + cos  (x  + y). 


Thus 


— = “sht  (x  + y), 
dy 


dN 

— = -sin  (x  + y). 
dx 


From  this  and  (5)  we  see  that  (7)  is  exact. 

Step  2.  Implicit  general  solution.  From  (6)  we  obtain  by  integration 

(8)  u = I M dx  + k(y)  = | cos  (x  + y)  dx  + k(y ) = sin  (x  + y)  + k{y). 

To  find  k(y),  we  differentiate  this  formula  with  respect  to  y and  use  formula  (4b),  obtaining 

du  dk  o 

— = cos  (x  + y)  H = N = 3 y + 2y  + cos  (x  + y). 

dy  dy 

Hence  dk/dy  = 3y2  + 2y.  By  integration,  k = y3  + y2  + c*.  Inserting  this  result  into  (8)  and  observing  (3), 
we  obtain  the  answer 


u(x , y)  = sin  (x  + y)  + y3  + y2  = c. 
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EXAMPLE  2 


EXAMPLE  3 


Step  3.  Checking  an  implicit  solution.  We  can  check  by  differentiating  the  implicit  solution  u{x,  y)  = c 
implicitly  and  see  whether  this  leads  to  the  given  ODE  (7): 

du  du  9 

(9)  du  = — dx  3 dy  = cos  ( x + y)  dx  + (cos  (x  + y)  + 3 y + 2y)  dy  = 0. 

dx  By 

This  completes  the  check. 

An  Initial  Value  Problem 

Solve  the  initial  value  problem 

(10)  (cosy  sinhx  + l)  dx  — sin  y cosh  x dy  = 0,  y(l)  = 2. 

Solution.  You  may  verify  that  the  given  ODE  is  exact.  We  find  u.  For  a change,  let  us  use  (6*), 

u = — J sin  y cosh  x dy  + l{x)  = cos  y cosh  x + l{x). 

From  this,  dw/dx  = cosy  sinhx  + dl/dx  = M = cosy  sinhx  + 1 . Hence  dl/dx  = 1 . By  integration, /(x)  = x + c*. 
This  gives  the  general  solution  u(x,  y)  = cos  y cosh  x + x = c.  From  the  initial  condition,  cos  2 cosh  1 + 1 = 
0.358  = c.  Hence  the  answer  is  cos  y cosh  x + x = 0.358.  Figure  17  shows  the  particular  solutions  fore  = 0,  0.358 
(thicker  curve),  1,  2,  3.  Check  that  the  answer  satisfies  the  ODE.  (Proceed  as  in  Example  1.)  Also  check  that  the 
initial  condition  is  satisfied. 


Fig.  17.  Particular  solutions  in  Example  2 

WARNING!  Breakdown  in  the  Case  of  Nonexactness 

The  equation  — y dx  + x dy  = 0 is  not  exact  because  M — —y  and  N = x,  so  that  in  (5),  BM/By  = — 1 but 
BN/Bx  = 1.  Let  us  show  that  in  such  a case  the  present  method  does  not  work.  From  (6), 

f Bu  dk 

u — \M  dx  + k(y)  = —xy  + k(y),  hence  — = -x  H . 

J dy  dy 

Now,  Bu/By  should  equal  N = x,  by  (4b).  However,  this  is  impossible  because  k(y)  can  depend  only  on  y.  Try 
(6*);  it  will  also  fail.  Solve  the  equation  by  another  method  that  we  have  discussed. 


Reduction  to  Exact  Form.  Integrating  Factors 

The  ODE  in  Example  3 is  —y  dx  + x dy  = 0.  It  is  not  exact.  However,  if  we  multiply  it 
by  1/x2,  we  get  an  exact  equation  [check  exactness  by  (5)!], 


(11) 


— y dx  + x dy 

2 


X 


— hr  dx  H dy  = ( — J = 0. 

X X \xj 


Integration  of  (11)  then  gives  the  general  solution  y/x  = c = const. 


24 


CHAP.  1 First-Order  ODEs 


EXAMPLE  4 


This  example  gives  the  idea.  All  we  did  was  to  multiply  a given  nonexact  equation,  say, 
(12)  P(x,  y ) dx  + Q(x,  y ) dy  = 0, 


by  a function  F that,  in  general,  will  be  a function  of  both  x and  y.  The  result  was  an  equation 
(13)  FPdx  + FQdy  = 0 


that  is  exact,  so  we  can  solve  it  as  just  discussed.  Such  a function  F(x,  y)  is  then  called 

an  integrating  factor  of  (12). 


Integrating  Factor 

The  integrating  factor  in  (1 1)  is  F = 1/x2.  Hence  in  this  case  the  exact  equation  (13)  is 


— y dx  + xdy  (y\  y 

FP  dx  + FQ  dy  = — : = d — ) = 0.  Solution  = c. 

x2  W * 


These  are  straight  lines  y = cx  through  the  origin.  (Note  that  x = 0 is  also  a solution  of  — y dx  + xdy  = 0.) 

It  is  remarkable  that  we  can  readily  find  other  integrating  factors  for  the  equation  — y dx  + xdy  = 0,  namely, 
1/y2,  1 /fry),  and  I /(x2  + y2),  because 


(14) 


— y dx  + x dy 


—y  dx  + x dy 
xy 


—y  dx  + xdy 


■ y 


= d arctan  : 


How  to  Find  Integrating  Factors 

In  simpler  cases  we  may  find  integrating  factors  by  inspection  or  perhaps  after  some  trials, 
keeping  (14)  in  mind.  In  the  general  case,  the  idea  is  the  following. 

For  M dx  + N dy  = 0 the  exactness  condition  (5)  is  dM/dy  = dN/dx.  Hence  for  (13), 
FP  dx  + FQ  dy  = 0,  the  exactness  condition  is 


(15) 


d d 

(FP)  = (FQ). 

ay  dx 


By  the  product  rule,  with  subscripts  denoting  partial  derivatives,  this  gives 


FyP  + FPy  - FXQ  + FQX. 

In  the  general  case,  this  would  be  complicated  and  useless.  So  we  follow  the  Golden  Rule: 
If  you  cannot  solve  your  problem,  try  to  solve  a simpler  one — the  result  may  be  useful 
(and  may  also  help  you  later  on).  Hence  we  look  for  an  integrating  factor  depending  only 
on  one  variable:  fortunately,  in  many  practical  cases,  there  are  such  factors,  as  we  shall 
see.  Thus,  let  F = Fix).  Then  Fy  = 0,  and  Fx  = F'  = dF/dx,  so  that  (15)  becomes 

FPy  = F'Q  + FQX. 


Dividing  by  FQ  and  reshuffling  terms,  we  have 


\_dF 
F dx 


= R, 


R = 1Ap_aet 

Q\dy  dxj 


(16) 


where 
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THEOREM  1 


THEOREM  2 


EXAMPLE  5 


This  proves  the  following  theorem. 


Integrating  Factor  F(x) 

If  ( 12)  is  such  that  the  right  side  R of  ( 16)  depends  only  on  x,  then  (12)  has  an 
integrating  factor  F = F(x),  which  is  obtained  by  integrating  (16)  and  taking 
exponents  on  both  sides. 


(17) 


F(x)  = exp  R(x)  dx. 


Similarly,  if  F*  = F*(y),  then  instead  of  (16)  we  get 


(18) 


1 dF* 

= R*, 

F*  dy 


where 


and  we  have  the  companion 


Integrating  Factor  F*(y) 

If  (12)  is  such  that  the  right  side  R*  o/(18)  depends  only  on  y,  then  (12)  has  an 
in  tegrating  factor  F*  = F*(y),  which  is  obtained  from  (18)  in  the  form 


(19) 


F*(y)  = exp 


R*(y)  dy. 


Application  of  Theorems  1 and  2.  Initial  Value  Problem 

Using  Theorem  1 or  2,  find  an  integrating  factor  and  solve  the  initial  value  problem 
(20)  (ex+y  + yey)  dx  + (xey  - 1)  dy  = 0,  y(0)  = -1 

Solution.  Step  1.  Nonexactness.  The  exactness  check  fails: 


3 P 3 , + v y x+y  „ SQ  g 

— = — (e  y + yey)  = e v + ey  + yey  but  — = — (xey  - 1)  = ey. 
dy  dy  dx  dx 

Step  2.  Integrating  factor.  General  solution.  Theorem  1 fails  because  R [the  right  side  of  (16)]  depends  on 
both  x and  y. 


R 


_ J/3P  _ SQ\  _ 


Q\dy  dx 
Try  Theorem  2.  The  right  side  of  (18)  is 


(ex 


R* 


i_t dQ  _ dP\ 
P \ dx  dy  ) 


ye 


- (ey  - ex 


ey  - yeyl  = -1. 


Hence  (19)  gives  the  integrating  factor  F*(y)  = e y.  From  this  result  and  (20)  you  get  the  exact  equation 

(ex  + y)  dx  + (x  — e~y ) dy  = 0. 
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Test  for  exactness;  you  will  get  1 on  both  sides  of  the  exactness  condition.  By  integration,  using  (4a), 

u = | (ex  + y)  dx  = ex  + xy  + k(y). 

Differentiate  this  with  respect  to  y and  use  (4b)  to  get 

dk 


du  dk  .. 

— = x + — = N = x - e~y, 

dy  dy 


dy 


k — e y + c*. 


Hence  the  general  solution  is 


u(x,  y)  — ex  + xy  + e y = c. 

Setp  3.  Particular  solution.  The  initial  condition  y(0)  = — 1 gives  u{ 0,  — 1)  = 1 + 0 + e = 3.72.  Hence  the 
answer  is  ex  + xy  + e~y  = 1 + e = 3.72.  Figure  18  shows  several  particular  solutions  obtained  as  level  curves 
of  u(x,  y)  = c,  obtained  by  a CAS,  a convenient  way  in  cases  in  which  it  is  impossible  or  difficult  to  cast  a 
solution  into  explicit  form.  Note  the  curve  that  (nearly)  satisfies  the  initial  condition. 

Step  4.  Checking.  Check  by  substitution  that  the  answer  satisfies  the  given  equation  as  well  as  the  initial 
condition. 


PROBLEM  SET  T4 


ODEs.  INTEGRATING  FACTORS 

Test  for  exactness.  If  exact,  solve.  If  not,  use  an  integrating 
factor  as  given  or  obtained  by  inspection  or  by  the  theorems 
in  the  text.  Also,  if  an  initial  condition  is  given,  find  the 
corresponding  particular  solution. 

1.  2xy  dx  + x2  dy  — 0 

2.  x3 dx  + y3dy  = 0 

3.  sin  x cos  y dx  + cos  x sin  y dy  = 0 

4.  e3e(dr  + 3rdd)  = 0 

5.  (xz  + y2)dx  — 2xy  dy  = 0 

6.  3(y  + l)  dx  = 2xdy,  (y  + l).r-4 

7.  2x  tan  y dx  + sec2  y dy  = 0 


8.  ex(cos  y dx  — sin  y dy)  = 0 

9.  e2x(2cosydx  — sinyc/y)  = 0,  _v(0)  = 0 

10.  y dx  + [y  + tan  (x  + y)]  dy  = 0,  cos  (x  + y) 

11.  2 cosh  x cos  y dx  — sinh  x sin  y dy 

12.  (2xy  dx  + dy)ex  = 0,  y(0)  = 2 

13.  e~vdx  + e~x(-e~v  + l)dy  = 0,  F = ex+v 

14.  («  + l)y  dx  + (b  + 1)j cdy  — 0,  y(l)  = 1, 

F = xayb 

15.  Exactness.  Under  what  conditions  for  the  constants  a, 
b,  k,  l is  (ax  + by)  dx  + ( kx  + ly)  dy  = 0 exact?  Solve 
the  exact  ODE. 


SEC.  1.5  Linear  ODEs.  Bernoulli  Equation.  Population  Dynamics 


27 


16.  TEAM  PROJECT.  Solution  by  Several  Methods. 

Show  this  as  indicated.  Compare  the  amount  of  work. 

(a)  ey(sinh  xdx  + coshxdy)  = Oas  an  exact  ODE 
and  by  separation. 

(b)  (1  + 2x)cos  y dx  + dy/  cosy  = Oby  Theorem  2 
and  by  separation. 

(c)  {x2  + y2)  dx  — 2xy  dy  = 0 by  Theorem  1 or  2 and 
by  separation  with  v = y/x. 

(d)  3jc2  y dx  + 4x3  dy  = 0 by  Theorems  1 and  2 and 
by  separation. 

(e)  Search  the  text  and  the  problems  for  further  ODEs 
that  can  be  solved  by  more  than  one  of  the  methods 
discussed  so  far.  Make  a list  of  these  ODEs.  Find 
further  cases  of  your  own. 

17.  WRITING  PROJECT.  Working  Backward. 

Working  backward  from  the  solution  to  the  problem 
is  useful  in  many  areas.  Euler,  Lagrange,  and  other 
great  masters  did  it.  To  get  additional  insight  into 
the  idea  of  integrating  factors,  start  from  a u(x,  y)  of 
your  choice,  find  du  = 0,  destroy  exactness  by 
division  by  some  F(x,  y),  and  see  what  ODE’s 
solvable  by  integrating  factors  you  can  get.  Can  you 
proceed  systematically,  beginning  with  the  simplest 
F(x,  y)? 


18.  CAS  PROJECT.  Graphing  Particular  Solutions. 

Graph  particular  solutions  of  the  following  ODE, 
proceeding  as  explained. 

(21)  dy  — y2  sin  xdx  = 0. 

(a)  Show  that  (21)  is  not  exact.  Find  an  integrating 
factor  using  either  Theorem  1 or  2.  Solve  (21). 

(b)  Solve  (21)  by  separating  variables.  Is  this  simpler 
than  (a)? 

(c)  Graph  the  seven  particular  solutions  satisfying  the 
following  initial  conditions  y(0)  = 1,  y(7r/2)  = ±|, 
±|,  ± 1 (see  figure  below). 

(d)  Which  solution  of  (21)  do  we  not  get  in  (a)  or  (b)? 


Particular  solutions  in  CAS  Project  18 


Linear  ODEs.  Bernoulli  Equation. 

Population  Dynamics 

Linear  ODEs  or  ODEs  that  can  be  transformed  to  linear  form  are  models  of  various 
phenomena,  for  instance,  in  physics,  biology,  population  dynamics,  and  ecology,  as  we 
shall  see.  A first-order  ODE  is  said  to  be  linear  if  it  can  be  brought  into  the  form 

(1)  y + p(x)y  = r(x), 

by  algebra,  and  nonlinear  if  it  cannot  be  brought  into  this  form. 

The  defining  feature  of  the  linear  ODE  (1)  is  that  it  is  linear  in  both  the  unknown 
function  y and  its  derivative  y = dy/dx,  whereas  p and  r may  be  any  given  functions  of 
x.  If  in  an  application  the  independent  variable  is  time,  we  write  t instead  of  x. 

If  the  first  term  is  f(x)y  (instead  of  y ),  divide  the  equation  by  f(x)  to  get  the  standard 
form  (1),  with  y as  the  first  term,  which  is  practical. 

For  instance,  y cos  x + y sin  x = x is  a linear  ODE,  and  its  standard  form  is 
y + y tan  x = x sec  x. 

The  function  r(x)  on  the  right  may  be  a force,  and  the  solution  y(x)  a displacement  in 
a motion  or  an  electrical  current  or  some  other  physical  quantity.  In  engineering,  r(x)  is 
frequently  called  the  input,  and  y(x)  is  called  the  output  or  the  response  to  the  input  (and, 
if  given,  to  the  initial  condition). 
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Homogeneous  Linear  ODE.  We  want  to  solve  (1)  in  some  interval  a < x < b,  call 
it  J,  and  we  begin  with  the  simpler  special  case  that  r(x)  is  zero  for  all  x in  J.  (This  is 
sometimes  written  r(x)  = 0.)  Then  the  ODE  (1)  becomes 


and  is  called  homogeneous.  By  separating  variables  and  integrating  we  then  obtain 


Taking  exponents  on  both  sides,  we  obtain  the  general  solution  of  the  homogeneous 
ODE  (2), 


here  we  may  also  choose  c = 0 and  obtain  the  trivial  solution  y(x)  = 0 for  all  x in  that 
interval. 

Nonhomogeneous  Linear  ODE.  We  now  solve  (1)  in  the  case  that  r(x)  in  (1)  is  not 
everywhere  zero  in  the  interval  J considered.  Then  the  ODE  (1)  is  called  nonhomogeneous. 
It  turns  out  that  in  this  case,  (1)  has  a pleasant  property;  namely,  it  has  an  integrating  factor 
depending  only  on  x.  We  can  find  this  factor  F(x)  by  Theorem  1 in  the  previous  section 
or  we  can  proceed  directly,  as  follows.  We  multiply  (1)  by  F(x),  obtaining 

(1*)  Fy'  + pFy  = rF. 

The  left  side  is  the  derivative  (Fy)'  = F'y  + Fy'  of  the  product  Fy  if 


(2) 


y'  + p(x)y  = 0 


— = —p(x)dx, 

y 


thus 


In  |y|  = — p(x)dx  + c*. 


(3) 


y(x)  = ce-^x)dx 


(c  = ±ec  when  y - ' 0); 


pFy  = F'y,  thus  pF  = F' . 


By  separating  variables,  dF/F  = p dx.  By  integration,  writing  h = fp  dx. 


In  | T5’!  = h = p dx,  thus  F = eh. 


With  this  F and  h'  = p,  Eq.  (1*)  becomes 


h * i / t h h t \ / h\t  / h \f  h 

e y + h e y = e y + (e  ) y = (e  y)  = re  . 


By  integration, 


h hi, 

e y = e r dx  + c. 


Dividing  by  eh,  we  obtain  the  desired  solution  formula 


(4) 


h = p(x)  dx. 
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EXAMPLE  1 


EXAMPLE  2 


This  reduces  solving  (1)  to  the  generally  simpler  task  of  evaluating  integrals.  For  ODEs 
for  which  this  is  still  difficult,  you  may  have  to  use  a numeric  method  for  integrals  from 
Sec.  19.5  or  for  the  ODE  itself  from  Sec.  21.1.  We  mention  that  h has  nothing  to  do  with 
h{x)  in  Sec.  1.1  and  that  the  constant  of  integration  in  h does  not  matter;  see  Prob.  2. 

The  structure  of  (4)  is  interesting.  The  only  quantity  depending  on  a given  initial 
condition  is  c.  Accordingly,  writing  (4)  as  a sum  of  two  terms, 


(4*) 

we  see  the  following: 


y(x)  = e 


-h 


ehrdx  + ce  h. 


(5)  Total  Output  = Response  to  the  Input  r + Response  to  the  Initial  Data. 


First-Order  ODE,  General  Solution,  Initial  Value  Problem 

Solve  the  initial  value  problem 

y + y tan  x = sin  2x,  y(0)  = 1 . 

Solution.  Here  p = tan  x,  r = sin  2x  = 2 sin  x cos  x,  and 

h = ip  dx  = | tan  x dx  = In  | sec  jc| . 

From  this  we  see  that  in  (4), 

eh  = sec  x , e~h  = cos  x , ehr  = (sec  x){2  sin  x cos  x)  = 2 sin  x, 

and  the  general  solution  of  our  equation  is 


y(jr)  = cos  jr  I 2 sin  x dx  + c \ = c cos  x — 2 cos  jc. 


From  this  and  the  initial  condition,  1 — c • 1 — 2 • l2;  thus  c = 3 and  the  solution  of  our  initial  value  problem 
is  y = 3 cos  x — 2 cos2  jc.  Here  3 cos  x is  the  response  to  the  initial  data,  and  —2  cos2  jc  is  the  response  to  the 
input  sin  2jc. 


Electric  Circuit 

Model  the  RL- circuit  in  Fig.  19  and  solve  the  resulting  ODE  for  the  current  I(t)  A (amperes),  where  t is 
time.  Assume  that  the  circuit  contains  as  an  EMF  E(t)  (electromotive  force)  a battery  of  E = 48  V (volts),  which 
is  constant,  a resistor  of  R = 1 1 El  (ohms),  and  an  inductor  of  L = 0.1  H (henrys),  and  that  the  current  is  initially 
zero. 

Physical  Laws.  A current  I in  the  circuit  causes  a voltage  drop  RI  across  the  resistor  (Ohm’s  law)  and 
a voltage  drop  Li'  = L dl/dt  across  the  conductor,  and  the  sum  of  these  two  voltage  drops  equals  the  EMF 

(Kirchhoff’s  Voltage  Law,  KVL). 

Remark.  In  general,  KVL  states  that  “The  voltage  (the  electromotive  force  EMF)  impressed  on  a closed 
loop  is  equal  to  the  sum  of  the  voltage  drops  across  all  the  other  elements  of  the  loop.”  For  Kirchoff’s  Current 
Law  (KCL)  and  historical  information,  see  footnote  7 in  Sec.  2.9. 

Solution.  According  to  these  laws  the  model  of  the  /?L-circuit  is  Li'  + RI  = E(t),  in  standard  form 

RJ  = m 
L L ' 


(6) 
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EXAMPLE  3 


We  can  solve  this  linear  ODE  by  (4)  with  x = t,  y = /.  p = R/L,  h = ( R/L)t , obtaining  the  general  solution 

7 = e-t + A 


By  integration, 

(7) 


/ p eWL)>  , ,, 

i = e-(R/»  £ - + c = £ + cc-(R/“. 


Vl  r/l 


R 


In  our  case,  R/L  = 11/0.1  = 110  and  E(t)  = 48/0.1  = 480  = const;  thus, 


In  modeling,  one  often  gets  better  insight  into  the  nature  of  a solution  (and  smaller  roundoff  errors)  by  inserting 
given  numeric  data  only  near  the  end.  Here,  the  general  solution  (7)  shows  that  the  current  approaches  the  limit 
E/R  = 48/11  faster  the  larger  R/L  is,  in  our  case,  R/L  = 11/0.1  = 110,  and  the  approach  is  very  fast,  from 
below  if  7(0)  < 48/ 1 1 or  from  above  if  1(0)  > 48/ 1 1 . If  7(0)  = 48/ 1 1 , the  solution  is  constant  (48/1 1 A).  See 
Fig.  19. 

The  initial  value  7(0)  = 0 gives  7(0)  = E/R  + c = 0,  c = —E/R  and  the  particular  solution 
(8)  7 = -(1  - e~(R/m),  thus  7 = jj(1  - c-110t). 


77  = 11  n 


7(f) 

8 


4 - 


0.01 


0.02  0.03 

Current  I(t) 


0.04 


0.05 


t 


Fig.  19.  RL-circuit 


Hormone  Level 

Assume  that  the  level  of  a certain  hormone  in  the  blood  of  a patient  varies  with  time.  Suppose  that  the  time  rate 
of  change  is  the  difference  between  a sinusoidal  input  of  a 24-hour  period  from  the  thyroid  gland  and  a continuous 
removal  rate  proportional  to  the  level  present.  Set  up  a model  for  the  hormone  level  in  the  blood  and  find  its 
general  solution.  Find  the  particular  solution  satisfying  a suitable  initial  condition. 

Solution.  Step  1.  Setting  up  a model.  Let  y{t)  be  the  hormone  level  at  time  t.  Then  the  removal  rate  is  Ky(t). 
The  input  rate  is  A + B cos  cot,  where  co  = 2tt/24  = 77/ 12  and  A is  the  average  input  rate;  here  A B to  make 
the  input  rate  nonnegative.  The  constants  A,  B,  K can  be  determined  from  measurements.  Hence  the  model  is  the 
linear  ODE 


y'(t ) = In  — Out  = A + B cos  cot  — Ky{t),  thus  y'  + Ky  = A + B cos  cot. 

The  initial  condition  for  a particular  solution  ypart  is  ypart(0)  = y0  with  t = 0 suitably  chosen,  for  example, 
6:00  a.m. 

Step  2.  General  solution.  In  (4)  we  have  p = K = const,  h — Kt,  and  r = A + B cos  cot.  Hence  (4)  gives  the 
general  solution  (evaluate  f eKt  cos  cot  dt  by  integration  by  parts) 
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y(t)  = e 


A + B cos  cot  \dt  + ce 


_ e~KteKt 


A 

K 


K cos  cot  + a)  sin  cot 


+ ce 


A 

K 


K2  + (77/ 12)2 


777  77  . 7Tt 

K cos 1 sin  — 

12  12  12 


The  last  term  decreases  to  0 as  t increases,  practically  after  a short  time  and  regardless  of  c (that  is,  of  the  initial 
condition).  The  other  part  of  y{t)  is  called  the  steady-state  solution  because  it  consists  of  constant  and  periodic 
terms.  The  entire  solution  is  called  the  transient-state  solution  because  it  models  the  transition  from  rest  to  the 
steady  state.  These  terms  are  used  quite  generally  for  physical  and  other  systems  whose  behavior  depends  on  time. 

Step  3.  Particular  solution.  Setting  t = 0 in  y{t)  and  choosing  y0  = 0,  we  have 


y(0)  = 


B 


K + c = 0, 


thus 


K K2  + (tt/12)2  77 
Inserting  this  result  into  y(f),  we  obtain  the  particular  solution 


_ _A  _ 


KB 


K K2  + (tt/12)2 


ypart^O  "L 


B 


K K2  + (tt/12)2 


7 77  77 


777 


K cos  — H — — sin  — — 1-  - 


12  12 


12 


KB 


K K2  + (tt/12): 


with  the  steady-state  part  as  before.  To  plot  _ypaI1  we  must  specify  values  for  the  constants,  say,  A = B = 1 
and  K = 0.05.  Figure  20  shows  this  solution.  Notice  that  the  transition  period  is  relatively  short  (although 
K is  small),  and  the  curve  soon  looks  sinusoidal;  this  is  the  response  to  the  input  A + it  cos  ( 7T1)  = 

1 + COS  (pj  7 Tt). 
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Fig.  20.  Particular  solution  in  Example  3 

Reduction  to  Linear  Form.  Bernoulli  Equation 

Numerous  applications  can  be  modeled  by  ODEs  that  are  nonlinear  but  can  be  transformed 
to  linear  ODEs.  One  of  the  most  useful  ones  of  these  is  the  Bernoulli  equation7 


(9) 


y + p(*)y  = g(x)ya 


(i a any  real  number). 


7JAKOB  BERNOULLI  (1654—1705),  Swiss  mathematician,  professor  at  Basel,  also  known  for  his  contribution 
to  elasticity  theory  and  mathematical  probability.  The  method  for  solving  Bernoulli’s  equation  was  discovered  by 
Leibniz  in  1696.  Jakob  Bernoulli’s  students  included  his  nephew  NIKLAUS  BERNOULLI  (1687-1759),  who 
contributed  to  probability  theory  and  infinite  series,  and  his  youngest  brother  JOHANN  BERNOULLI  (1667-1748), 
who  had  profound  influence  on  the  development  of  calculus,  became  Jakob’s  successor  at  Basel,  and  had  among 
his  students  GABRIEL  CRAMER  (see  Sec.  7.7)  and  LEONHARD  EULER  (see  Sec.  2.5).  His  son  DANIEL 
BERNOULLI  (1700-1782)  is  known  for  his  basic  work  in  fluid  flow  and  the  kinetic  theory  of  gases. 
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EXAMPLE  4 


If  a = 0 or  a = 1,  Equation  (9)  is  linear.  Otherwise  it  is  nonlinear.  Then  we  set 

u(x)  = [j(.x)]1-“. 

We  differentiate  this  and  substitute  y from  (9),  obtaining 

u'  = (1  - a)y~ay  = (1  - d)y~a{gya  - py). 

Simplification  gives 

«'=(!-  a)ig  ~ 

where  v1-“  = u on  the  right,  so  that  we  get  the  linear  ODE 
(10)  u + (1  — a)pu  = (1  — a)g. 

For  further  ODEs  reducible  to  linear  form,  see  lnce’s  classic  [All]  listed  in  App.  1.  See 
also  Team  Project  30  in  Problem  Set  1.5. 


Logistic  Equation 

Solve  the  following  Bernoulli  equation,  known  as  the  logistic  equation  (or  Verhulst  equation8): 

(11)  y = Ay  - By2 

Solution.  Write  (11)  in  the  form  (9),  that  is, 

y'  - Ay  = - By 2 

to  see  that  a = 2,  so  that  u = y1-a  = y-1.  Differentiate  this  u and  substitute  y'  from  (11), 
u = ~y~2y'  = ~y~\Ay  - By2)  = B - Ay'1. 

The  last  term  is  —Ay'1  = —Au.  Hence  we  have  obtained  the  linear  ODE 

u + Au  = B. 


The  general  solution  is  [by  (4)] 

u = ce~At  + B/A. 

Since  u = 1 /y,  this  gives  the  general  solution  of  (11), 


(12) 


B/A 


Directly  from  (11)  we  see  that  y = 0 (y(t)  = 0 for  all  t ) is  also  a solution. 


(Fig.  21) 


8PIERRE-FRAN£OIS  VERHULST,  Belgian  statistician,  who  introduced  Eq.  (8)  as  a model  for  human 
population  growth  in  1838. 
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EXAMPLE  5 


Fig.  21.  Logistic  population  model.  Curves  (9)  in  Example  4 with  A/R  = 4 


Population  Dynamics 

The  logistic  equation  (11)  plays  an  important  role  in  population  dynamics,  a field 
that  models  the  evolution  of  populations  of  plants,  animals,  or  humans  over  time  t. 
If  B = 0,  then  (11)  is  y'  = dy/dt  = Ay.  In  this  case  its  solution  (12)  is  y = (1  /c)eAt 
and  gives  exponential  growth,  as  for  a small  population  in  a large  country  (the 
United  States  in  early  times!).  This  is  called  Malthus’s  law.  (See  also  Example  3 in 
Sec.  1.1.) 

The  term  —By2  in  (11)  is  a “braking  term”  that  prevents  the  population  from  growing 
without  bound.  Indeed,  if  we  write  y = Ay [ 1 — (B/A)y],  we  see  that  if  y < A/B.  then 
y > 0,  so  that  an  initially  small  population  keeps  growing  as  long  as  y < A/B.  But  if 
y > A/B,  then  y < 0 and  the  population  is  decreasing  as  long  as  y > A/B.  The  limit 
is  the  same  in  both  cases,  namely,  A/B.  See  Fig.  21. 

We  see  that  in  the  logistic  equation  (11)  the  independent  variable  t does  not  occur 
explicitly.  An  ODE  y = f(t,  y)  in  which  t does  not  occur  explicitly  is  of  the  form 

(13)  y'  =f(y ) 

and  is  called  an  autonomous  ODE.  Thus  the  logistic  equation  (11)  is  autonomous. 

Equation  (13)  has  constant  solutions,  called  equilibrium  solutions  or  equilibrium 
points.  These  are  determined  by  the  zeros  of  f(y),  because /(y)  = 0 gives  y = 0 by 
(13);  hence  y = const.  These  zeros  are  known  as  critical  points  of  (13).  An 
equilibrium  solution  is  called  stable  if  solutions  close  to  it  for  some  t remain  close 
to  it  for  all  further  t.  It  is  called  unstable  if  solutions  initially  close  to  it  do  not  remain 
close  to  it  as  t increases.  For  instance,  y = 0 in  Fig.  21  is  an  unstable  equilibrium 
solution,  and  y = 4 is  a stable  one.  Note  that  (11)  has  the  critical  points  y = 0 and 
y = A/B. 


Stable  and  Unstable  Equilibrium  Solutions.  “Phase  Line  Plot” 

The  ODE  y'  = (y  — l)(y  — 2)  has  the  stable  equilibrium  solution  yx  = 1 and  the  unstable  y2  — 2,  as  the  direction 
field  in  Fig.  22  suggests.  The  values  yx  and  y2  are  the  zeros  of  the  parabola /(y)  = (y  — l)(y  — 2)  in  the  figure. 
Now,  since  the  ODE  is  autonomous,  we  can  “condense”  the  direction  field  to  a “phase  line  plot”  giving  yi  and 
y2,  and  the  direction  (upward  or  downward)  of  the  arrows  in  the  field,  and  thus  giving  information  about  the 
stability  or  instability  of  the  equilibrium  solutions. 
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y(.x) 
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Fig.  22.  Example  5.  (A)  Direction  field.  (B)  “Phase  line”.  (C)  Parabola  f(y) 


A few  further  population  models  will  be  discussed  in  the  problem  set.  For  some  more 
details  of  population  dynamics,  see  C.  W.  Clark.  Mathematical  Bioeconomics : The 
Mathematics  of  Conservation  3rd  ed.  Hoboken,  NJ,  Wiley,  2010. 

Further  applications  of  linear  ODEs  follow  in  the  next  section. 


PJR^reEM=S^ET=^1^5 


1.  CAUTION!  Show  that  e lnx  = 1/jc  (not  — .*)  and 

e — ln(sec*)  = CQS  x 

2.  Integration  constant.  Give  a reason  why  in  (4)  you  may 
choose  the  constant  of  integration  in  jp  dx  to  be  zero. 
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GENERAL  SOLUTION.  INITIAL  VALUE 


PROBLEMS 


Find  the  general  solution.  If  an  initial  condition  is  given, 
find  also  the  corresponding  particular  solution  and  graph  or 
sketch  it.  (Show  the  details  of  your  work.) 

3.  y'  — y = 5.2 

4.  y = 2y  — 4x 

5.  y'  +ky  = e~kx 

6.  y + 2y  = 4 cos  2x,  y(57T)  = 3 

7.  xy  = 2v  + x3ex 

8.  y + ytanx  = e“0  01lcosr,  y(0)  = 0 

9.  y + ysinx  = ecosx,  v(0)  = -2.5 

10.  y'  cos  ,r  + (3y  — l)secx  = 0,  y(j7r)  = 4/3 

11.  y = (y  - 2)  cot  x 

12.  xy'  + 4y  = 8x4,  y(  I ) = 2 

13.  y'  = 6 (y  — 2.5)tanh  1.5x 


14.  CAS  EXPERIMENT,  (a)  Solve  the  ODE  y - y/x  = 
— .r_1  cos  ( 1 /jc).  Find  an  initial  condition  for  which  the 
arbitrary  constant  becomes  zero.  Graph  the  resulting 
particular  solution,  experimenting  to  obtain  a good 
figure  near  x = 0. 

(b)  Generalizing  (a)  from  n = 1 to  arbitrary  n , solve  the 
ODE  y — ny/x  = — x™-2cos  ( 1 /jc).  Find  an  initial 
condition  as  in  (a)  and  experiment  with  the  graph. 
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GENERAL  PROPERTIES  OF  LINEAR  ODEs 


These  properties  are  of  practical  and  theoretical  importance 
because  they  enable  us  to  obtain  new  solutions  from  given 
ones.  Thus  in  modeling,  whenever  possible,  we  prefer  linear 
ODEs  over  nonlinear  ones,  which  have  no  similar  properties. 

Show  that  nonhomogeneous  linear  ODEs  (1)  and  homo- 
geneous linear  ODEs  (2)  have  the  following  properties. 
Illustrate  each  property  by  a calculation  for  two  or  three 
equations  of  your  choice.  Give  proofs. 


15.  The  sum  + y2  of  two  solutions  yq  and  y2  of  the 
homogeneous  equation  (2)  is  a solution  of  (2),  and  so  is 
a scalar  multiple  ay^  for  any  constant  a.  These  properties 
are  not  true  for  (1)! 
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16.  y = 0 (that  is,  y(x)  = 0 for  all  x,  also  written  y(x)  = 0) 
is  a solution  of  (2)  [not  of  (1)  if  r(x)  A 0!],  called  the 

trivial  solution. 

17.  The  sum  of  a solution  of  (1)  and  a solution  of  (2)  is  a 
solution  of  (1). 

18.  The  difference  of  two  solutions  of  (1)  is  a solution  of  (2). 

19.  If  yi  is  a solution  of  (1),  what  can  you  say  about  cyi? 

20.  If  yi  and  y2  are  solutions  of  y[  + py ± = tq  and 
y2  + py2  — r2,  respectively  (with  the  same  p\),  what 
can  you  say  about  the  sum  yi  + y22 

21.  Variation  of  parameter.  Another  method  of  obtaining 
(4)  results  from  the  following  idea.  Write  (3)  as  cy*, 
where  y*  is  the  exponential  function,  which  is  a solution 
of  the  homogeneous  linear  ODE  y*'  + py*  = 0. 
Replace  the  arbitrary  constant  c in  (3)  with  a function 
u to  be  determined  so  that  the  resulting  function  y — uy* 
is  a solution  of  the  nonhomogeneous  linear  ODE 
y + py  = r. 
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NONLINEAR  ODEs 


Using  a method  of  this  section  or  separating  variables,  find 
the  general  solution.  If  an  initial  condition  is  given,  find 
also  the  particular  solution  and  sketch  or  graph  it. 

22.  / + y = y2,  y(0)  = -§ 

23.  y + xy  = xy-1,  v(0)  = 3 

24.  y + y = —x/y 

25.  y'  = 3.2y  - 10y2 

26.  y = (tan  y)/{x  - 1),  y(0)  = jir 

27.  y'  = \/{6ey  - 2x) 

28.  2xyy'  + (x  — l)y2  = x2ex  (Sety2  = z) 


29.  REPORT  PROJECT.  Transformation  of  ODEs. 

We  have  transformed  ODEs  to  separable  form,  to  exact 
form,  and  to  linear  form.  The  purpose  of  such 
transformations  is  an  extension  of  solution  methods  to 
larger  classes  of  ODEs.  Describe  the  key  idea  of  each 
of  these  transformations  and  give  three  typical  exam- 
ples of  your  choice  for  each  transformation.  Show  each 
step  (not  just  the  transformed  ODE). 


30.  TEAM  PROJECT.  Riccati  Equation.  Clairaut 
Equation.  Singular  Solution. 

A Riccati  equation  is  of  the  form 


(14)  y + p(x)y  = g( x)y2  + h(x). 


A Clairaut  equation  is  of  the  form 

(15)  y = xy’  + g(y'). 

(a)  Apply  the  transformation  y = Y + 1/u  to  the 
Riccati  equation  (14),  where  Tis  a solution  of  (14),  and 
obtain  for  u the  linear  ODE  u + (2Fg  — p)u  = —g. 
Explain  the  effect  of  the  transformation  by  writing  it 
as  y = Y + v,  v = l/u. 


(b)  Show  that  y = Y = x is  a solution  of  the  ODE 
y — (2x3  + 1 ) y = — x2y2  — x4  — x + 1 and  solve  this 
Riccati  equation,  showing  the  details. 

(c)  Solve  the  Clairaut  equation  y'2  — xy'  + y = 0 as 
follows.  Differentiate  it  with  respect  to  x,  obtaining 
y"(2 y - x)  = 0.  Then  solve  (A)  y"  = 0 and  (B) 
2y  — x = 0 separately  and  substitute  the  two  solutions 
(a)  and  (b)  of  (A)  and  (B ) into  the  given  ODE.  Thus 
obtain  (a)  a general  solution  (straight  lines)  and  (b)  a 
parabola  for  which  those  lines  (a)  are  tangents  (Fig.  6 
in  Prob.  Set  1.1);  so  (b)  is  the  envelope  of  (a).  Such  a 
solution  (b)  that  cannot  be  obtained  from  a general 
solution  is  called  a singular  solution. 

(d)  Show  that  the  Clairaut  equation  (15)  has  as 
solutions  a family  of  straight  lines  y = cx  + g{c)  and 
a singular  solution  determined  by  g\s)  = —x,  where 
s = y , that  forms  the  envelope  of  that  family. 
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MODELING.  FURTHER  APPLICATIONS 


31.  Newton’s  law  of  cooling.  If  the  temperature  of  a cake 
is  300°F  when  it  leaves  the  oven  and  is  200°F  ten 
minutes  later,  when  will  it  be  practically  equal  to  the 
room  temperature  of  60°F,  say,  when  will  it  be  61°F? 


32.  Heating  and  cooling  of  a building.  Heating  and 
cooling  of  a building  can  be  modeled  by  the  ODE 


T'  = k,(T  - Ta)  + k2(J  - Tj  + P, 


where  T = T(t)  is  the  temperature  in  the  building  at 
time  ?,  Ta  the  outside  temperature,  Tw  the  temperature 
wanted  in  the  building,  and  P the  rate  of  increase  of  T 
due  to  machines  and  people  in  the  building,  and  kj  and 
k2  are  (negative)  constants.  Solve  this  ODE,  assuming 
P — const,  Tw  = const,  and  Ta  varying  sinusoidally 
over  24  hours,  say,  Ta  = A — C cos(27r/24)t.Discuss 
the  effect  of  each  term  of  the  equation  on  the  solution. 

33.  Drug  injection.  Find  and  solve  the  model  for  drug 
injection  into  the  bloodstream  if,  beginning  at  t = 0,  a 
constant  amount  A g/min  is  injected  and  the  drug  is 
simultaneously  removed  at  a rate  proportional  to  the 
amount  of  the  drug  present  at  time  t. 

34.  Epidemics.  A model  for  the  spread  of  contagious 
diseases  is  obtained  by  assuming  that  the  rate  of  spread 
is  proportional  to  the  number  of  contacts  between 
infected  and  noninfected  persons,  who  are  assumed  to 
move  freely  among  each  other.  Set  up  the  model.  Find 
the  equilibrium  solutions  and  indicate  their  stability  or 
instability.  Solve  the  ODE.  Find  the  limit  of  the 
proportion  of  infected  persons  as  t — * oo  and  explain 
what  it  means. 

35.  Lake  Erie.  Lake  Erie  has  a water  volume  of  about 
450  km3 and  a flow  rate  (in  and  out)  of  about  175  km2 
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per  year.  If  at  some  instant  the  lake  has  pollution 
concentration  p = 0.04%,  how  long,  approximately, 
will  it  take  to  decrease  it  to  p/2,  assuming  that  the 
inflow  is  much  cleaner,  say,  it  has  pollution 
concentration  p/4,  and  the  mixture  is  uniform  (an 
assumption  that  is  only  imperfectly  true)?  First  guess. 

36.  Harvesting  renewable  resources.  Fishing.  Suppose 
that  the  population  y(t)  of  a certain  kind  of  fish  is  given 
by  the  logistic  equation  (11),  and  fish  are  caught  at  a 
rate  Hy  proportional  to  y.  Solve  this  so-called  Schaefer 
model.  Find  the  equilibrium  solutions  Vi  and  y2  (>  0) 
when  H < A.  The  expression  Y = Hyz  is  called 
the  equilibrium  harvest  or  sustainable  yield  corre- 
sponding to  H.  Why? 

37.  Harvesting.  In  Prob.  36  find  and  graph  the  solution 
satisfying  y(0)  = 2 when  (for  simplicity)  A = B = 1 
and  H = 0.2.  What  is  the  limit?  What  does  it  mean? 
What  if  there  were  no  fishing? 

38.  Intermittent  harvesting.  In  Prob.  36  assume  that  you 
fish  for  3 years,  then  fishing  is  banned  for  the  next 
3 years.  Thereafter  you  start  again.  And  so  on.  This  is 
called  intermittent  harvesting.  Describe  qualitatively 
how  the  population  will  develop  if  intermitting  is 
continued  periodically.  Find  and  graph  the  solution  for 
the  first  9 years,  assuming  that  A = B = 1,  H = 0.2, 
and  y(0)  = 2. 


Fig.  23.  Fish  population  in  Problem  38 

39.  Extinction  vs.  unlimited  growth.  If  in  a population 
y (t)  the  death  rate  is  proportional  to  the  population,  and 
the  birth  rate  is  proportional  to  the  chance  encounters 
of  meeting  mates  for  reproduction,  what  will  the  model 
be?  Without  solving,  find  out  what  will  eventually 
happen  to  a small  initial  population.  To  a large  one. 
Then  solve  the  model. 

40.  Air  circulation.  In  a room  containing  20,000  ft3  of  air, 
600  ft3of  fresh  air  flows  in  per  minute,  and  the  mixture 
(made  practically  uniform  by  circulating  fans)  is 
exhausted  at  a rate  of  600  cubic  feet  per  minute  (cfm). 
What  is  the  amount  of  fresh  air  y(t)  at  any  time  if 
y(0)  = 0?  After  what  time  will  90%  of  the  air  be  fresh? 


1.6  Orthogonal  Trajectories.  Optional 

An  important  type  of  problem  in  physics  or  geometry  is  to  find  a family  of  curves  that 
intersects  a given  family  of  curves  at  right  angles.  The  new  curves  are  called  orthogonal 
trajectories  of  the  given  curves  (and  conversely).  Examples  are  curves  of  equal 
temperature  (isotherms)  and  curves  of  heat  flow,  curves  of  equal  altitude  (contour  lines) 
on  a map  and  curves  of  steepest  descent  on  that  map,  curves  of  equal  potential 
(equipotential  curves,  curves  of  equal  voltage — the  ellipses  in  Fig.  24)  and  curves  of 
electric  force  (the  parabolas  in  Fig.  24). 

Here  the  angle  of  intersection  between  two  curves  is  defined  to  be  the  angle  between 
the  tangents  of  the  curves  at  the  intersection  point.  Orthogonal  is  another  word  for 
perpendicular. 

In  many  cases  orthogonal  trajectories  can  be  found  using  ODEs.  In  general,  if  we 
consider  G(x,  y,  c)  = 0 to  be  a given  family  of  curves  in  the  xy-plane,  then  each  value  of 
c gives  a particular  curve.  Since  c is  one  parameter,  such  a family  is  called  a one- 
parameter  family  of  curves. 

In  detail,  let  us  explain  this  method  by  a family  of  ellipses 


(1) 


(c  > 0) 


SEC.  1.6  Orthogonal  Trajectories.  Optional 


37 


and  illustrated  in  Fig.  24.  We  assume  that  this  family  of  ellipses  represents  electric 
equipotential  curves  between  the  two  black  ellipses  (equipotential  surfaces  between  two 
elliptic  cylinders  in  space,  of  which  Fig.  24  shows  a cross-section).  We  seek  the 
orthogonal  trajectories,  the  curves  of  electric  force.  Equation  (1)  is  a one-parameter  family 
with  parameter  c.  Each  value  of  c (>  0)  corresponds  to  one  of  these  ellipses. 

Step  1.  Find  an  ODE  for  which  the  given  family  is  a general  solution.  Of  course,  this 
ODE  must  no  longer  contain  the  parameter  c.  Differentiating  (1),  we  have  x + 2yy  = 0. 
Hence  the  ODE  of  the  given  curves  is 

(2)  y'  = fix,  y)  = ~ * ■ 

2y 


Fig.  24.  Electrostatic  field  between  two  ellipses  (elliptic  cylinders  in  space): 
Elliptic  equipotential  curves  (equipotential  surfaces)  and  orthogonal 
trajectories  (parabolas) 


Step  2.  Find  an  ODE  for  the  orthogonal  trajectories  y = y(x).  This  ODE  is 


(3) 


y 


l 

fix,  y) 


= + 


2 y 


x 


with  the  same /as  in  (2).  Why?  Well,  a given  curve  passing  through  a point  (.r0,  y0)  has 
slope /(x0,  yo)  at  that  point,  by  (2).  The  trajectory  through  (x0,  y0)  has  slope  — 1//(jc0,  y0) 
by  (3).  The  product  of  these  slopes  is  —1,  as  we  see.  From  calculus  it  is  known  that  this 
is  the  condition  for  orthogonality  (perpendicularity)  of  two  straight  lines  (the  tangents  at 
(.r0,  .Vo)),  hence  of  the  curve  and  its  orthogonal  trajectory  at  (x0,  y0 ). 

Step  3.  Solve  (3)  by  separating  variables,  integrating,  and  taking  exponents: 

dy  dx  .|~i  ~ , ~ * 2 

y X 


This  is  the  family  of  orthogonal  trajectories,  the  quadratic  parabolas  along  which  electrons 
or  other  charged  particles  (of  very  small  mass)  would  move  in  the  electric  field  between 
the  black  ellipses  (elliptic  cylinders). 


38 


CHAP.  1 First-Order  ODEs 


FAMILIES  OF  CURVES 

Represent  the  given  family  of  curves  in  the  form 
G(x,  y;  c)  = 0 and  sketch  some  of  the  curves. 

1.  All  ellipses  with  foci  —3  and  3 on  the  x-axis. 

2.  All  circles  with  centers  on  the  cubic  parabola  y = x3 
and  passing  through  the  origin  (0,  0). 

3.  The  catenaries  obtained  by  translating  the  catenary 
y = cosh  xin  the  direction  of  the  straight  line  y = x. 


ORTHOGONAL  TRAJECTORIES  (OTs) 

Sketch  or  graph  some  of  the  given  curves.  Guess  what  their 
OTs  may  look  like.  Find  these  OTs. 

4.  y = x2  + c 5.  y = cx 

6.  xy  = c 7.  y = c/x2 

8.  y = Vx  + c 9.  y = ce~x 

10.  x2  + (y  - cf  = c2 


APPLICATIONS,  EXTENSIONS 

11.  Electric  field.  Let  the  electric  equipotential  lines 

(curves  of  constant  potential)  between  two  concentric 
cylinders  with  the  z-axis  in  space  be  given  by 
u(x,  y)  = x2  + y2  = c (these  are  circular  cylinders  in 
the  xyz-space).  Using  the  method  in  the  text,  find  their 
orthogonal  trajectories  (the  curves  of  electric  force). 

12.  Electric  field.  The  lines  of  electric  force  of  two  opposite 
charges  of  the  same  strength  at  ( — 1,0)  and  (1,0)  are 
the  circles  through  ( — 1,  0)and  (1,0).  Show  that  these 
circles  are  given  by  x2  + (y  — c)2  = 1 + c2.  Show 
that  the  equipotential  lines  (which  are  orthogonal 
trajectories  of  those  circles)  are  the  circles  given  by 
(x  + c*)2  + y2  = c*2  — 1 (dashed  in  Fig.  25). 


Fig.  25.  Electric  field  in  Problem  12 

13.  Temperature  field.  Let  the  isotherms  (curves  of 
constant  temperature)  in  a body  in  the  upper  half-plane 
y > 0 be  given  by  4x2  + 9y2  = c.  Find  the  ortho- 
gonal trajectories  (the  curves  along  which  heat  will 
flow  in  regions  filled  with  heat-conducting  material  and 
free  of  heat  sources  or  heat  sinks). 

14.  Conic  sections.  Find  the  conditions  under  which 
the  orthogonal  trajectories  of  families  of  ellipses 
x2/a2  + y2/b2  = c are  again  conic  sections.  Illustrate 
your  result  graphically  by  sketches  or  by  using  your 
CAS.  What  happens  ifa^0?If6^0? 

15.  Cauchy-Riemann  equations.  Show  that  for  a family 
u(x,  v)  = c = const  the  orthogonal  trajectories  v(x,  y)  = 
c*  = const  can  be  obtained  from  the  following 
Cauchy-Riemann  equations  (which  are  basic  in 
complex  analysis  in  Chap.  13)  and  use  them  to  find  the 
orthogonal  trajectories  of  ex  sin  y = const.  (Here,  sub- 
scripts denote  partial  derivatives.) 

Ux  Vy , Uy  Vx 

16.  Congruent  OTs.  If  y'  =/(x)  with /independent  of  y, 
show  that  the  curves  of  the  corresponding  family  are 
congruent,  and  so  are  their  OTs. 


Existence  and  Uniqueness  of  Solutions 
for  Initial  Value  Problems 

The  initial  value  problem 

1/1  + W =0,  y(0)  = 1 

has  no  solution  because  y = 0 (that  is,  y(x ) = 0 for  all  x)  is  the  only  solution  of  the  ODE. 
The  initial  value  problem 


y'  = 2x, 


y(  0)  = l 
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has  precisely  one  solution,  namely,  y = x2  + 1.  The  initial  value  problem 

xy'  = y - 1,  y(0)  = 1 

has  infinitely  many  solutions,  namely,  y = 1 + cx,  where  c is  an  arbitrary  constant  because 
y(0)  = 1 for  all  c. 

From  these  examples  we  see  that  an  initial  value  problem 

(1)  y = f(x,  y),  y(x  0)  = yo 

may  have  no  solution,  precisely  one  solution,  or  more  than  one  solution.  This  fact  leads 
to  the  following  two  fundamental  questions. 


Problem  of  Existence 

Under  what  conditions  does  an  initial  value  problem  of  the  form  (1)  have  at  least 
one  solution  ( hence  one  or  several  solutions)? 

Problem  of  Uniqueness 

Under  what  conditions  does  that  problem  have  at  most  one  solution  ( hence  excluding 
the  case  that  is  has  more  than  one  solution)? 


Theorems  that  state  such  conditions  are  called  existence  theorems  and  uniqueness 
theorems,  respectively. 

Of  course,  for  our  simple  examples,  we  need  no  theorems  because  we  can  solve  these 
examples  by  inspection;  however,  for  complicated  ODEs  such  theorems  may  be  of 
considerable  practical  importance.  Even  when  you  are  sure  that  your  physical  or  other 
system  behaves  uniquely,  occasionally  your  model  may  be  oversimplified  and  may  not 
give  a faithful  picture  of  reality. 


THEOREM  1 


Existence  Theorem 

Let  the  right  side  f(x,  v)  of  the  ODE  in  the  initial  value  problem 

(1)  y'=f{x,y),  y(x0)  = y0 

be  continuous  at  all  points  ( x , y)  in  some  rectangle 

R:  \x  - x0\  < a,  |y  - y0|  < b (Fig.  26) 

and  bounded  in  R;  that  is,  there  is  a number  K such  that 

(2)  | f(x,  y)\  = K for  all  (x,  y)  in  R. 

Then  the  initial  value  problem  (1)  has  at  least  one  solution  y(x).  This  solution  exists 
at  least  for  all  x in  the  subinterval  \x  — v'ol  < a of  the  interval  \x  — xol  < a; 
here,  a is  the  smaller  of  the  two  numbers  a and  b/K. 
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y 


R 

? 

1 

1 

1 

Fig.  26.  Rectangle  R in  the  existence  and  uniqueness  theorems 


( Example  of  Boundedness.  The  function /(x,  y)  = x2  + y2  is  bounded  (with  K = 2)  in  the 
square  \x\  < 1 , | v < 1.  The  function  f(x,  y)  = tan  (x  + y)  is  not  bounded  for 
\x  + y|  < 77/2.  Explain!) 


THEOREM  2 


Uniqueness  Theorem 

Let  f and  its  partial  derivative  fy  = df/dy  be  continuous  for  all  ( x , y)  in  the  rectangle 
R (Fig.  26)  and  bounded , say, 

(3)  (a)  |/(x,  y)  | ^ K,  (b)  \fy(x,y)\  ^ M for  all  (x,  y)  in  R. 

Then  the  initial  value  problem  (1)  has  at  most  one  solution  ~y(x).  Thus,  by  Theorem  1, 
the  problem  has  precisely  one  solution.  This  solution  exists  at  least  for  all  x in  that 
subinterval  \x  — xol  < a. 


Understanding  These  Theorems 

These  two  theorems  take  care  of  almost  all  practical  cases.  Theorem  1 says  that  if  f(x,  y) 
is  continuous  in  some  region  in  the  xy-plane  containing  the  point  (xo,  Vo),  then  the  initial 
value  problem  ( 1 ) has  at  least  one  solution. 

Theorem  2 says  that  if,  moreover,  the  partial  derivative  df/ dy  of  / with  respect  to  y 
exists  and  is  continuous  in  that  region,  then  (1)  can  have  at  most  one  solution;  hence,  by 
Theorem  1,  it  has  precisely  one  solution. 

Read  again  what  you  have  just  read — these  are  entirely  new  ideas  in  our  discussion. 

Proofs  of  these  theorems  are  beyond  the  level  of  this  book  (see  Ref.  [All]  in  App.  1); 
however,  the  following  remarks  and  examples  may  help  you  to  a good  understanding  of 
the  theorems. 

Since  y'  =f(x,y),  the  condition  (2)  implies  that  \y'  Si  K\  that  is,  the  slope  of  any 
solution  curve  y(x)  in  R is  at  least  —K  and  at  most  K.  Hence  a solution  curve  that  passes 
through  the  point  (xo,  yo)  must  lie  in  the  colored  region  in  Fig.  27  bounded  by  the  lines 
/ 1 and  1 2 whose  slopes  are  —K  and  K,  respectively.  Depending  on  the  form  of  R,  two 
different  cases  may  arise.  In  the  first  case,  shown  in  Fig.  27a,  we  have  b/K  is  a and 
therefore  a = a in  the  existence  theorem,  which  then  asserts  that  the  solution  exists  for  all 
x between  xo  — a and  xo  + a.  In  the  second  case,  shown  in  Fig.  27b,  we  have  b/K  < a. 
Therefore,  a = b/K  < a,  and  all  we  can  conclude  from  the  theorems  is  that  the  solution 
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EXAMPLE  1 


exists  for  all  x between  x0  — b/K  and  xo  + b/K.  For  larger  or  smaller  x’s  the  solution 
curve  may  leave  the  rectangle  R,  and  since  nothing  is  assumed  about  / outside  R,  nothing 
can  be  concluded  about  the  solution  for  those  larger  or  amaller  x’s;  that  is,  for  such  x’s 
the  solution  may  or  may  not  exist — we  don’t  know. 


(a)  (6) 

Fig.  27.  The  condition  (2)  of  the  existence  theorem,  (a)  First  case,  (b)  Second  case 


Let  us  illustrate  our  discussion  with  a simple  example.  We  shall  see  that  our  choice  of 
a rectangle  R with  a large  base  (a  long  x-interval)  will  lead  to  the  case  in  Fig.  27b. 


Choice  of  a Rectangle 

Consider  the  initial  value  problem 


y'  = 1 + v2,  y(0)  = 0 


and  take  the  rectangle  R:  \x\  < 5,  | y < 3.  Then  a = 5,  b = 3,  and 


I/O,  7)1 

V 

dy 


= |l  + y2|  £ K = 10, 
= 2|y|  £ M = 6, 


a 


< a. 


Indeed,  the  solution  of  the  problem  is  y = tan*  (see  Sec.  1.3,  Example  1).  This  solution  is  discontinuous  at 
±77/2,  and  there  is  no  continuous  solution  valid  in  the  entire  interval  |*|  < 5 from  which  we  started. 


The  conditions  in  the  two  theorems  are  sufficient  conditions  rather  than  necessary  ones, 
and  can  be  lessened.  In  particular,  by  the  mean  value  theorem  of  differential  calculus  we 
have 


fix,  y2)  - fix,  Vi) 


O' 2 - Ti) 


df 

dy 


y 


y 


where  (x,  yi)  and  (x,  y 2)  are  assumed  to  be  in  R,  and  y is  a suitable  value  between  yx 
and  y2-  From  this  and  (3b)  it  follows  that 

(4)  I fix,  yf)  ~ fix,  Vi)|  S M\y2  - Til  ■ 
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It  can  be  shown  that  (3b)  may  be  replaced  by  the  weaker  condition  (4),  which  is  known 
as  a Lipschitz  condition.9  However,  continuity  of  fix,  y)  is  not  enough  to  guarantee  the 
uniqueness  of  the  solution.  This  may  be  illustrated  by  the  following  example. 


Nonuniqueness 

The  initial  value  problem 

y'  = Vjyl-  y(0)  = o 

has  the  two  solutions 


y = o 


and 


y-i-  = 


r x1 2 3 4 5 */4  if  *so 

l — x2/4  if  X < 0 


although/!*,  y)  = vTyl  is  continuous  for  all  y.  The  Lipschitz  condition  (4)  is  violated  in  any  region  that  includes 
the  line  y = 0,  because  for  yi  = 0 and  positive  y2  we  have 


(5) 


I Rx,yf)  - fix,  y i)| 

\yz  ~ >rl 


'y2 

y2 


(v£  > o) 


y 2 


and  this  can  be  made  as  large  as  we  please  by  choosing  y2  sufficiently  small,  whereas  (4)  requires  that  the 
quotient  on  the  left  side  of  (5)  should  not  exceed  a fixed  constant  M. 


1.  Linear  ODE.  If  p and  r in  y + p(x)y  = r(x)  are 
continuous  for  all  x in  an  interval  \x  — jt0l  — a,  show 
that  fix,  y ) in  this  ODE  satisfies  the  conditions  of  our 
present  theorems,  so  that  a corresponding  initial  value 
problem  has  a unique  solution.  Do  you  actually  need 
these  theorems  for  this  ODE? 

2.  Existence?  Does  the  initial  value  problem 
ix  — 2)y'  = y,  y(2)  = 1 have  a solution?  Does  your 
result  contradict  our  present  theorems? 

3.  Vertical  strip.  If  the  assumptions  of  Theorems  1 and 
2 are  satisfied  not  merely  in  a rectangle  but  in  a vertical 
infinite  strip  | jc  — jc0I  < a,  in  what  interval  will  the 
solution  of  (1)  exist? 

4.  Change  of  initial  condition.  What  happens  in  Prob. 
2 if  you  replace  y(2)  = 1 with  y(2)  = kl 

5.  Length  of  .r-interval.  In  most  cases  the  solution  of  an 

initial  value  problem  ( 1 ) exists  in  an  x-interval  larger  than 

that  guaranteed  by  the  present  theorems.  Show  this  fact 

for  y = 2y2,  y(  I ) = 1 by  finding  the  best  possible  a 


(choosing  b optimally)  and  comparing  the  result  with  the 
actual  solution. 

6.  CAS  PROJECT.  Picard  Iteration,  (a)  Show  that  by 
integrating  the  ODE  in  (1)  and  observing  the  initial 
condition  you  obtain 

(6)  yix)  = y0  + j fit,  yit))dt. 

J*o 

This  form  (6)  of  (1)  suggests  Picard’s  Iteration  Method10 
which  is  defined  by 

(7)  y „ix)  = y0  + f fit,  yn-i(t)  dt,  n = 1, 2,  ■ ■ ■ • 

Ao 

It  gives  approximations  y-\,  y2,  y3,  ■ . .of  the  unknown 
solution  v of  (1).  Indeed,  you  obtain  yq  by  substituting 
y = y0  on  the  right  and  integrating — this  is  the  first 
step — then  y2  by  substituting  y = Vi  on  the  right  and 
integrating — this  is  the  second  step — and  so  on.  Write 


9RUDOLF  LIPSCHITZ  (1832-1903),  German  mathematician.  Lipschitz  and  similar  conditions  are  important 
in  modern  theories,  for  instance,  in  partial  differential  equations. 

10EMILE  PICARD  (1856-1941).  French  mathematician,  also  known  for  his  important  contributions  to 
complex  analysis  (see  Sec.  16.2  for  his  famous  theorem).  Picard  used  his  method  to  prove  Theorems  1 and  2 
as  well  as  the  convergence  of  the  sequence  (7)  to  the  solution  of  (1).  In  precomputer  times,  the  iteration  was  of 
little  practical  value  because  of  the  integrations. 
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a program  of  the  iteration  that  gives  a printout  of  the 
first  approximations  yo,  Vi.  . . . , yjv as  well  as  their 
graphs  on  common  axes.  Try  your  program  on  two 
initial  value  problems  of  your  own  choice. 

(b)  Apply  the  iteration  to  y'  = x + y,  y(0)  = 0.  Also 
solve  the  problem  exactly. 

(c)  Apply  the  iteration  to  y = 2 y2,  y(0)  = 1 . Also 
solve  the  problem  exactly. 

(d)  Find  all  solutions  of  y = 2Vy,  y(l ) = 0.  Which 
of  them  does  Picard’s  iteration  approximate? 

(e)  Experiment  with  the  conjecture  that  Picard’s 
iteration  converges  to  the  solution  of  the  problem  for 
any  initial  choice  of  y in  the  integrand  in  (7)  (leaving 
y0  outside  the  integral  as  it  is).  Begin  with  a simple  ODE 
and  see  what  happens.  When  you  are  reasonably  sure, 
take  a slightly  more  complicated  ODE  and  give  it  a try. 


7.  Maximum  a.  What  is  the  largest  possible  a in 
Example  1 in  the  text? 

8.  Lipschitz  condition.  Show  that  for  a linear  ODE 
y + p(x)y  = r( x)  with  continuous  p and  r in 
\x  — jc0|  S a a Lipschitz  condition  holds.  This  is 
remarkable  because  it  means  that  for  a linear  ODE  the 
continuity  of  fix,  y)  guarantees  not  only  the  existence 
but  also  the  uniqueness  of  the  solution  of  an  initial 
value  problem.  (Of  course,  this  also  follows  directly 
from  (4)  in  Sec.  1.5.) 

9.  Common  points.  Can  two  solution  curves  of  the  same 
ODE  have  a common  point  in  a rectangle  in  which  the 
assumptions  of  the  present  theorems  are  satisfied? 

10.  Three  possible  cases.  Find  all  initial  conditions  such 
that  (x2  — x)y'  = (2jc  — l)yhas  no  solution,  precisely 
one  solution,  and  more  than  one  solution. 


GFF^PTE  RlREvrEWQUES  T IONS  AND  PROBLEMS 


1.  Explain  the  basic  concepts  ordinary  and  partial 
differential  equations  (ODEs,  PDEs),  order,  general 
and  particular  solutions,  initial  value  problems  (IVPs). 
Give  examples. 

2.  What  is  a linear  ODE?  Why  is  it  easier  to  solve  than 
a nonlinear  ODE? 

3.  Does  every  first-order  ODE  have  a solution?  A solution 
formula?  Give  examples. 

4.  What  is  a direction  field?  A numeric  method  for  first- 
order  ODEs? 

5.  What  is  an  exact  ODE?  Is  /( x)  dx  + g(y)  dy  = 0 
always  exact? 

6.  Explain  the  idea  of  an  integrating  factor.  Give  two 
examples. 

7.  What  other  solution  methods  did  we  consider  in  this 
chapter? 

8.  Can  an  ODE  sometimes  be  solved  by  several  methods? 
Give  three  examples. 

9.  What  does  modeling  mean?  Can  a CAS  solve  a model 
given  by  a first-order  ODE?  Can  a CAS  set  up  a model? 

10.  Give  problems  from  mechanics,  heat  conduction,  and 
population  dynamics  that  can  be  modeled  by  first-order 
ODEs. 
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DIRECTION  FIELD:  NUMERIC  SOLUTION 


Graph  a direction  field  (by  a CAS  or  by  hand)  and  sketch 
some  solution  curves.  Solve  the  ODE  exactly  and  compare. 
In  Prob.  16  use  Euler’s  method. 


11.  y + 2y  = 0 

12.  y = 1 - y2 

13.  y'  = y - 4y2 


14.  xy  = y + x2 

15.  y + y = 1.01  cos  10.r 

16.  Solve  y — y ~ yZ,  y(0)  = 0.2  by  Euler’s  method 
(10  steps,  h = 0.1).  Solve  exactly  and  compute  the  error. 
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GENERAL  SOLUTION 


Find  the  general  solution.  Indicate  which  method  in  this 
chapter  you  are  using.  Show  the  details  of  your  work. 

17.  y + 2.5y  = 1 ,6x 

18.  y — 0.4y  = 29  sin  x 

19.  25yy'  - 4x  = 0 

20.  y = ay  + by 2 (a  A 0) 

21.  (3xev  + 2y)  dx  + ( x2 ev  + x)  dy  — 0 
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INITIAL  VALUE  PROBLEM  (IVP) 


Solve  the  IVP.  Indicate  the  method  used.  Show  the  details 
of  your  work. 

22.  y + 4xv  = e~2^,  y(0)  = -4.3 

23.  y = Vl  - y2,  y(0)  = 1/V2 

24.  y'  + 2>’  = y3,  y(0)  = § 

25.  3 secy  dx  + gsecj cdy  = 0,  v(0)  = 0 

26.  xsinh  y dy  = cosh  yrfa,  y(3)  = 0 
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MODELING,  APPLICATIONS 


27.  Exponential  growth.  If  the  growth  rate  of  a culture 
of  bacteria  is  proportional  to  the  number  of  bacteria 
present  and  after  1 day  is  1.25  times  the  original 
number,  within  what  interval  of  time  will  the  number 
of  bacteria  (a)  double,  (b)  triple? 
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28.  Mixing  problem.  The  tank  in  Fig.  28  contains  80  lb 
of  salt  dissolved  in  500  gal  of  water.  The  inflow  per 
minute  is  20  lb  of  salt  dissolved  in  20  gal  of  water.  The 
outflow  is  20  gal/min  of  the  uniform  mixture.  Find  the 
time  when  the  salt  content  y(t)  in  the  tank  reaches  95% 
of  its  limiting  value  (as  t — » oo). 


29.  Half-life.  If  in  a reactor,  uranium  2g7U  loses  10%  of 
its  weight  within  one  day,  what  is  its  half-life?  How 
long  would  it  take  for  99%  of  the  original  amount  to 
disappear? 

30.  Newton’s  law  of  cooling.  A metal  bar  whose 
temperature  is  20  °C  is  placed  in  boiling  water.  How 
long  does  it  take  to  heat  the  bar  to  practically  100°C, 
say,  to  99.9 °Q  if  the  temperature  of  the  bar  after  1 min 
of  heating  is  5 1.5  °C?  First  guess,  then  calculate. 


Fig.  28.  Tank  in  Problem  28 


SUMMARYQF  CH  APTER  1 

First-Order  ODEs 


This  chapter  concerns  ordinary  differential  equations  (ODEs)  of  first  order  and 

their  applications.  These  are  equations  of  the  form 

( 1 ) F(x,y,y')  = 0 or  in  explicit  form  y'  =f(x,y) 

involving  the  derivative  y = dy/dx  of  an  unknown  function  y,  given  functions  of 
x,  and,  perhaps,  y itself.  If  the  independent  variable  x is  time,  we  denote  it  by  t. 

In  Sec.  1.1  we  explained  the  basic  concepts  and  the  process  of  modeling,  that  is, 
of  expressing  a physical  or  other  problem  in  some  mathematical  form  and  solving 
it.  Then  we  discussed  the  method  of  direction  fields  (Sec.  1.2),  solution  methods 
and  models  (Secs.  1.3-1. 6),  and,  finally,  ideas  on  existence  and  uniqueness  of 
solutions  (Sec.  1.7). 

A first-order  ODE  usually  has  a general  solution,  that  is,  a solution  involving  an 
arbitrary  constant,  which  we  denote  by  c.  In  applications  we  usually  have  to  find  a 
unique  solution  by  determining  a value  of  c from  an  initial  condition  y(jc0)  = yo- 
Together  with  the  ODE  this  is  called  an  initial  value  problem 

(2)  y = f(x,  y),  y(x0)  = yo  (*o,  yo  given  numbers) 

and  its  solution  is  a particular  solution  of  the  ODE.  Geometrically,  a general 
solution  represents  a family  of  curves,  which  can  be  graphed  by  using  direction 
fields  (Sec.  1.2).  And  each  particular  solution  corresponds  to  one  of  these  curves. 
A separable  ODE  is  one  that  we  can  put  into  the  form 

(3)  g(y)  dy  = f(x)  dx  (Sec.  1.3) 

by  algebraic  manipulations  (possibly  combined  with  transformations,  such  as 
y/x  = u)  and  solve  by  integrating  on  both  sides. 


Summary  of  Chapter  1 
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An  exact  ODE  is  of  the  form 


(4) 


M(x,  y ) dx  + N(x,  y)  dy  = 0 


(Sec.  1.4) 


where  M dx  + N dy  is  the  differential 


du  — uxdx  + uy  dy 


of  a function  u(x,  y),  so  that  from  du  = 0 we  immediately  get  the  implicit  general 
solution  u(x,  y)  = c.  This  method  extends  to  nonexact  ODEs  that  can  be  made  exact 
by  multiplying  them  by  some  function  Fix,  y,),  called  an  integrating  factor  (Sec.  1.4). 
Linear  ODEs 


are  very  important.  Their  solutions  are  given  by  the  integral  formula  (4),  Sec.  1.5. 
Certain  nonlinear  ODEs  can  be  transformed  to  linear  form  in  terms  of  new  variables. 
This  holds  for  the  Bernoulli  equation 


Applications  and  modeling  are  discussed  throughout  the  chapter,  in  particular  in 
Secs.  1.1,  1.3,  1.5  {population  dynamics , etc.),  and  1.6  ( trajectories ). 

Picard’s  existence  and  uniqueness  theorems  are  explained  in  Sec.  1.7  (and 
Picard’s  iteration  in  Problem  Set  1.7). 

Numeric  methods  for  first-order  ODEs  can  be  studied  in  Secs.  21.1  and  21.2 
immediately  after  this  chapter,  as  indicated  in  the  chapter  opening. 


(5) 


y + P(x)y  = r(x) 


y'  + p(x)y  = g(x)ya 


(Sec.  1.5). 


CHAPTER  2 


Second-Order  Linear  ODEs 


Many  important  applications  in  mechanical  and  electrical  engineering,  as  shown  in  Secs. 
2.4,  2.8,  and  2.9,  are  modeled  by  linear  ordinary  differential  equations  (linear  ODEs)  of  the 
second  order.  Their  theory  is  representative  of  all  linear  ODEs  as  is  seen  when  compared 
to  linear  ODEs  of  third  and  higher  order,  respectively.  However,  the  solution  formulas  for 
second-order  linear  ODEs  are  simpler  than  those  of  higher  order,  so  it  is  a natural  progression 
to  study  ODEs  of  second  order  first  in  this  chapter  and  then  of  higher  order  in  Chap.  3. 

Although  ordinary  differential  equations  (ODEs)  can  be  grouped  into  linear  and  nonlinear 
ODEs,  nonlinear  ODEs  are  difficult  to  solve  in  contrast  to  linear  ODEs  for  which  many 
beautiful  standard  methods  exist. 

Chapter  2 includes  the  derivation  of  general  and  particular  solutions,  the  latter  in 
connection  with  initial  value  problems. 

For  those  interested  in  solution  methods  for  Legendre’s,  Bessel’s,  and  the  hypergeometric 
equations  consult  Chap.  5 and  for  Sturm-Liouville  problems  Chap.  11. 

COMMENT  Numerics  for  second-order  ODEs  can  be  studied  immediately  after  this 
chapter.  See  Sec.  21.3,  which  is  independent  of  other  sections  in  Chaps.  19-21. 

Prerequisite:  Chap.  1,  in  particular,  Sec.  1.5. 

Sections  that  may  be  omitted  in  a shorter  course:  2.3,  2.9,  2.10. 

References  and  Answers  to  Problems:  App.  1 Part  A,  and  App.  2. 


2.!  Homogeneous  Linear  ODEs  of  Second  Order 

We  have  already  considered  first-order  linear  ODEs  (Sec.  1.5)  and  shall  now  define  and 
discuss  linear  ODEs  of  second  order.  These  equations  have  important  engineering 
applications,  especially  in  connection  with  mechanical  and  electrical  vibrations  (Secs.  2.4, 
2.8,  2.9)  as  well  as  in  wave  motion,  heat  conduction,  and  other  parts  of  physics,  as  we 
shall  see  in  Chap.  12. 

A second-order  ODE  is  called  linear  if  it  can  be  written 

(1)  y"  + p(x)y'  + q(x)y  = r(x) 

and  nonlinear  if  it  cannot  be  written  in  this  form. 

The  distinctive  feature  of  this  equation  is  that  it  is  linear  in  y and  its  derivatives,  whereas 
the  functions  p,  q,  and  r on  the  right  may  be  any  given  functions  of  x.  If  the  equation 
begins  with,  say  ,f(x)y",  then  divide  by  fix)  to  have  the  standard  form  (1)  with  y"  as  the 
first  term. 
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EXAMPLE  1 


The  definitions  of  homogeneous  and  nonhomogenous  second-order  linear  ODEs  are 
very  similar  to  those  of  first-order  ODEs  discussed  in  Sec.  1.5.  Indeed,  if  r(x ) = 0 (that 
is,  r(x)  = 0 for  all  x considered;  read  “r(x)  is  identically  zero”),  then  (1)  reduces  to 

(2)  y"  + p(x)y'  + q(x)y  = 0 

and  is  called  homogeneous.  If  r(x)  # 0,  then  (1)  is  called  nonhomogeneous.  This  is 
similar  to  Sec.  1.5. 

An  example  of  a nonhomogeneous  linear  ODE  is 

y"  + 25y  = e~xcosx, 
and  a homogeneous  linear  ODE  is 

xy"  + y + xy  = 0,  written  in  standard  form  y"  + f y'  + y = 0. 
Finally,  an  example  of  a nonlinear  ODE  is 

n , f2  n 

y y + y = o. 

The  functions  p and  q in  (1)  and  (2)  are  called  the  coefficients  of  the  ODEs. 
Solutions  are  defined  similarly  as  for  first-order  ODEs  in  Chap.  1.  A function 

y = Kx) 

is  called  a solution  of  a (linear  or  nonlinear)  second-order  ODE  on  some  open  interval  I 
if  h is  defined  and  twice  differentiable  throughout  that  interval  and  is  such  that  the  ODE 
becomes  an  identity  if  we  replace  the  unknown  y by  h,  the  derivative  y by  h , and  the 
second  derivative  y by  h" . Examples  are  given  below. 


Homogeneous  Linear  ODEs:  Superposition  Principle 

Sections  2. 1-2.6  will  be  devoted  to  homogeneous  linear  ODEs  (2)  and  the  remaining 
sections  of  the  chapter  to  nonhomogeneous  linear  ODEs. 

Linear  ODEs  have  a rich  solution  structure.  For  the  homogeneous  equation  the  backbone 
of  this  structure  is  the  superposition  principle  or  linearity  principle,  which  says  that  we 
can  obtain  further  solutions  from  given  ones  by  adding  them  or  by  multiplying  them  with 
any  constants.  Of  course,  this  is  a great  advantage  of  homogeneous  linear  ODEs.  Let  us 
first  discuss  an  example. 


Homogeneous  Linear  ODEs:  Superposition  of  Solutions 

The  functions  y = cos  x and  y = sin  x are  solutions  of  the  homogeneous  linear  ODE 

y"  + y = o 

for  all  x.  We  verify  this  by  differentiation  and  substitution.  We  obtain  (cos  x)"  = — cos  x;  hence 


y ' + y = (cos  x)  + cos  x = —cos  x + cos  x = 0. 
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THEOREM  1 


PROOF 


EXAMPLE  2 


EXAMPLE  3 


Similarly  for  y = sin  x (verify!).  We  can  go  an  important  step  further.  We  multiply  cos  x by  any  constant,  for 
instance,  4.7,  and  sin  x by,  say,  —2,  and  take  the  sum  of  the  results,  claiming  that  it  is  a solution.  Indeed, 
differentiation  and  substitution  gives 

(4.7  cos  x — 2 sin  x)"  + (4.7  cos  x — 2 sin  x)  = —4.7  cos  x + 2 sin  x + 4.7  cos  x — 2 sin  x = 0. 


In  this  example  we  have  obtained  from  vi  (=  cos  x)  and  y2  ( = sin  x)  a function  of  the  form 

(3)  y = ciyi  + c2y2  (cy,  c2  arbitrary  constants). 

This  is  called  a linear  combination  of  yi  and  y2.  In  terms  of  this  concept  we  can  now 
formulate  the  result  suggested  by  our  example,  often  called  the  superposition  principle 
or  linearity  principle. 


Fundamental  Theorem  for  the  Homogeneous  Linear  ODE  (2) 

For  a homogeneous  linear  ODE  (2),  any  linear  combination  of  two  solutions  on  an 
open  interval  I is  again  a solution  of  (2)  on  I.  In  particular,  for  such  an  equation, 
sums  and  constant  multiples  of  solutions  are  again  solutions. 


Let  v i and  y2  be  solutions  of  (2)  on  I.  Then  by  substituting  y = ay-y  + c2y2  and 
its  derivatives  into  (2),  and  using  the  familiar  rule  ( r: j y i + c2y2) ' = ciyi  + c2y2,  etc., 
we  get 

y"  + py'  + qy  = (ci>’i  + c2y2)"  + p(ayi  + c2y2y  + qiciyi  + c2y2 ) 

= ay"  + c2y2  + p(ay[  + c2y2)  + qiayy  + c2y2) 

= C'i()’i  + py'i  + qy\)  + C2(y2  + py2  + qy2)  = 0, 

since  in  the  last  line,  (•••)  = 0 because  yj  and  y2  are  solutions,  by  assumption.  This  shows 
that  y is  a solution  of  (2)  on  I. 

CAUTION!  Don’t  forget  that  this  highly  important  theorem  holds  for  homogeneous 
linear  ODEs  only  but  does  not  hold  for  nonhomogeneous  linear  or  nonlinear  ODEs,  as 
the  following  two  examples  illustrate. 

A Nonhomogeneous  Linear  ODE 

Verify  by  substitution  that  the  functions  y = 14-  cos  x and  y = 1 + sin  x are  solutions  of  the  nonhomogeneous 
linear  ODE 


y"  + y=  i. 


but  their  sum  is  not  a solution.  Neither  is,  for  instance,  2(1  + cos  x)  or  5(1  + sin  a). 


A Nonlinear  ODE 

Verify  by  substitution  that  the  functions  y = x2  and  y = 1 are  solutions  of  the  nonlinear  ODE 

y"y  — xy'  — 0, 


but  their  sum  is  not  a solution.  Neither  is  — x2,  so  you  cannot  even  multiply  by  — 1 ! 


SEC.  2.1  Homogeneous  Linear  ODEs  of  Second  Order 
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Initial  Value  Problem.  Basis.  General  Solution 

Recall  from  Chap.  1 that  for  a first-order  ODE,  an  initial  value  problem  consists  of  the 
ODE  and  one  initial  condition  v(x0)  = )'o.  The  initial  condition  is  used  to  determine  the 
arbitrary  constant  c in  the  general  solution  of  the  ODE.  This  results  in  a unique  solution, 
as  we  need  it  in  most  applications.  That  solution  is  called  a particular  solution  of  the 
ODE.  These  ideas  extend  to  second-order  ODEs  as  follows. 

For  a second-order  homogeneous  linear  ODE  (2)  an  initial  value  problem  consists  of 
(2)  and  two  initial  conditions 

(4)  ;y(*o)  = K0,  y\x  o)  = Kx. 

These  conditions  prescribe  given  values  Kq  and  K \ of  the  solution  and  its  first  derivative 
(the  slope  of  its  curve)  at  the  same  given  x = in  the  open  interval  considered. 

The  conditions  (4)  are  used  to  determine  the  two  arbitrary  constants  ci  and  c2  in  a 

general  solution 

(5)  y = cxy  i + c2y2 

of  the  ODE;  here,  yi  and  y2  are  suitable  solutions  of  the  ODE,  with  “suitable”  to  be 
explained  after  the  next  example.  This  results  in  a unique  solution,  passing  through  the 
point  (x o,  Kq)  with  K x as  the  tangent  direction  (the  slope)  at  that  point.  That  solution  is 
called  a particular  solution  of  the  ODE  (2). 

EXAMPLE  4 Initial  Value  Problem 

Solve  the  initial  value  problem 

/ + y = 0,  y(0)  = 3.0,  /( 0)  = -0.5. 

Solution.  Step  1.  General  solution.  The  functions  cos  x and  sin  v are  solutions  of  the  ODE  (by  Example  1), 
and  we  take 


Fig.  29.  Particular  solution 
and  initial  tangent  in 
Example  4 


y = C\  cos  x + C2  sin  x. 

This  will  turn  out  to  be  a general  solution  as  defined  below. 

Step  2.  Particular  solution.  We  need  the  derivative  y = — C\  sin  x + c%  cos  x.  From  this  and  the 
initial  values  we  obtain,  since  cos  0=1  and  sin  0 = 0, 

y(0)  c'i  = 3.0  and  y;(0)  = c2  = —0.5. 

This  gives  as  the  solution  of  our  initial  value  problem  the  particular  solution 

y = 3.0  cos  x — 0.5  sinx. 

Figure  29  shows  that  at  x — 0 it  has  the  value  3.0  and  the  slope  —0.5,  so  that  its  tangent  intersects 
the  x-axis  atx  = 3.0/0.5  = 6.0  . (The  scales  on  the  axes  differ!) 


Observation.  Our  choice  of  yi  and  y 2 was  general  enough  to  satisfy  both  initial 
conditions.  Now  let  us  take  instead  two  proportional  solutions  yi  = cos  x and  y2  = k cos  x , 
so  that  yi/y2  = 1 Ik  = const.  Then  we  can  write  y = ciyi  + C2^2  in  the  form 


y = Ci  cos  x + C2(k  cos  x)  = C cos  x where  C = c\  + C2 k. 
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Hence  we  are  no  longer  able  to  satisfy  two  initial  conditions  with  only  one  arbitrary 
constant  C.  Consequently,  in  defining  the  concept  of  a general  solution,  we  must  exclude 
proportionality.  And  we  see  at  the  same  time  why  the  concept  of  a general  solution  is  of 
importance  in  connection  with  initial  value  problems. 


DEFINITION 


General  Solution,  Basis,  Particular  Solution 

A general  solution  of  an  ODE  (2)  on  an  open  interval  I is  a solution  (5)  in  which 
y1  and  y2  are  solutions  of  (2)  on  I that  are  not  proportional,  and  c | and  C2  are  arbitrary 
constants.  These  jq,  y2  are  called  a basis  (or  a fundamental  system)  of  solutions 
of  (2)  on  /. 

A particular  solution  of  (2)  on  / is  obtained  if  we  assign  specific  values  to  cq 
and  c2  in  (5). 


For  the  definition  of  an  interval  see  Sec.  1.1.  Furthermore,  as  usual,  y 1 and  y2  are  called 
proportional  on  I if  for  all  x on  /, 

(6)  (a)  y!  = ky2  or  (b)  y2  = fyi 

where  k and  / are  numbers,  zero  or  not.  (Note  that  (a)  implies  (b)  if  and  only  if  k A 0). 

Actually,  we  can  reformulate  our  definition  of  a basis  by  using  a concept  of  general 
importance.  Namely,  two  functions  >q  and  y2  are  called  linearly  independent  on  an 
interval  I where  they  are  defined  if 

(7)  k j Vj Cv)  + k2y2(x)  = 0 everywhere  on  I implies  k j = 0 and  k2  = 0. 

And  yq  and  y2  are  called  linearly  dependent  on  I if  (7)  also  holds  for  some  constants  k 
k 2 not  both  zero.  Then,  if  k j A 0 or  k2  A 0,  we  can  divide  and  see  that  yi  and  y2  are 
proportional, 


or 


In  contrast,  in  the  case  of  linear  independence  these  functions  are  not  proportional  because 
then  we  cannot  divide  in  (7).  This  gives  the  following 


DEFINITION 


Basis  (Reformulated) 

A basis  of  solutions  of  (2)  on  an  open  interval  I is  a pair  of  linearly  independent 
solutions  of  (2)  on  /. 


If  the  coefficients  p and  q of  (2)  are  continuous  on  some  open  interval  /,  then  (2)  has  a 
general  solution.  It  yields  the  unique  solution  of  any  initial  value  problem  (2),  (4).  It 
includes  all  solutions  of  (2)  on  /;  hence  (2)  has  no  singular  solutions  (solutions  not 
obtainable  from  of  a general  solution;  see  also  Problem  Set  1.1).  All  this  will  be  shown 
in  Sec.  2.6. 


SEC.  2.1  Homogeneous  Linear  ODEs  of  Second  Order 
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EXAMPLE  5 


EXAMPLE  6 


EXAMPLE  7 


Basis,  General  Solution,  Particular  Solution 

cos  x and  sin  x in  Example  4 form  a basis  of  solutions  of  the  ODE  y + y = 0 for  all  x because  their 
quotient  is  cot  x # const  (or  tan  x i const).  Hence  y = c,  cos  x + c2  sin  x is  a general  solution.  The  solution 
y = 3.0  cos  x — 0.5  sin  x of  the  initial  value  problem  is  a particular  solution. 

Basis,  General  Solution,  Particular  Solution 

Verify  by  substitution  that  yy  = ex  and  y2  = e~x  are  solutions  of  the  ODE  y"  — y = 0.  Then  solve  the  initial 
value  problem 

y"  ~ y = 0,  y(0)  = 6,  y'(0)  = -2. 

Solution.  (ex)"  - ex  = 0 and  (e~x)"  - e~x  = 0 show  that  ex  and  e~x  are  solutions.  They  are  not 
proportional,  exle~x  = e2x  =£  const.  Hence  ex,  e~x  form  a basis  for  all  x.  We  now  write  down  the  corresponding 
general  solution  and  its  derivative  and  equate  their  values  at  0 to  the  given  initial  conditions, 

y = c±ex  + c2e~x,  y = cxex  - c2e~x , y(0)  = a + c2  = 6,  y'(0)  = cx  - c2  = ~2. 

By  addition  and  subtraction,  C\  = 2,  c2  = 4,  so  that  the  answer  is  y = 2ex  + 4e~x.  This  is  the  particular  solution 
satisfying  the  two  initial  conditions. 

Find  a Basis  if  One  Solution  Is  Known. 

Reduction  of  Order 

It  happens  quite  often  that  one  solution  can  be  found  by  inspection  or  in  some  other  way. 
Then  a second  linearly  independent  solution  can  be  obtained  by  solving  a first-order  ODE. 
This  is  called  the  method  of  reduction  of  order.1  We  first  show  how  this  method  works 
in  an  example  and  then  in  general. 

Reduction  of  Order  if  a Solution  Is  Known.  Basis 

Find  a basis  of  solutions  of  the  ODE 


(x2  ~ x)y"  — xy'  + y = 0. 

Solution.  Inspection  shows  that  yi  = x is  a solution  because  y±  = l and  yf{  = 0,  so  that  the  first  term 
vanishes  identically  and  the  second  and  third  terms  cancel.  The  idea  of  the  method  is  to  substitute 

y = uyi  = ux,  y'  = u x + u,  y"  = u"x  + 2 u 


into  the  ODE.  This  gives 

(x2  — x){u"  x + 2 u)  — x{u'  x + u)  + ux  = 0. 

ux  and  —xu  cancel  and  we  are  left  with  the  following  ODE,  which  we  divide  by  x,  order,  and  simplify, 

(x2  — x){u'  x + 2 u)  — x2u  = 0,  (x2  — x)u"  + (x  — 2 )u  = 0. 

This  ODE  is  of  first  order  in  u = u , namely,  (x2  — x)vr  + (x  — 2)v  = 0.  Separation  of  variables  and  integration 
gives 

— — — o dx  = ( 1 dx.  In  | v | = In  |x  — 1 1 — 2 In  |x|  = In  - — 2 — ■ 

V X X \X  1 X J X 


1Credited  to  the  great  mathematician  JOSEPH  LOUIS  LAGRANGE  (1736-1813),  who  was  bom  in  Turin, 

of  French  extraction,  got  his  first  professorship  when  he  was  19  (at  the  Military  Academy  of  Turin),  became 
director  of  the  mathematical  section  of  the  Berlin  Academy  in  1766,  and  moved  to  Paris  in  1787.  His  important 
major  work  was  in  the  calculus  of  variations,  celestial  mechanics,  general  mechanics  ( Mecanique  analytique, 
Paris,  1788),  differential  equations,  approximation  theory,  algebra,  and  number  theory. 
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We  need  no  constant  of  integration  because  we  want  to  obtain  a particular  solution;  similarly  in  the  next 
integration.  Taking  exponents  and  integrating  again,  we  obtain 

v = — o — = 2’  u = I v dx  = In  \x\  H — , hence  y%  = ux  = x In  \x\  + 1. 

xxx  J x 

Since  yi  = x and  y%  = x In  \x\  + 1 are  linearly  independent  (their  quotient  is  not  constant),  we  have  obtained 
a basis  of  solutions,  valid  for  all  positive  x. 


In  this  example  we  applied  reduction  of  order  to  a homogeneous  linear  ODE  [see  (2)] 

y"  + p(x)y'  + q(x)y  = 0. 

Note  that  we  now  take  the  ODE  in  standard  form,  with  y",  not  f(x)y” — this  is  essential 
in  applying  our  subsequent  formulas.  We  assume  a solution  yi  of  (2),  on  an  open  interval 
/,  to  be  known  and  want  to  find  a basis.  For  this  we  need  a second  linearly  independent 
solution  yi  of  (2)  on  I.  To  get  yi,  we  substitute 

y = y2  = uyi,  y = y2  = « yi  + uyi,  y = V2  = M yi  + 2u  yx  + uyx 

into  (2).  This  gives 

(8)  u"yi  + 2 u'y[  + uy'{  + p{u'y\  + uy[)  + quyi  = 0. 

Collecting  terms  in  u",  u' , and  u,  we  have 

u"yi  + u\2y[  + pyy)  + u(y"  + py[  + qyy)  = 0. 


Now  comes  the  main  point.  Since  yi  is  a solution  of  (2),  the  expression  in  the  last 
parentheses  is  zero.  Hence  u is  gone,  and  we  are  left  with  an  ODE  in  u and  u" . We  divide 
this  remaining  ODE  by  yi  and  set  w ' = U,u  = U , 


, 2 y[  + py1 

u = 0, 

yi 


thus 


U'  + 


2y[ 

yi 


+ p U=  0. 


This  is  the  desired  first-order  ODE,  the  reduced  ODE.  Separation  of  variables  and 
integration  gives 


dU 


U 


2 yi 


yi 


— = — I — h p)  dx  and  In  | U\  = —2  In  |vil 


p dx. 


By  taking  exponents  we  finally  obtain 


(9) 


—fpdx 


Here  U = u',  so  that  u = J’  U dx.  Hence  the  desired  second  solution  is 


V2  = yi  u = y i 


U dx. 


The  quotient  y2/y\  = u = f U dx  cannot  be  constant  (since  U > 0),  so  that  yt  and  y2  form 
a basis  of  solutions. 
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REDUCTION  OF  ORDER  is  important  because  it 
gives  a simpler  ODE.  A general  second-order  ODE 
F(x,  y,y , y" ) = 0,  linear  or  not,  can  be  reduced  to  first 
order  if  y does  not  occur  explicitly  (Prob.  1)  or  if  x does  not 
occur  explicitly  (Prob.  2)  or  if  the  ODE  is  homogeneous 
linear  and  we  know  a solution  (see  the  text). 

1.  Reduction.  Show  that  F(x,y',y")  = 0 can  be 

reduced  to  first  order  in  z — y (from  which  y follows 
by  integration).  Give  two  examples  of  your  own. 

2.  Reduction.  Show  that  F(y,  y,  y")  = 0 can  be 

reduced  to  a first-order  ODE  with  y as  the  independent 
variable  and  y"  = ( dzJdy)z , where  z = y\  derive  this 
by  the  chain  rule.  Give  two  examples. 


3-10 


REDUCTION  OF  ORDER 


Reduce  to  first  order  and  solve,  showing  each  step  in  detail. 


3. 

// 

v 

o 

II 

+ 

4. 

2xy 

rr  ->  r 

= 3y 

5. 

n 

yy 

II 

to 

6. 

n 

xy 

+ 2y'  + xy  = 

0,  yi 

= (cos  x)/x 

7. 

ft 

y 

+ y'3  sin  y = 0 

8. 

n 

y 

= 1 + y'2 

9. 

x2y 

" — 5xy'  + 9y 

= o. 

II 

* 

W 

10. 

y" 

+ (l  + i/v)y'2 

= 0 

11 

-14 

APPLICATH 

DNS  OF  REDUCIBLE  ODEs 

curve  through  the  origin  in 

11. 

Curve.  Find  the 

xy-plane  which  satisfies  y"  — 2 y'  and  whose  tangent 
at  the  origin  has  slope  1 . 


12.  Hanging  cable.  It  can  be  shown  that  the  curve  y{x) 
of  an  inextensible  flexible  homogeneous  cable  hanging 
between  two  fixed  points  is  obtained  by  solving 


y"  = k\/ 1 + y'2,  where  the  constant  k depends  on  the 
weight.  This  curve  is  called  catenary  (from  Latin 
catena  = the  chain).  Find  and  graph  y(x),  assuming  that 
k = 1 and  those  fixed  points  are  (—1,  0)  and  (1,0)  in 
a vertical  xy-plane. 

13.  Motion.  If,  in  the  motion  of  a small  body  on  a 
straight  line,  the  sum  of  velocity  and  acceleration  equals 
a positive  constant,  how  will  the  distance  y(f)  depend 
on  the  initial  velocity  and  position? 

14.  Motion.  In  a straight-line  motion,  let  the  velocity  be 
the  reciprocal  of  the  acceleration.  Find  the  distance  y(f) 
for  arbitrary  initial  position  and  velocity. 


15-19 


GENERAL  SOLUTION.  INITIAL  VALUE 
PROBLEM  (IVP) 


(More  in  the  next  set.)  (a)  Verify  that  the  given  functions 
are  linearly  independent  and  form  a basis  of  solutions  of 
the  given  ODE.  (b)  Solve  the  IVP.  Graph  or  sketch  the 
solution. 

15.  4y"  + 25y  = 0,  y(0)  = 3.0,  y'(0)  = -2.5, 
cos  2.5x,  sin  2.5x 

16.  y"  + 0.6 y + 0.09y  = 0,  y(0)  = 2.2,  y'(0)  = 0.14, 
e~03x,xe~03x 

17.  4x2y”  - 3y  = 0,  y(l)  = -3,  y'(l)  = 0, 
x^x"1'2 

18.  x2y"  - xy'  + y = 0,  y(l)  = 4.3,  y'(l)  = 0.5, 
x,  x In  x 

19.  y"  + 2 y + 2y  = 0,  y(0)  = 0,  y'(0)  = 15, 
e~x  cos  x,  e~x  sin  x 

20.  CAS  PROJECT.  Linear  Independence.  Write  a 
program  for  testing  linear  independence  and  depen- 
dence. Try  it  out  on  some  of  the  problems  in  this  and 
the  next  problem  set  and  on  examples  of  your  own. 


Homogeneous  Linear  ODEs 
with  Constant  Coefficients 


We  shall  now  consider  second-order  homogeneous  linear  ODEs  whose  coefficients  a and 
b are  constant, 

(1)  y"  + ay'  + by  = 0. 

These  equations  have  important  applications  in  mechanical  and  electrical  vibrations,  as 
we  shall  see  in  Secs.  2.4,  2.8,  and  2.9. 

To  solve  (1),  we  recall  from  Sec.  1.5  that  the  solution  of  the  first-order  linear  ODE  with 
a constant  coefficient  k 


y'  + ky  = 0 


54 


CHAP.  2 Second-Order  Linear  ODEs 


is  an  exponential  function  y = ce  kx.  This  gives  us  the  idea  to  try  as  a solution  of  (1)  the 
function 

(2)  y = eAx 

Substituting  (2)  and  its  derivatives 

y'  = AeAx  and  y"  = A2eAx 
into  our  equation  (1),  we  obtain 


(A2  + aA  + b)eAx  — 0. 

Hence  if  A is  a solution  of  the  important  characteristic  equation  (or  auxiliary  equation) 

(3)  A2  + a\  + b = 0 

then  the  exponential  function  (2)  is  a solution  of  the  ODE  (1).  Now  from  algebra  we  recall 
that  the  roots  of  this  quadratic  equation  (3)  are 

(4)  Ai  = ^(-a  + Va2  - 4b),  A2  = |(- a - V a2  - 4b). 

(3)  and  (4)  will  be  basic  because  our  derivation  shows  that  the  functions 

(5)  y1  = eXlX  and  y2  = eAzX 

are  solutions  of  (1).  Verify  this  by  substituting  (5)  into  (1). 

From  algebra  we  further  know  that  the  quadratic  equation  (3)  may  have  three  kinds  of 
roots,  depending  on  the  sign  of  the  discriminant  a2  — 4b,  namely, 


(Case  I) 
(Case  II) 
(Case  III) 


Two  real  roots  if  a2  — 4b  > 0, 

A real  double  root  if  a2  — 4b  = 0, 
Complex  conjugate  roots  if  a2  — 4b  < 0. 


Case  I.  Two  Distinct  Real-Roots  and  A2 

In  this  case,  a basis  of  solutions  of  (1)  on  any  interval  is 

yi  = eAl'1  and  y2  = eAzX 

because  vi  and  y2  are  defined  (and  real)  for  all  x and  their  quotient  is  not  constant.  The 
corresponding  general  solution  is 


y = CleAlX  + c2eA2X. 


(6) 
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EXAMPLE  1 


EXAMPLE  2 


General  Solution  in  the  Case  of  Distinct  Real  Roots 

We  can  now  solve  y'  — y = 0 in  Example  6 of  Sec.  2.1  systematically.  The  characteristic  equation  is 
a — 1 = 0.  Its  roots  are  Ai  = 1 and  A2  = — 1 . Hence  a basis  of  solutions  is  ex  and  e~x  and  gives  the  same 
general  solution  as  before, 


y = C\ex  + c2e  x.  I 

Initial  Value  Problem  in  the  Case  of  Distinct  Real  Roots 

Solve  the  initial  value  problem 

/ + y - 2y  = 0,  y(0)  = 4,  y'(0)  = -5. 

Solution.  Step  1.  General  solution.  The  characteristic  equation  is 

A2  + A — 2 = 0. 


Its  roots  are 

Ai  = i(— 1 + V9)  = 1 and  A2  = §(- 1 - V9)  = -2 
so  that  we  obtain  the  general  solution 

x I —2.x 

y = c-xe  + c2e 

Step  2.  Particular  solution.  Since  y\x)  = C\ex  — 2c2e~2x,  we  obtain  from  the  general  solution  and  the  initial 
conditions 


y(0)  = ci  + c2  = 4, 
y'(0)  = ci  - 2c2  = -5. 

Hence  c\  = 1 and  c2  = 3.  This  gives  the  answer  y = ex  + 3e~2x.  Figure  30  shows  that  the  curve  begins  at 
y = 4 with  a negative  slope  (—5,  but  note  that  the  axes  have  different  scales!),  in  agreement  with  the  initial 
conditions. 


0 


0 


i 

0.5 


_L 

1 


I 

1.5 


2 


x 


Fig.  30.  Solution  in  Example  2 


Case  II.  Real  Double  Root  A = — a/2 

If  the  discriminant  a2  — 4 b is  zero,  we  see  directly  from  (4)  that  we  get  only  one  root, 
A = A!  = A2  = —a/2,  hence  only  one  solution, 


yi  = e 


— (a/2)x 


To  obtain  a second  independent  solution  y2  (needed  for  a basis),  we  use  the  method  of 
reduction  of  order  discussed  in  the  last  section,  setting  y2  = uy\-  Substituting  this  and  its 
derivatives  y'2  = u y\  + uy\  and  y2  into  (1),  we  first  have 


(u"y  i + 2u'y[  + uy'{)  + a(u'y x + uy[)  + buy1  = 0. 
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EXAMPLE  3 


EXAMPLE  4 


Collecting  terms  in  u”,  u,  and  u,  as  in  the  last  section,  we  obtain 

u"yi  + u'(2y'i  + ayi)  + u(y'[  4-  ay[  + by{)  = 0. 

The  expression  in  the  last  parentheses  is  zero,  since  vi  is  a solution  of  (1).  The  expression 
in  the  first  parentheses  is  zero,  too,  since 

o f —ax/2 

2yi  = —ae  = —ay  i. 

We  are  thus  left  with  u"y±  = 0.  Hence  u"  = 0.  By  two  integrations,  u = cjx  + c2.  To 
get  a second  independent  solution  y2  = uyi,  we  can  simply  choose  Ci  = 1,  c2  = 0 and 
take  u = x.  Then  y2  = xy\.  Since  these  solutions  are  not  proportional,  they  form  a basis. 
Hence  in  the  case  of  a double  root  of  (3)  a basis  of  solutions  of  (1)  on  any  interval  is 

—ax 1 2 —ax/2 

e , xe 

The  corresponding  general  solution  is 

(7)  y = (ci  + c2x)e-ax/2. 

WARNING!  If  A is  a simple  root  of  (4),  then  (ci  + C‘2X)ekx  with  c2  ^ 0 is  not  a solution 

of  (1). 

General  Solution  in  the  Case  of  a Double  Root 

The  characteristic  equation  of  the  ODE  y"  + 6 y + 9y  = 0 is  A2  + 6A  + 9 = (A  + 3)2  = 0.  It  has  the  double 
root  A = —3.  Hence  a basis  is  e '1'1  and  xe~3x.  The  corresponding  general  solution  is  y = (ci  + C2x)e~3x.  I 

Initial  Value  Problem  in  the  Case  of  a Double  Root 

Solve  the  initial  value  problem 

y"  + y + 0.25.y  = 0,  y(0)  = 3.0,  /(0)  = -3.5. 

Solution.  The  characteristic  equation  is  A2  + A + 0.25  = (A  + 0.5) 2 = 0.  It  has  the  double  root  A = —0.5. 
This  gives  the  general  solution 

y = (cx  + c2x)e~°-5x. 


We  need  its  derivative 

/ = c2e~05x  ~ 0.5(cj  + c2x)e~05x. 
From  this  and  the  initial  conditions  we  obtain 


y(0)  = C\  = 3.0,  /(0)  = c2  — 0.5cx  = 3.5;  hence  c2  = —2. 


The  particular  solution  of  the  initial  value  problem  is  y = (3  — 1x)e  °'5x.  See  Fig. 


31. 


12  14  * 


Fig.  31.  Solution  in  Example  4 
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EXAMPLE  5 


EXAMPLE  6 


Case  III.  Complex  Roots  — \o  + ico  and  — \a  — ico 

This  case  occurs  if  the  discriminant  a2  — 4b  of  the  characteristic  equation  (3)  is  negative. 
In  this  case,  the  roots  of  (3)  are  the  complex  A = — ± ico  that  give  the  complex  solutions 

of  the  ODE  (1).  However,  we  will  show  that  we  can  obtain  a basis  of  real  solutions 

(8)  yi  = e~ax/2  cos  cox,  y2  = e~ax/2  sin  cox  ( co  > 0) 

2 12 

where  co  = b — 4a  . It  can  be  verified  by  substitution  that  these  are  solutions  in  the 
present  case.  We  shall  derive  them  systematically  after  the  two  examples  by  using  the 
complex  exponential  function.  They  form  a basis  on  any  interval  since  their  quotient 
cot  cox  is  not  constant.  Hence  a real  general  solution  in  Case  III  is 

(9)  y = e~ax/2  ( A cos  cox  + B sin  cox)  (A,  B arbitrary). 

Complex  Roots.  Initial  Value  Problem 

Solve  the  initial  value  problem 

y"  + 0.4/  + 9.04 y = 0,  y(0)  = 0,  /( 0)  = 3. 

Solution.  Step  1.  General  solution.  The  characteristic  equation  is  A2  + 0.4A  + 9.04  = 0.  It  has  the  roots 
—0.2  ± 3 i.  Hence  (o  = 3,  and  a general  solution  (9)  is 

y = e~0m2x (A  cos  3 x + B sin  3x). 

Step  2.  Particular  solution.  The  first  initial  condition  gives  y(0)  = A = 0.  The  remaining  expression  is 
y = Be~  ' x sin  3x.  We  need  the  derivative  (chain  rule!) 

y'  = B(—0.2e~°'2x  sin  3x  + 3e~0  2x  cos  3x). 

From  this  and  the  second  initial  condition  we  obtain  y (0)  = 3B  = 3.  Hence  5=1.  Our  solution  is 

y = e sin  3x. 

Figure  32  shows  y and  the  curves  of  e~0  2x  and  — e~°'2x  (dashed),  between  which  the  curve  of  y oscillates. 
Such  “damped  vibrations”  (with  x = t being  time)  have  important  mechanical  and  electrical  applications,  as  we 
shall  soon  see  (in  Sec.  2.4). 


Complex  Roots 

A general  solution  of  the  ODE 

y"  + (o2y  = 0 (a)  constant,  not  zero) 


y = A cos  cjx  + B sin  cox. 
With  co  = 1 this  confirms  Example  4 in  Sec.  2.1. 
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Summary  of  Cases  l-lll 


Case 

Roots  of  (2) 

Basis  of  (1) 

General  Solution  of  (1) 

I 

Distinct  real 
4i,  A2 

g\xx  g\2X 

y = CleAlX  + C2eA2* 

II 

Real  double  root 
A = ~2a 

xg-ax/2 

y = (ci  + c2x)e~ax/2 

III 

Complex  conjugate 
Ai  = — \a  + ito, 
A2  = 2u  ia> 

£-ax/2  CQS  ^ 
g-ax/2  sjn  wx 

y = e~ax/2(A  cos  cox  + B sin  cox) 

It  is  very  interesting  that  in  applications  to  mechanical  systems  or  electrical  circuits, 
these  three  cases  correspond  to  three  different  forms  of  motion  or  flows  of  current, 
respectively.  We  shall  discuss  this  basic  relation  between  theory  and  practice  in  detail  in 
Sec.  2.4  (and  again  in  Sec.  2.8). 


Derivation  in  Case  III.  Complex  Exponential  Function 

If  verification  of  the  solutions  in  (8)  satisfies  you,  skip  the  systematic  derivation  of  these 
real  solutions  from  the  complex  solutions  by  means  of  the  complex  exponential  function 
ez  of  a complex  variable  z = r + it.  We  write  r + it,  not  x + iy  because  x and  y occur 
in  the  ODE.  The  definition  of  ez  in  terms  of  the  real  functions  er,  cos  t,  and  sin  t is 

(10)  ez  = er+lt  = e el  = er(cos  t + i sin  t ). 


This  is  motivated  as  follows.  For  real  z = r,  hence  t = 0,  cos  0 = 1,  sinO  = 0,  we  get 

the  real  exponential  function  e . It  can  be  shown  that  eZl+Zz  = ez'ez'2,  just  as  in  real.  (Proof 

in  Sec.  13.5.)  Finally,  if  we  use  the  Maclaurin  series  of  ez  with  z = it  as  well  as 
2 3 4 

i = — 1,  i = —i,i  =1,  etc.,  and  reorder  the  terms  as  shown  (this  is  permissible,  as 
can  be  proved),  we  obtain  the  series 


2! 


(i :tf  , (iff  , (if) 


3! 


+ 


4! 


+ 


5! 


+ ■ 


_ , t2  r4 

— 1 — + — + ' • 

2!  4! 


f3  r5 
~F  / [ r — — T — 
3!  5! 


= cos  t + i sin  t. 


(Look  up  these  real  series  in  your  calculus  book  if  necessary.)  We  see  that  we  have  obtained 
the  formula 


(ID 


it 


e 


= cos  t + i sin  t. 


called  the  Euler  formula.  Multiplication  by  er  gives  (10). 
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For  later  use  we  note  that  e lt  = cos  (—t)  + i sin  (— t)  = cos  t — i sin  t,  so  that  by 
addition  and  subtraction  of  this  and  (11), 

(12)  cos  t = \ ( elt  + e_lt),  sin  t = — ( elt  — e~lt). 

2 i 

After  these  comments  on  the  definition  (10),  let  us  now  turn  to  Case  III. 

2 2 

In  Case  III  the  radicand  a — 4b  in  (4)  is  negative.  Hence  4b  — a is  positive  and, 
using  V—  1 = ;,  we  obtain  in  (4) 

gV a2  — 4b  = \\/  ~ (4b  — a2)  = y/—(b  — 4a2)  = ;V b — \aZ  = ico 
with  co  defined  as  in  (8).  Hence  in  (4), 

Ai  = | a + ico  and,  similarly,  A2  = \a  — ioj. 

Using  (10)  with  r = - 2 ax  and  t = cox,  we  thus  obtain 

= e-(a/2)*  + iM*  = g-(a/2)x(cos  ^ + ■ sin  ^ 

eX2x  = e—(a/2)x  — io>x  = g-(a/2)x(cos  QJX  _ ; sjn  wx) 

We  now  add  these  two  lines  and  multiply  the  result  by  |.  This  gives  as  in  (8).  Then 
we  subtract  the  second  line  from  the  first  and  multiply  the  result  by  1/(2;').  This  gives  >’2 

as  in  (8).  These  results  obtained  by  addition  and  multiplication  by  constants  are  again 

solutions,  as  follows  from  the  superposition  principle  in  Sec.  2.1.  This  concludes  the 
derivation  of  these  real  solutions  in  Case  III. 


P-R-QBL=EM==S=ET— 


1-15 


GENERAL  SOLUTION 


Find  a general  solution.  Check  your  answer  by  substitution. 
ODEs  of  this  kind  have  important  applications  to  be 
discussed  in  Secs.  2.4,  2.7,  and  2.9. 

1.  4y"  - 25y  = 0 

2.  y"  + 36y  = 0 

3.  y"  + by'  + 8.96y  = 0 

4.  y"  + 4y  + (7 r2  + 4)y  = 0 

5.  y"  + 2TTy  + 7 r2y  = 0 

6.  lOy"  - 32/  + 25.6y  = 0 

7.  y"  + 4.5y'  = 0 

8.  y"  + y'  + 3.25y  = 0 

9.  y"  + 1.8y'  - 2.08y  = 0 

10.  lOOv"  + 240/  + (19677-2  + 144)v  = 0 

11.  4y"  - 4/  - 3y  = 0 

12.  y"  + 9y'  + 20y  = 0 

13.  9/  - 30/  + 25 y = 0 


14.  y"  + 2 k2y'  + lc\  = 0 

15.  y"  + 0.54v'  + (0.0729  + 7r)y  = 0 


16-20 


FIND  AN  ODE 


y"  + ay'  + by  = 0 for  the  given  basis. 


16.  e2'6x,  e~4-3x  17.  e~V5x, 

18.  cos  27r.r,  sin  2ttx  19.  e(~2+i>x,  e(~2~^x 

20.  e_3'lxcos  2. lx,  e~3  lx  sin  2.1a 


21-30 


INITIAL  VALUES  PROBLEMS 


Solve  the  IVP.  Check  that  your  answer  satisfies  the  ODE  as 
well  as  the  initial  conditions.  Show  the  details  of  your  work. 

21.  y"  + 25 y = 0,  y(0)  = 4.6,  /(0)  = -1.2 

22.  The  ODE  in  Prob.  4,  y(|)  =1,  /(g)  = -2 

23.  y"  + y - 6y  = 0,  y(0)  = 10,  /(0)  = 0 

24.  4y"  — 4/  — 3y  = 0,  v( — 2)  = e,  y,( — 2)  = —e/2 

25.  y"  - y = 0,  y(0)  = 2,  /(O)  = -2 

26.  y"  - k2y  = 0 (k  * 0),  y(0)  =1,  y'(0)  = 1 
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27.  The  ODE  in  Prob.  5, 

y(0)  = 4.5,  y'( 0)  = -4.5t t - 1 = 13.137 

28.  8/  - 2 y - y = 0,  >>(0)  = -0.2,  y'(0)  = -0.325 

29.  The  ODE  in  Prob.  15,  v(0)  = 0,  y'(0)  = 1 

30.  9y"  - 30v'  + 25v  = 0,  y(0)  = 3.3,  y'(0)  = 10.0 

LINEAR  INDEPENDENCE  is  of  basic  impor- 
tance, in  this  chapter,  in  connection  with  general  solutions, 
as  explained  in  the  text.  Are  the  following  functions  linearly 
independent  on  the  given  interval?  Show  the  details  of  your 
work. 

31.  ekx,  xekx,  any  interval 

32.  e0*^"0*,  x>0 

33.  x2,  x2  In  x,  x > 1 

34.  In  x.  In  (x3),  x > 1 

35.  sin  2x,  cos  x sin  x,  x < 0 

36.  e~x  cos  |x,  0,  — 1 S x = 1 

37.  Instability.  Solve  y"  — y = 0 for  the  initial  conditions 
y(0)  = l,y(0)  = — 1 . Then  change  the  initial  conditions 
toy(0)  = 1.001,  y^O)  = —0.999  and  explain  why  this 
small  change  of  0.001  at  t = 0 causes  a large  change  later. 


e.g.,  22  at  t = 10.  This  is  instability:  a small  initial 
difference  in  setting  a quantity  (a  current,  for  in- 
stance) becomes  larger  and  larger  with  time  t.  This  is 
undesirable. 

38.  TEAM  PROJECT.  General  Properties  of  Solutions 

(a)  Coefficient  formulas.  Show  how  a and  b in  (1) 
can  be  expressed  in  terms  of  Ai  and  A2.  Explain  how 
these  formulas  can  be  used  in  constructing  equations 
for  given  bases. 

(b)  Root  zero.  Solve  y"  + Ay'  = 0 (i)  by  the  present 
method,  and  (ii)  by  reduction  to  first  order.  Can  you 
explain  why  the  result  must  be  the  same  in  both 
cases?  Can  you  do  the  same  for  a general  ODE 
y + ay  = 0? 

(c)  Double  root.  Verify  directly  that  xeKx  with  A = 
—a/2  is  a solution  of  (1)  in  the  case  of  a double  root. 
Verify  and  explain  why  y = e~2'1  is  a solution  of 
y"  — y — 6y  = 0 but  xe~2x  is  not. 

(d)  Limits.  Double  roots  should  be  limiting  cases  of 
distinct  roots  Al5  A2  as,  say,  A2  —*  Aj.  Experiment  with 
this  idea.  (Remember  l’Hopital’s  rule  from  calculus.) 
Can  you  arrive  at  xeAlX?  Give  it  a try. 


Differential  Operators.  Optional 

This  short  section  can  be  omitted  without  interrupting  the  flow  of  ideas.  It  will  not  be 
used  subsequently,  except  for  the  notations  Dy,D2y,  etc.  to  stand  for  3/,  y" , etc. 

Operational  calculus  means  the  technique  and  application  of  operators.  Here,  an 
operator  is  a transformation  that  transforms  a function  into  another  function.  Hence 
differential  calculus  involves  an  operator,  the  differential  operator  D,  which 
transforms  a (differentiable)  function  into  its  derivative.  In  operator  notation  we  write 
D = ic  and 

/ dy 

(1)  Dy  = y = ,. 

ax 


Similarly,  for  the  higher  derivatives  we  write  D2y  = D(Dy)  = y",  and  so  on.  For  example, 
D sin  = cos,  D2  sin  = —sin,  etc. 

For  a homogeneous  linear  ODE  y + ay  + by  = 0 with  constant  coefficients  we  can 
now  introduce  the  second-order  differential  operator 

L = P(D)  = D2  + aD  + bl. 


where  I is  the  identity  operator  defined  by  Iy  = y.  Then  we  can  write  that  ODE  as 
(2)  Ly  = P(D)y  = (D2  + aD  + bl)y  = 0. 
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P suggests  “polynomial.”  L is  a linear  operator.  By  definition  this  means  that  if  Ly  and 
Lw  exist  (this  is  the  case  if  y and  w are  twice  differentiable),  then  L(cy  + kw)  exists  for 
any  constants  c and  k,  and 


L(cy  + kw)  = cLy  + kLw. 

Let  us  show  that  from  (2)  we  reach  agreement  with  the  results  in  Sec.  2.2.  Since 
( DeA)(x ) = XeAx  and  (D2eA)(x)  = X2eAx,  we  obtain 

Le\x)  = P(D)e\x)  = ( D 2 + aD  + bl)e\x) 

(3) 

= (A2  + aX  + b)e**  = P(  X)eAx  = 0. 

This  confirms  our  result  of  Sec.  2.2  that  eAx  is  a solution  of  the  ODE  (2)  if  and  only  if  X 
is  a solution  of  the  characteristic  equation  P(X)  = 0. 

P( A)  is  a polynomial  in  the  usual  sense  of  algebra.  If  we  replace  A by  the  operator  D, 
we  obtain  the  “operator  polynomial”  P(D).  The  point  of  this  operational  calculus  is  that 
P(D)  can  be  treated  just  like  an  algebraic  quantity.  In  particular,  we  can  factor  it. 

EXAMPLF  Factorization,  Solution  of  an  ODE 

Factor  P(D)  = D2  — 3D  — 40/  and  solve  P(D)y  = 0. 

Solution.  D2  - 3D  - 40 1 = (D  - 8 1){D  + 51)  because  I2  = I.  Now  (D  - 8 1)y  = y - 8y  = 0 has  the 
solution  yi  = e8x.  Similarly,  the  solution  of  (D  + 5 1)y  = 0 is  >’2  = e~5v.  This  is  a basis  of  P(D)y  = 0 on  any 
interval.  From  the  factorization  we  obtain  the  ODE,  as  expected, 

(D  - 8 1)(D  + 5 1)y  = (D  — 8 1)(y'  + 5y)  = D( y + 5y)  - 8(y'  + 5 y) 

= y"  + 5y  - 8y'  - 40y  = y"  - 3'  - 40 y = 0. 

Verify  that  this  agrees  with  the  result  of  our  method  in  Sec.  2.2.  This  is  not  unexpected  because  we  factored 
P(D)  in  the  same  way  as  the  characteristic  polynomial  P( A)  = A2  — 3A  — 40. 


It  was  essential  that  L in  (2)  had  constant  coefficients.  Extension  of  operator  methods  to 
variable-coefficient  ODEs  is  more  difficult  and  will  not  be  considered  here. 

If  operational  methods  were  limited  to  the  simple  situations  illustrated  in  this  section, 
it  would  perhaps  not  be  worth  mentioning.  Actually,  the  power  of  the  operator  approach 
appears  in  more  complicated  engineering  problems,  as  we  shall  see  in  Chap.  6. 


PROBLEM  SET2. 1 


1-5 


APPLICATION  OF  DIFFERENTIAL 
OPERATORS 


Apply  the  given  operator  to  the  given  functions.  Show  all 
steps  in  detail. 

1.  D2  + 2D;  cosh  2x,  e~x  + e2x,  cos  x 

2.  D — 3/;  3a2  + 3x,  3e3x,  cos  4x  — sin  4x 

3.  (D-2/)2;  eZx,  xe2x,  e~2x 

4.  (D  + 6/)2;  6x  + sin  6x,  xe~6x 

5.  (D  - 2/)(D  + 3/);  e2x,  xe2x,  £>-3x 


6-12 


GENERAL  SOLUTION 


Factor  as  in  the  text  and  solve. 


6. 

(D2 

+ 

4.00D 

+ 

ON 

II 

0 

7. 

(4D: 

2 _ 

-iyy  = 

0 

8. 

(D2 

+ 

3/)y  = 

0 

9. 

(D2 

- 

4.20D 

+ 

-1^ 

II 

0 

10. 

(D2 

+ 

§ 

oo 

'st- 

+ 

5.76/)y  = 

0 

11. 

(D2 

- 

4.00D 

+ 

3.84/)y  = 

0 

12. 

(D2 

+ 

3.0D  + 2 

:.5 1)y  = 0 
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13.  Linear  operator.  Illustrate  the  linearity  of  L in  (2)  by 
taking  c = 4,  k = — 6,  y = e2x,  and  w = cos  2x. 
Prove  that  L is  linear. 

14.  Double  root.  If  D2  + ciD  + bl  has  distinct  roots 
/x  and  A,  show  that  a particular  solution  is 
y = {e^  — eXx)/(p,  — A).  Obtain  from  this  a solution 
xeKx  by  letting  /a  — » A and  applying  l’Hopital’s  rule. 


15.  Definition  of  linearity.  Show  that  the  definition  of 
linearity  in  the  text  is  equivalent  to  the  following.  If 
L[y\  and  L[w]  exist,  then  L[y  + w]  exists  and  L[cy\ 
and  L[kw ] exist  for  all  constants  c and  k,  and 
L[y  + w]  = L[y ] + L[w]  as  well  as  L[cy}  = cL[y\ 
and  L[kw]  = kL\w\. 


2 A Modeling  of  Free  Oscillations 
of  a Mass-Spring  System 

Linear  ODEs  with  constant  coefficients  have  important  applications  in  mechanics,  as  we 
show  in  this  section  as  well  as  in  Sec.  2.8,  and  in  electrical  circuits  as  we  show  in  Sec.  2.9. 
In  this  section  we  model  and  solve  a basic  mechanical  system  consisting  of  a mass  on  an 
elastic  spring  (a  so-called  “mass-spring  system,”  Fig.  33),  which  moves  up  and  down. 

Setting  Up  the  Model 

We  take  an  ordinary  coil  spring  that  resists  extension  as  well  as  compression.  We  suspend 
it  vertically  from  a fixed  support  and  attach  a body  at  its  lower  end,  for  instance,  an  iron 
ball,  as  shown  in  Fig.  33.  We  let  y = 0 denote  the  position  of  the  ball  when  the  system 
is  at  rest  (Fig.  33b).  Furthermore,  we  choose  the  downward  direction  as  positive,  thus 
regarding  downward  forces  as  positive  and  upward  forces  as  negative. 


System  in 
motion 


(a)  (b)  (c) 

Fig.  33.  Mechanical  mass-spring  system 

We  now  let  the  ball  move,  as  follows.  We  pull  it  down  by  an  amount  y > 0 (Fig.  33c). 
This  causes  a spring  force 

(1)  Fy  = -ky  (Hooke’s  law2) 


proportional  to  the  stretch  y,  with  k ( > 0)  called  the  spring  constant.  The  minus  sign 
indicates  that  Fi  points  upward,  against  the  displacement.  It  is  a restoring  force:  It  wants 
to  restore  the  system,  that  is,  to  pull  it  back  to  y = 0.  Stiff  springs  have  large  k. 


zROBERT  HOOKE  (1635-1703),  English  physicist,  a forerunner  of  Newton  with  respect  to  the  law  of 
gravitation. 
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Note  that  an  additional  force  — F0  is  present  in  the  spring,  caused  by  stretching  it  in 
fastening  the  ball,  but  F0  has  no  effect  on  the  motion  because  it  is  in  equilibrium  with 
the  weight  W of  the  ball,  — F0  = W = mg,  where  g = 980  cm/sec2  = 9.8  m/sec2  = 
32.17  ft/sec2  is  the  constant  of  gravity  at  the  Earth’s  surface  (not  to  be  confused  with 
the  universal  gravitational  constant  G = gRZ/M  = 6.67  • 10-11  nt  m2/kg2,  which  we 
shall  not  need;  here  R = 6.37  • 106  m and  M = 5.98  • 1024  kg  are  the  Earth’s  radius  and 
mass,  respectively). 

The  motion  of  our  mass-spring  system  is  determined  by  Newton’s  second  law 

(2)  Mass  X Acceleration  = my"  = Force 


where  y"  = d2y/dtz  and  “Force”  is  the  resultant  of  all  the  forces  acting  on  the  ball.  (For 
systems  of  units,  see  the  inside  of  the  front  cover.) 

ODE  of  the  Undamped  System 

Every  system  has  damping.  Otherwise  it  would  keep  moving  forever.  But  if  the  damping 
is  small  and  the  motion  of  the  system  is  considered  over  a relatively  short  time,  we 
may  disregard  damping.  Then  Newton’s  law  with  F = — F\  gives  the  model 
my  = —Ei  = — ky\  thus 

(3)  my"  + ky  = 0. 

This  is  a homogeneous  linear  ODE  with  constant  coefficients.  A general  solution  is 
obtained  as  in  Sec.  2.2,  namely  (see  Example  6 in  Sec.  2.2) 


(4)  y(t)  = A cos  co0t  + B sin  co0t  <u0  = 

This  motion  is  called  a harmonic  oscillation  (Fig.  34).  Its  frequency  is/  = w0/2tt  Hertz3 
(=  cycles/sec)  because  cos  and  sin  in  (4)  have  the  period  2tt/u>0.  The  frequency/is  called 
the  natural  frequency  of  the  system.  (We  write  a>0  to  reserve  co  for  Sec.  2.8.) 


(T)  Positive  1 

(2)  Zero  Initial  velocity 

(3)  Negative  J 


Fig.  34.  Typical  harmonic  oscillations  (4)  and  (4*)  with  the  same  y(0)  = A and 
different  initial  velocities  y'( 0)  = w0B,  positive  (l),  zero  (2),  negative  (?) 


3HEINRICH  HERTZ  (1857-1894),  German  physicist,  who  discovered  electromagnetic  waves,  as  the  basis 

of  wireless  communication  developed  by  GUGLIELMO  MARCONI  (1874-1937),  Italian  physicist  (Nobel  prize 

in  1909). 
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An  alternative  representation  of  (4),  which  shows  the  physical  characteristics  of  amplitude 
and  phase  shift  of  (4),  is 


(4*)  y(t)  = C cos  (co0t  — 8) 

with  C = V A2  + B2  and  phase  angle  8,  where  tan  8 = B/A.  This  follows  from  the 
addition  formula  (6)  in  App.  3.1. 


EXAMPLE  1 Harmonic  Oscillation  of  an  Undamped  Mass-Spring  System 

If  a mass-spring  system  with  an  iron  ball  of  weight  W = 98  nt  (about  22  lb)  can  be  regarded  as  undamped,  and 
the  spring  is  such  that  the  ball  stretches  it  1 .09  m (about  43  in.),  how  many  cycles  per  minute  will  the  system 
execute?  What  will  its  motion  be  if  we  pull  the  ball  down  from  rest  by  16  cm  (about  6 in.)  and  let  it  start  with 
zero  initial  velocity? 

Solution.  Hooke’s  law  (1)  with  W as  the  force  and  1.09  meter  as  the  stretch  gives  W = 1.09 k\  thus 
k = W/ 1.09  = 98/1.09  = 90  [kg/sec2]  = 90  [nt/ meter].  The  mass  is  m = W/g  = 98/9.8  = 10  [kg].  This 
gives  the  frequency  (jb0/(2tt)  = \Zk/m/{2TT ) = 3/(277)  = 0.48  [Hz]  = 29  [cycles/ min]. 

From  (4)  and  the  initial  conditions,  y(0)  = A = 0.16  [meter]  andy  (0)  = coqB  = 0.  Hence  the  motion  is 

y(t)  = 0.16  cos  3 1 [meter]  or  0.52  cos  3 1 [ft]  (Fig.  35). 

If  you  have  a chance  of  experimenting  with  a mass-spring  system,  don’t  miss  it.  You  will  be  surprised  about 
the  good  agreement  between  theory  and  experiment,  usually  within  a fraction  of  one  percent  if  you  measure 
carefully. 


Fig.  35.  Harmonic  oscillation  in  Example  1 


Fig.  36. 

Damped  system 


ODE  of  the  Damped  System 

To  our  model  my"  = —ky  we  now  add  a damping  force 

F2  = -cy', 

obtaining  my"  = —Icy  — cy' ; thus  the  ODE  of  the  damped  mass-spring  system  is 
(5)  my"  + cy'  + ky  = 0.  (Fig.  36) 

Physically  this  can  be  done  by  connecting  the  ball  to  a dashpot;  see  Fig.  36.  We  assume 
this  damping  force  to  be  proportional  to  the  velocity  y = dy/dt.  This  is  generally  a good 
approximation  for  small  velocities. 
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The  constant  c is  called  the  damping  constant.  Let  us  show  that  c is  positive.  Indeed, 
the  damping  force  F2  = —cy  acts  against  the  motion;  hence  for  a downward  motion  we 
have  y > 0 which  for  positive  c makes  F negative  (an  upward  force),  as  it  should  be. 
Similarly,  for  an  upward  motion  we  have  y < 0 which,  for  c > 0 makes  F2  positive  (a 
downward  force). 

The  ODE  (5)  is  homogeneous  linear  and  has  constant  coefficients.  Hence  we  can  solve 
it  by  the  method  in  Sec.  2.2.  The  characteristic  equation  is  (divide  (5)  by  m) 


A!  + il  + t = G. 

m m 


By  the  usual  formula  for  the  roots  of  a quadratic  equation  we  obtain,  as  in  Sec.  2.2, 

c 1 

(6)  Aj  = —a  + f3,  A2  = —a  — (3 , where  a = — and  /3  = — V c2  — 4mk. 

2m  2m 

It  is  now  interesting  that  depending  on  the  amount  of  damping  present — whether  a lot  of 
damping,  a medium  amount  of  damping  or  little  damping — three  types  of  motions  occur, 
respectively: 

Case  I.  c2  > 4 ink.  Distinct  real  roots  A|,  A2 

Case  II.  c2  = 4 mk.  A real  double  root. 

Caselll.  c2  < 4mk.  Complex  conjugate  roots. 

They  correspond  to  the  three  Cases  I,  II,  III  in  Sec.  2.2. 

Discussion  of  the  Three  Cases 

Case  I.  Overdamping 

2 

If  the  damping  constant  c is  so  large  that  c > 4mk,  then  Ai  and  A2  are  distinct  real  roots. 
In  this  case  the  corresponding  general  solution  of  (5)  is 

(7)  y{t)  = Cle-(“-/w  + c2e_(“+/3)t. 


(Overdamping) 
(Critical  damping) 
(Underdamping) 


We  see  that  in  this  case,  damping  takes  out  energy  so  quickly  that  the  body  does  not 
oscillate.  For  t > 0 both  exponents  in  (7)  are  negative  because  a > 0,  (3  > 0,  and 
[3  = a — k/m  < a . Hence  both  terms  in  (7)  approach  zero  as  Practically 

speaking,  after  a sufficiently  long  time  the  mass  will  be  at  rest  at  the  static  equilibrium 
position  ( y = 0).  Figure  37  shows  (7)  for  some  typical  initial  conditions. 
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(2)  Zero  [ Initial  velocity 

(3)  Negative  J 

Fig.  37.  Typical  motions  (7)  in  the  overdamped  case 

(a)  Positive  initial  displacement 

(b)  Negative  initial  displacement 

Case  II.  Critical  Damping 

Critical  damping  is  the  border  case  between  nonoscillatory  motions  (Case  I)  and  oscillations 
(Case  III).  It  occurs  if  the  characteristic  equation  has  a double  root,  that  is,  if  cz  = 4mk, 
so  that  (3  = 0,  Ai  = A2  = —a.  Then  the  corresponding  general  solution  of  (5)  is 


(8) 


y(t)  = (ci  + c2t)e  at. 


This  solution  can  pass  through  the  equilibrium  position  y = 0 at  most  once  because  e~at 
is  never  zero  and  ci  + c2t  can  have  at  most  one  positive  zero.  If  both  c\  and  c2  are  positive 
(or  both  negative),  it  has  no  positive  zero,  so  that  y does  not  pass  through  0 at  all.  Figure  38 
shows  typical  forms  of  (8).  Note  that  they  look  almost  like  those  in  the  previous  figure. 


Fig.  38.  Critical  damping  [see  (8)] 
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EXAMPLE  2 


Case  III.  Underdamping 

This  is  the  most  interesting  case.  It  occurs  if  the  damping  constant  c is  so  small  that 
c2  < 4mA:.  Then  (i  in  (6)  is  no  longer  real  but  pure  imaginary,  say, 


i / i Z 

(9)  /3  = ico*  where  co*  = — V 4mk  - c2  = ,/ / 2 (>0). 

2m  V m 4 m 

(We  now  write  co*  to  reserve  co  for  driving  and  electromotive  forces  in  Secs.  2.8  and  2.9.) 
The  roots  of  the  characteristic  equation  are  now  complex  conjugates, 

Ai  = —a  + ico*,  A2  = —a  — ico* 

with  a = c/(2m ),  as  given  in  (6).  Hence  the  corresponding  general  solution  is 

(10)  y(t)  = e~at(A  cos  co*t  + B sin  co*t)  = Ce~at  cos  ( co*t  — 8) 

where  C2  = A2  + B2  and  tan  8 = B/A,  as  in  (4*). 

This  represents  damped  oscillations.  Their  curve  lies  between  the  dashed  curves 
y = Ce~nl  and  y = —Ce~"1  in  Tig.  39,  touching  them  when  of':t  — 8 is  an  integer  multiple 
of  77  because  these  are  the  points  at  which  cos  (co*t  — 8)  equals  1 or  —1. 

The  frequency  is  co*/(2tt)  Hz  (hertz,  cycles/sec).  From  (9)  we  see  that  the  smaller 
c (>0)  is,  the  larger  is  co*  and  the  more  rapid  the  oscillations  become.  If  c approaches  0, 
then  co*  approaches  co0  = \/k/m , giving  the  harmonic  oscillation  (4),  whose  frequency 
co0/(2tt)  is  the  natural  frequency  of  the  system. 


Fig.  39.  Damped  oscillation  in  Case  III  [see  (10)] 


The  Three  Cases  of  Damped  Motion 

How  does  the  motion  in  Example  1 change  if  we  change  the  damping  constant  c from  one  to  another  of  the 
following  three  values,  with  y(O)  = 0.16  andy  (0)  = 0 as  before? 

(I)  c = lOOkg/sec,  (II)  c = 60kg/sec,  (III)  c = lOkg/sec. 

Solution.  It  is  interesting  to  see  how  the  behavior  of  the  system  changes  due  to  the  effect  of  the  damping, 
which  takes  energy  from  the  system,  so  that  the  oscillations  decrease  in  amplitude  (Case  III)  or  even  disappear 
(Cases  II  and  I). 

(I)  With  m = 10  and  k = 90,  as  in  Example  1,  the  model  is  the  initial  value  problem 
lOy"  + 100/  + 90y  = 0,  y(0)  = 0.16  [meter],  /(0)  = 0. 
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The  characteristic  equation  is  10 A2  + 100A  + 90  = 10(A  + 9)(A  + 1)  = 0.  It  has  the  roots  —9  and  — 1.  This 
gives  the  general  solution 


y = C]£  9t  + c^e  t.  We  also  need  y = —9c\e  9t  — c^e 

The  initial  conditions  give  c\  + = 0.16,  — 9ci  — = 0.  The  solution  is  Ci  = —0.02,  C2  — 0.18.  Hence  in 

the  overdamped  case  the  solution  is 


y = —0.02e~9t  + 0.18e_t. 

It  approaches  0 as  The  approach  is  rapid;  after  a few  seconds  the  solution  is  practically  0,  that  is,  the 

iron  ball  is  at  rest. 

(II)  The  model  is  as  before,  with  c = 60  instead  of  100.  The  characteristic  equation  now  has  the  form 
10A2  + 60A  + 90  = 10(A  + 3)  =0.  It  has  the  double  root  —3.  Hence  the  corresponding  general  solution  is 

y = (ci  + C2t)e~3t.  We  also  need  y'  = ( c?2  — 3c?i  — 3c2t)e~3t. 

The  initial  conditions  give  y(0)  = c\  = 0.16,  y (0)  = C2  — 3<?i  = 0,  = 0.48.  Hence  in  the  critical  case  the 

solution  is 

y = (0.16  + 0.48f)e_3t. 

It  is  always  positive  and  decreases  to  0 in  a monotone  fashion. 

(HI)  The  model  now  is  lOy”  + 10yr  + 90y  = 0.  Since  c = 10  is  smaller  than  the  critical  c,  we  shall  get 
oscillations.  The  characteristic  equation  is  10A2  + 10A  + 90  = 10[(A  + ^)2  + 9 — J]  = 0.  It  has  the  complex 
roots  [see  (4)  in  Sec.  2.2  with  <2=1  and  b = 9] 

A = -0.5  ± Vo.52  - 9 = -0.5  ± 2.96/'. 


This  gives  the  general  solution 


y = e °'5t(A  cos  2.96r  + B sin  2.9 6t). 

Thus  y(0)  = A = 0.16.  We  also  need  the  derivative 

y = <T°-5t(-0.5A  cos  2.96 1 - 0.5 B sin  2.96 1 - 2.96 A sin  2.96/  + 2.96 B cos  2.96/). 

Hence  y;(0)  = — 0.5A  + 2.96 B = 0,  B = 0.5A/2.96  = 0.027.  This  gives  the  solution 

y = e- o.5t(0  16  cos  2 96t  + Q Q27  sin  2.96/)  = 0.162e“°-5t  cos  (2.96/  - 0.17). 

We  see  that  these  damped  oscillations  have  a smaller  frequency  than  the  harmonic  oscillations  in  Example  1 by 
about  1%  (since  2.96  is  smaller  than  3.00  by  about  1%).  Their  amplitude  goes  to  zero.  See  Fig.  40. 


This  section  concerned  free  motions  of  mass-spring  systems.  Their  models  are  homo- 
geneous linear  ODEs.  Nonhomo geneous  linear  ODEs  will  arise  as  models  of  forced 
motions,  that  is,  motions  under  the  influence  of  a “driving  force.”  We  shall  study  them 
in  Sec.  2.8,  after  we  have  learned  how  to  solve  those  ODEs. 
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P R O B L E M 5 E 


1-10 


HARMONIC  OSCILLATIONS 
(UNDAMPED  MOTION) 


1.  Initial  value  problem.  Find  the  harmonic  motion  (4) 
that  starts  from  yo  with  initial  velocity  Vo ■ Graph  or 
sketch  the  solutions  for  a>0  = tt,  y0  = 1,  and  various 
Vo  of  your  choice  on  common  axes.  At  what  r-values 
do  all  these  curves  intersect?  Why? 

2.  Frequency.  If  a weight  of  20  nt  (about  4.5  lb)  stretches 
a certain  spring  by  2 cm,  what  will  the  frequency  of  the 
corresponding  harmonic  oscillation  be?  The  period? 

3.  Frequency.  How  does  the  frequency  of  the  harmonic 
oscillation  change  if  we  (i)  double  the  mass,  (ii)  take 
a spring  of  twice  the  modulus?  First  find  qualitative 
answers  by  physics,  then  look  at  formulas. 

4.  Initial  velocity.  Could  you  make  a harmonic  oscillation 
move  faster  by  giving  the  body  a greater  initial  push? 

5.  Springs  in  parallel.  What  are  the  frequencies  of 
vibration  of  a body  of  mass  m — 5 kg  (i)  on  a spring 
of  modulus  k1  = 20  nt/m,  (ii)  on  a spring  of  modulus 
k2  = 45  nt/m,  (iii)  on  the  two  springs  in  parallel?  See 
Fig.  41. 


Fig.  41.  Parallel  springs  (Problem  5) 

6.  Spring  in  series.  If  a body  hangs  on  a spring  i]  of 
modulus  k1  = 8,  which  in  turn  hangs  on  a spring  s2 
of  modulus  k2  = 12,  what  is  the  modulus  k of  this 
combination  of  springs? 

7.  Pendulum.  Find  the  frequency  of  oscillation  of  a 
pendulum  of  length  L (Fig.  42),  neglecting  air 
resistance  and  the  weight  of  the  rod,  and  assuming  6 
to  be  so  small  that  sin  6 practically  equals  6. 


The  cylindrical  buoy  of  diameter  60  cm  in  Fig.  43  is 
floating  in  water  with  its  axis  vertical.  When  depressed 
downward  in  the  water  and  released,  it  vibrates  with 
period  2 sec.  What  is  its  weight? 


Fig.  43.  Buoy  (Problem  8) 


9.  Vibration  of  water  in  a tube.  If  1 liter  of  water  (about 
1.06  US  quart)  is  vibrating  up  and  down  under  the 
influence  of  gravitation  in  a U-shaped  tube  of  diameter 
2 cm  (Fig.  44),  what  is  the  frequency?  Neglect  friction. 
First  guess. 


Fig.  44.  Tube  (Problem  9) 

10.  TEAM  PROJECT.  Harmonic  Motions  of  Similar 
Models.  The  unifying  power  of  mathematical  meth- 
ods results  to  a large  extent  from  the  fact  that  different 
physical  (or  other)  systems  may  have  the  same  or  very 
similar  models.  Illustrate  this  for  the  following  three 
systems 

(a)  Pendulum  clock.  A clock  has  a 1 -meter  pendulum. 
The  clock  ticks  once  for  each  time  the  pendulum 
completes  a full  swing,  returning  to  its  original  position. 
How  many  times  a minute  does  the  clock  tick? 

(b)  Flat  spring  (Fig.  45).  The  harmonic  oscillations 
of  a flat  spring  with  a body  attached  at  one  end  and 
horizontally  clamped  at  the  other  are  also  governed  by 
(3).  Find  its  motions,  assuming  that  the  body  weighs 
8 nt  (about  1 .8  lb),  the  system  has  its  static  equilibrium 
1 cm  below  the  horizontal  line,  and  we  let  it  start  from 
this  position  with  initial  velocity  10  cm/sec. 


Fig.  42.  Pendulum  (Problem  7) 

8.  Archimedian  principle.  This  principle  states  that  the 
buoyancy  force  equals  the  weight  of  the  water 
displaced  by  the  body  (partly  or  totally  submerged). 


y y 

Fig.  45.  Flat  spring 


70 


CHAP.  2 Second-Order  Linear  ODEs 


(c)  Torsional  vibrations  (Fig.  46).  Undamped 
torsional  vibrations  (rotations  back  and  forth)  of  a 
wheel  attached  to  an  elastic  thin  rod  or  wire  are 
governed  by  the  equation  Io0"  + K6  = 0,  where  8 
is  the  angle  measured  from  the  state  of  equilibrium. 
Solve  this  equation  for  K/Iq  = 13.69  sec-2,  initial 
angle  30°(=  0.5235  rad)  and  initial  angular  velocity 
20°  sec-1  (=  0.349  rad  • sec-1). 


F'g-  46.  Torsional  vibrations 


DAMPED  MOTION 

11.  Overdamping.  Show  that  for  (7)  to  satisfy  initial  condi- 
tions y(0)  = yo  and  u(0)  = u 0 we  must  have  C\  = 
[(1  + a/P)y o + Vo/P\/ 2 and  c2  = [(1  - a/P)y0  - 
Vo/fl/2. 

12.  Overdamping.  Show  that  in  the  overdamped  case,  the 
body  can  pass  through  y = 0 at  most  once  (Fig.  37). 

13.  Initial  value  problem.  Find  the  critical  motion  (8) 
that  starts  from  Vo  with  initial  velocity  Vq.  Graph 
solution  curves  for  a = 1,  y0  = 1 and  several  u0  such 
that  (i)  the  curve  does  not  intersect  the  f-axis,  (ii)  it 
intersects  it  at  t = 1,  2,  . . . , 5,  respectively. 

14.  Shock  absorber.  What  is  the  smallest  value  of  the 
damping  constant  of  a shock  absorber  in  the  suspen- 
sion of  a wheel  of  a car  (consisting  of  a spring  and  an 
absorber)  that  will  provide  (theoretically)  an  oscillation- 
free  ride  if  the  mass  of  the  car  is  2000  kg  and  the  spring 
constant  equals  4500  kg/ sec2? 

15.  Frequency.  Find  an  approximation  formula  for  w*  in 
terms  of  &>0  by  applying  the  binomial  theorem  in  (9) 
and  retaining  only  the  first  two  terms.  How  good  is  the 
approximation  in  Example  2,  III? 

16.  Maxima.  Show  that  the  maxima  of  an  underdamped 
motion  occur  at  equidistant  t-values  and  find  the 
distance. 

17.  Underdamping.  Determine  the  values  of  t corre- 
sponding to  the  maxima  and  minima  of  the  oscillation 
y(t)  = e-t  sin  t.  Check  your  result  by  graphing  y(f). 

18.  Logarithmic  decrement.  Show  that  the  ratio  of 
two  consecutive  maximum  amplitudes  of  a damped 
oscillation  (10)  is  constant,  and  the  natural  logarithm 
of  this  ratio  called  the  logarithmic  decrement, 


equals  A = hra/w*.  Find  A for  the  solutions  of 
y"  + 2y'  + 5y  = 0. 

19.  Damping  constant.  Consider  an  underdamped  motion 
of  a body  of  mass  m = 0.5  kg.  If  the  time  between  two 
consecutive  maxima  is  3 sec  and  the  maximum 
amplitude  decreases  to  2 its  initial  value  after  10  cycles, 
what  is  the  damping  constant  of  the  system? 

20.  CAS  PROJECT.  Transition  Between  Cases  I,  II, 
III.  Study  this  transition  in  terms  of  graphs  of  typical 
solutions.  (Cf.  Fig.  47.) 

(a)  Avoiding  unnecessary  generality  is  part  of  good 
modeling.  Show  that  the  initial  value  problems  (A) 
and  (B), 

(A)  y"  + cy  + y = 0,  y(0)  = 1,  v'(0)  = 0 

(B)  the  same  with  different  c and  y^O)  = —2  (instead 
of  0),  will  give  practically  as  much  information  as  a 
problem  with  other  m,  k,  y(0),  yr(0). 

(b)  Consider  (A).  Choose  suitable  values  of  c, 
perhaps  better  ones  than  in  Fig.  47,  for  the  transition 
from  Case  III  to  II  and  I.  Guess  c for  the  curves  in  the 
figure. 

(c)  Time  to  go  to  rest.  Theoretically,  this  time  is 
infinite  (why?).  Practically,  the  system  is  at  rest  when 
its  motion  has  become  very  small,  say,  less  than  0.1% 
of  the  initial  displacement  (this  choice  being  up  to  us), 
that  is  in  our  case, 

(11)  y(f)  < 0.001  for  all  t greater  than  some  t\. 

In  engineering  constructions,  damping  can  often  be 
varied  without  too  much  trouble.  Experimenting  with 
your  graphs,  find  empirically  a relation  between  t\ 
and  c. 

(d)  Solve  (A)  analytically.  Give  a reason  why  the 
solution  c of  y(f2)  = — 0.001,  with  r2  the  solution  of 
y'  (t)  = 0,  will  give  you  the  best  possible  c satisfying 
(11). 

(e)  Consider  (B)  empirically  as  in  (a)  and  (b).  What 
is  the  main  difference  between  (B)  and  (A)? 
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Euler-Cauchy  Equations 

Euler-Cauchy  equations4  are  ODEs  of  the  form 
(1)  x2y " + axy'  + by  = 0 


with  given  constants  a and  b and  unknown  function  y(x).  We  substitute 


y = 


m 


X , 


y 


mx 


m— 1 


ft  / i \ m— 2 

y = — 1 )x 


into  (1).  This  gives 


2 / i \ m — 2 . m—  1 , / m 

x m{m  — l)x  + axmx  + bx  = (J 

and  we  now  see  that  y = xm  was  a rather  natural  choice  because  we  have  obtained  a com- 
mon factor  xm.  Dropping  it,  we  have  the  auxiliary  equation  m(m  — 1)  + am  + b = 0 or 

(2)  m2  + (a  — 1 )m  + b = 0.  (Note  a — 1,  not  a.) 


Hence  y = xm  is  a solution  of  (1)  if  and  only  if  m is  a root  of  (2).  The  roots  of  (2)  are 
(3)  m\  = |(1  ~ a)  + \/|(l  — a)2  — b , m2  = g(l  — a)  — Vj(l  — a)2  — b. 
Case  I.  Real  different  roots  m 1 and  m2  give  two  real  solutions 
yi(x)  = xmi  and  y2(x)  = x™2. 

These  are  linearly  independent  since  their  quotient  is  not  constant.  Hence  they  constitute 
a basis  of  solutions  of  (1)  for  all  x for  which  they  are  real.  The  corresponding  general 
solution  for  all  these  x is 


(4) 


y = Clxmi  + C2Xmz  (cl5  c2  arbitrary). 


EXAMPLE  General  Solution  in  the  Case  of  Different  Real  Roots 

The  Euler-Cauchy  equation  x2y"  + 1.5xy,  — 0.5y  = 0 has  the  auxiliary  equation  m2  + 0.5 m — 0.5  = 0.  The 
roots  are  0.5  and  — 1.  Hence  a basis  of  solutions  for  all  positive  x is  yq  = x 5 and  y2  = 1/jc  and  gives  the  general 
solution 


y = oVi  + — 


(x  > 0).  ■ 


4LEONHARD  EULER  (1707-1783)  was  an  enormously  creative  Swiss  mathematician.  He  made 
fundamental  contributions  to  almost  all  branches  of  mathematics  and  its  application  to  physics.  His  important 
books  on  algebra  and  calculus  contain  numerous  basic  results  of  his  own  research.  The  great  French 
mathematician  AUGUSTIN  LOUIS  CAUCHY  (1789-1857)  is  the  father  of  modem  analysis.  He  is  the  creator 
of  complex  analysis  and  had  great  influence  on  ODEs,  PDEs,  infinite  series,  elasticity  theory,  and  optics. 
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EXAMPLE  2 


EXAMPLE  3 


Case  II.  A real  double  root  nr  i = |(1  — a)  occurs  if  and  only  if/?  = \(a  — l)2  because 

1 2 

then  (2)  becomes  [m  + 2(a  — 1)]  , as  can  be  readily  verified.  Then  a solution  is 
= x' 1 ~a>/27  and  (1)  is  of  the  form 

,r,  2 « | ' i l/i  \2  n //  , a / , A 

(5)  x v + axy  + 4(1  — a)  y = 0 or  y H y H —5 — v = 0. 

x Ax 


A second  linearly  independent  solution  can  be  obtained  by  the  method  of  reduction  of 
order  from  Sec.  2.1,  as  follows.  Starting  from  y2  = uy  1 . we  obtain  for  u the  expression 
(9)  Sec.  2.1,  namely, 


U dx 


where 


U = 


2 exP 

>'1 


p dx 


From  (5)  in  standard  form  (second  ODE)  we  see  that  p = a/x  (not  ax:  this  is  essential!). 
Hence  exp/(—  p dx)  = exp  (—a  In  x)  = exp  (In  x~a)  = \/xa.  Division  by  y2  = x1_a 
gives  U = 1/x,  so  that  u = ln  x by  integration.  Thus,  y2  = uy  1 = yi  In  x,  and  yi  and  y2 
are  linearly  independent  since  their  quotient  is  not  constant.  The  general  solution 
corresponding  to  this  basis  is 


(6) 


y = (ci  + c2  In  x)  xm, 


m = 2(1  - a). 


General  Solution  in  the  Case  of  a Double  Root 

The  Euler-Cauchy  equation  xzy"  — 5xy'  + 9y  = 0 has  the  auxiliary  equation  m2  — 6m  + 9 = 0.  It  has  the 
double  root  m = 3,  so  that  a general  solution  for  all  positive  x is 

y = (ci  + C2  In  jt)  x 3. 

Case  III.  Complex  conjugate  roots  are  of  minor  practical  importance,  and  we  discuss 
the  derivation  of  real  solutions  from  complex  ones  just  in  terms  of  a typical  example. 

Real  General  Solution  in  the  Case  of  Complex  Roots 

The  Euler-Cauchy  equation  x2y"  + 0.6 xy'  + 16.04y  = 0 has  the  auxiliary  equation  m2  — 0.4m  + 16.04  = 0. 
The  roots  are  complex  conjugate,  m i = 0.2  + 4 i and  = 0.2  — 4 i,  where  i = V- 1.  We  now  use  the  trick 
of  writing  x = eln  x and  obtain 

xmi  _ ^.0.2  + 4*  _ ^0.2^1na;y4z  _ ^0.2^(4  In  x)i 
xm2  — x°'2-4i  = x°-2(elnx)~4i  = x°'2e~(4  ln  x)i 

Next  we  apply  Euler’s  formula  (11)  in  Sec.  2.2  with  t = 4 lnx  to  these  two  formulas.  This  gives 

xmi  = x0  2[cos  (4  ln  x)  + i sin  (4  ln  jc)], 
x1712  = x°'2[cos  (4  ln  x)  — i sin  (4  ln  x)]. 

We  now  add  these  two  formulas,  so  that  the  sine  drops  out,  and  divide  the  result  by  2.  Then  we  subtract  the 
second  formula  from  the  first,  so  that  the  cosine  drops  out,  and  divide  the  result  by  2 i.  This  yields 

x0  2 cos  (4  ln  x)  and  x0  2 sin  (4  ln  x) 

respectively.  By  the  superposition  principle  in  Sec.  2.2  these  are  solutions  of  the  Euler-Cauchy  equation  (1). 
Since  their  quotient  cot  (4  ln  x)  is  not  constant,  they  are  linearly  independent.  Hence  they  form  a basis  of  solutions, 
and  the  corresponding  real  general  solution  for  all  positive  x is 

(8)  y = jc°'2[A  cos  (4  ln  x)  + B sin  (4  ln  *)]. 
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Figure  48  shows  typical  solution  curves  in  the  three  cases  discussed,  in  particular  the  real  basis  functions  in 
Examples  1 and  3. 


EXAMPLE  4 Boundary  Value  Problem.  Electric  Potential  Field  Between  Two  Concentric  Spheres 

Find  the  electrostatic  potential  v — v(r ) between  two  concentric  spheres  of  radii  r i = 5 cm  and  r^  = 10  cm 
kept  at  potentials  V\  = 110  V and  V2  — 0,  respectively. 

Physical  Information.  v(r)  is  a solution  of  the  Euler-Cauchy  equation  rv"  + 2v'  =0,  where  v = dv/ dr. 

Solution.  The  auxiliary  equation  is  m2  + m = 0.  It  has  the  roots  0 and  — 1 . This  gives  the  general  solution 
v(r)  = Ci  + c^J r.  From  the  “boundary  conditions”  (the  potentials  on  the  spheres)  we  obtain 

v(5)  = Cl  + y = 110.  U(10)  = a + ^ = 0. 

By  subtraction,  C2/IO  = 110,  C2  — 1100.  From  the  second  equation,  Ci  = — C2/IO  = —110.  Answer: 
v(r)  = —110+  1 100/r  V.  Figure  49  shows  that  the  potential  is  not  a straight  line,  as  it  would  be  for  a potential 
between  two  parallel  plates.  For  example,  on  the  sphere  of  radius  7.5  cm  it  is  not  110/2  = 55  V,  but  considerably 
less.  (What  is  it?)  I 


FROB  L 


1.  Double  root.  Verify  directly  by  substitution  that 
x<i  —a)/2  jn  x js  a soiution  of  (1)  if  (2)  has  a double  root, 
but  jc™1  In  x and  a ™2  In  x are  not  solutions  of  (1)  if  the 
roots  m1  and  m2  of  (2)  are  different. 


2-11 


GENERAL  SOLUTION 


Find  a real  general  solution.  Show  the  details  of  your  work. 

2.  x2y"  - 20y  = 0 

3.  5x2y"  + 23a/  + 16.2y  = 0 


4.  x y"  + 2y  — 0 

5.  4a  2y"  + 5y  = 0 

6.  A2y”  + 0.7a / — O.lv  = 0 

7.  (a  2D2  - 4a D + 6 1)y  = C 

8.  (a2/)2  - 3a D + 4I)y  = 0 

9.  (a 2D2  - 0.2a D + 0.36/)y  = 0 

10.  (a  2D2  - x D + 5I)y  = 0 

11.  (a2/)2  - 3a D + 1 ()/)y  = 0 
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INITIAL  VALUE  PROBLEM 

Solve  and  graph  the  solution.  Show  the  details  of  your  work. 

12.  x2y"  - 4xy'  + 6 y = 0,  y(l)  = 0.4,  y'(l)  = 0 

13.  x2y"  + 3xy'  + 0.75y  = 0,  y(l)  = 1, 

v'(l)  = -1.5 

14.  x2y"  + xy'  + 9y  = 0,  y(l)  = 0,  y' (l)  = 2.5 

15.  x2y"  + 3 xy'  + y = 0,  y(l)  = 3.6,  y'(l)  = 0.4 

16.  (x2D2  - 3 xD  + 4 l)y  = 0,  y(l)  = -77,  y'(l)  = 2t t 

17.  (x2D2  + xD  + 1 )y  = 0,  y(l)  = 1,  y'(l)  = 1 

18.  (9 x2D2  + 3 xD  + I)y  = 0,  y(l)  = 1,  v'(l)  = 0 

19.  (x2D2  - xD  - 15 1)y  = 0,  y(l)  = 0.1, 

/(l)  = -4.5 


20.  TEAM  PROJECT.  Double  Root 

(a)  Derive  a second  linearly  independent  solution  of 
(1)  by  reduction  of  order;  but  instead  of  using  (9),  Sec. 
2.1,  perform  all  steps  directly  for  the  present  ODE  (1). 

(b)  Obtain  xm  In  x by  considering  the  solutions  xm 
and  xm  s of  a suitable  Euler-Cauchy  equation  and 
letting  s —■ ► 0. 

(c)  Verify  by  substitution  that*™  In  x,  m = (1  — a)/ 2, 
is  a solution  in  the  critical  case. 

(d)  Transform  the  Euler-Cauchy  equation  (1)  into 
an  ODE  with  constant  coefficients  by  setting 
x = eb  (x  > 0). 

(e)  Obtain  a second  linearly  independent  solution  of 
the  Euler-Cauchy  equation  in  the  “critical  case”  from 
that  of  a constant-coefficient  ODE. 


2.6  Existence  and  Uniqueness 
of  Solutions.  Wronskian 


In  this  section  we  shall  discuss  the  general  theory  of  homogeneous  linear  ODEs 

(1)  y"  + p{x)y'  + q(x)y  = 0 

with  continuous,  but  otherwise  arbitrary,  variable  coefficients  p and  q.  This  will  concern 
the  existence  and  form  of  a general  solution  of  (1)  as  well  as  the  uniqueness  of  the  solution 
of  initial  value  problems  consisting  of  such  an  ODE  and  two  initial  conditions 

(2)  y(x0)  = K0,  y'  (x0)  = Kx 
with  given  x0,  K0,  and  K-^. 

The  two  main  results  will  be  Theorem  1,  stating  that  such  an  initial  value  problem 
always  has  a solution  which  is  unique,  and  Theorem  4,  stating  that  a general  solution 

(3)  >’  = cdt  + c2y2  (O,  c2  arbitrary) 

includes  all  solutions.  Hence  linear  ODEs  with  continuous  coefficients  have  no  “singular 
solutions’’  (solutions  not  obtainable  from  a general  solution). 

Clearly,  no  such  theory  was  needed  for  constant-coefficient  or  Euler-Cauchy  equations 
because  everything  resulted  explicitly  from  our  calculations. 

Central  to  our  present  discussion  is  the  following  theorem. 


THEOREM  1 


Existence  and  Uniqueness  Theorem  for  Initial  Value  Problems 

If  p(x)  and  q(x)  are  continuous  functions  on  some  open  inten’al  I (see  Sec.  1.1)  and 
x0  is  in  I,  then  the  initial  value  problem  consisting  of  ( 1)  and  (2)  has  a unique 
solution  y(x)  on  the  interval  I. 
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THEOREM  2 


PROOF 


The  proof  of  existence  uses  the  same  prerequisites  as  the  existence  proof  in  Sec.  1.7 
and  will  not  be  presented  here;  it  can  be  found  in  Ref.  [A1 1 ] listed  in  App.  1.  Uniqueness 
proofs  are  usually  simpler  than  existence  proofs.  But  for  Theorem  1,  even  the  uniqueness 
proof  is  long,  and  we  give  it  as  an  additional  proof  in  App.  4. 


Linear  Independence  of  Solutions 

Remember  from  Sec.  2.1  that  a general  solution  on  an  open  interval  I is  made  up  from  a 
basis  y±,  y2  on  I,  that  is,  from  a pair  of  linearly  independent  solutions  on  I.  Here  we  call 
y i , y2  linearly  independent  on  I if  the  equation 

(4)  hyfx)  + k 2y2(x)  = 0 on  I implies  ki  = 0,  k2  = 0. 

We  call  yi,  y2  linearly  dependent  on  I if  this  equation  also  holds  for  constants  k\,  k2 
not  both  0.  In  this  case,  and  only  in  this  case,  yi  and  y2  are  proportional  on  /,  that  is  (see 
Sec.  2.1), 

(5)  (a)  yi  = ky2  or  (b)  y2  = ly  1 for  all  on  I. 

For  our  discussion  the  following  criterion  of  linear  independence  and  dependence  of 
solutions  will  be  helpful. 


Linear  Dependence  and  Independence  of  Solutions 

Let  the  ODE  (1)  have  continuous  coefficients  p(x ) and  q(x)  on  an  open  interval  I. 
Then  two  solutions  y\andy2  of  { 1)  on  I are  linearly  dependent  on  I if  and  only  if 
their  “Wronskian” 

(6)  Wj!,  y2)  = yu4  - >uVi' 

is  0 at  some  xq  in  I.  Furthermore,  if  W = 0 at  an  x = Xq  in  I,  then  W = 0 on  I; 
hence,  if  there  is  an  x 1 in  I at  which  W is  not  0,  then  yi,  y2  are  linearly  independent 
on  I. 


(a)  Let  yi  and  y2  be  linearly  dependent  on  I.  Then  (5a)  or  (5b)  holds  on  I.  If  (5a)  holds, 
then 


W(yi.  yi)  = ym  - >:2>  i = ky2y2  - y2ky2  = 0. 


Similarly  if  (5b)  holds. 

(b)  Conversely,  we  let  Wf  v 1 , y2)  = 0 for  some  x = x0  and  show  that  this  implies  linear 
dependence  of  yi  and  y2  on  I.  We  consider  the  linear  system  of  equations  in  the  unknowns 
k 1,  k2 


(7) 


^lJiOo)  + k2y2(x0)  = 0 
k\y'\(,xo)  + k2y2(x0 ) = 0. 
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To  eliminate  k2,  multiply  the  first  equation  by  y2  and  the  second  by  — y2  and  add  the 
resulting  equations.  This  gives 

kiyi(x0)y2(xo)  - fci.yi(-*o)>'2(*o)  = = 0. 

Similarly,  to  eliminate  k i,  multiply  the  first  equation  by  —y{  and  the  second  by  yi  and 
add  the  resulting  equations.  This  gives 


k2W (yi(x0),  y2(*o))  = 0. 


If  W were  not  0 at  xo,  we  could  divide  by  W and  conclude  that  k\  = k2  = 0.  Since  W is 
0,  division  is  not  possible,  and  the  system  has  a solution  for  which  k\  and  k2  are  not  both 
0.  Using  these  numbers  k i,  k2,  we  introduce  the  function 

y(x)  = £iyi(x)  + k2y2(x). 

Since  (1)  is  homogeneous  linear,  Fundamental  Theorem  1 in  Sec.  2.1  (the  superposition 
principle)  implies  that  this  function  is  a solution  of  (1)  on  I.  From  (7)  we  see  that  it  satisfies 
the  initial  conditions  yCvo)  = 0,  y (xo)  = 0.  Now  another  solution  of  (1)  satisfying  the 
same  initial  conditions  is  y*  = 0.  Since  the  coefficients  p and  q of  (1)  are  continuous, 
Theorem  1 applies  and  gives  uniqueness,  that  is,  y = y*,  written  out 

kiyi  + k 2y2  = 0 on  /. 

Now  since  k j and  k2  are  not  both  zero,  this  means  linear  dependence  of  yi,  y2  on  I. 

(c)  We  prove  the  last  statement  of  the  theorem.  If  W(xo)  = 0 at  an  xo  in  /,  we  have 
linear  dependence  of  yi,  y2  on  I by  part  (b),  hence  W = 0 by  part  (a)  of  this  proof.  Hence 
in  the  case  of  linear  dependence  it  cannot  happen  that  W(x  |)  ^ 0 at  an  x\  in  I.  If  it  does 
happen,  it  thus  implies  linear  independence  as  claimed. 

For  calculations,  the  following  formulas  are  often  simpler  than  (6). 

(6*)  W(y1,y2)  = (a)  y\  (yi  =h  0)  or  (b)  yi  (^2  * 0)- 

These  formulas  follow  from  the  quotient  rule  of  differentiation. 

Remark.  Determinants.  Students  familiar  with  second-order  determinants  may  have 
noticed  that 


W(yi,v2) 


yi  y2 

I t 

yi  y2 


yiy2  - y2>’i- 


This  determinant  is  called  the  Wronski  determinant5  or,  briefly,  the  Wronskian,  of  two 
solutions  yi  and  y2  of  (1),  as  has  already  been  mentioned  in  (6).  Note  that  its  four  entries 
occupy  the  same  positions  as  in  the  linear  system  (7). 


introduced  by  WRONSKI  (JOSEF  MARIA  HONE,  1776-1853),  Polish  mathematician. 
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EXAMPLE  1 


EXAMPLE  2 


THEOREM  3 


PROOF 


Illustration  of  Theorem  2 

The  functions  v i = cos  cox  and  y2  = sin  cox  are  solutions  of  y"  + a>2y  = 0.  Their  Wronskian  is 


VV(cos  (ox,  sin  cox)  = 


cos  (ox  sin  (ox 
-co  sin  cox  co  cos  cox 


= Vi  V’2  — V2.V i = co  cos2  (ox  + oi  sin2  orv  = at. 


Theorem  2 shows  that  these  solutions  are  linearly  independent  if  and  only  if  co  0.  Of  course,  we  can  see 
this  directly  from  the  quotient  y2/yi  = tan  cox.  For  to  = 0 we  have  y2  = 0,  which  implies  linear  dependence 
(why?). 


Illustration  of  Theorem  2 for  a Double  Root 

A general  solution  of  y — 2 y'  + y — 0 on  any  interval  is  y = (ci  + c2x)ex.  (Verify!).  The  corresponding 
Wronskian  is  not  0,  which  shows  linear  independence  of  ex  and  xex  on  any  interval.  Namely, 


W(x,  xex)  = 


e xe 

ex  (x  + \)ex 


= (x  + l)e2x  - xeZx  = e2x  ± 0. 


A General  Solution  of  (1)  Includes  All  Solutions 

This  will  be  our  second  main  result,  as  announced  at  the  beginning.  Let  us  start  with 
existence. 


Existence  of  a General  Solution 

Ifp(x ) and  q{x)  are  continuous  on  an  open  interval  I,  then  (1)  has  a general  solution 
on  I. 


By  Theorem  1,  the  ODE  (1)  has  a solution  yi(x)  on  I satisfying  the  initial  conditions 

viOo)  = 1,  ViOo)  = 0 

and  a solution  y2(v)  on  I satisfying  the  initial  conditions 


j2(*o)  = 0,  yz(xo)  = 1. 


The  Wronskian  of  these  two  solutions  has  at  x = Xo  the  value 

W>i(0),y2(0))  = yi(x0)yi(x0)  - y2(x0)yi(x0)  = 1- 


Hence,  by  Theorem  2,  these  solutions  are  linearly  independent  on  I.  They  form  a basis  of 
solutions  of  (1)  on  /,  and  y = C]Vi  + C2V2  with  arbitrary  c±,  c2  is  a general  solution  of  (1) 
on  /,  whose  existence  we  wanted  to  prove. 
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THEOREM  4 


PROOF 


We  finally  show  that  a general  solution  is  as  general  as  it  can  possibly  be. 


A General  Solution  Includes  All  Solutions 

If  the  ODE  (1)  has  continuous  coefficients  p(x)  and  q(x ) on  some  open  interval  /, 
then  every  solution  y = Y(x)  of  { 1)  on  I is  of  the  form 

(8)  Y(x)  = Cm(x)  + C2y2(x ) 

where  yi,  y2  is  any  basis  of  solutions  of  { 1)  on  I and  C i,  C2  are  suitable  constants. 

Hence  (1)  does  not  have  singular  solutions  ( that  is,  solutions  not  obtainable  from 
a general  solution). 


Let  y = Y(x)  be  any  solution  of  (1)  on  I.  Now,  by  Theorem  3 the  ODE  (1)  has  a general 
solution 


(9)  y(x)  = ciyfx)  + c2y2(x) 

on  I.  We  have  to  find  suitable  values  of  ci,  c2  such  that  y(x)  = Y(x)  on  I.  We  choose  any 
x0  in  I and  show  first  that  we  can  find  values  of  c1;  c2  such  that  we  reach  agreement  at 
x0,  that  is,  y(x0)  = K(x0)  and  y (x0)  = Y (x0).  Written  out  in  terms  of  (9),  this  becomes 


(10) 


(a)  c1y1(x0)  + c2y2(x0 ) = 7(x0) 

(b)  ciy{(x0)  + C2y2(x0)  = Y'(x0). 


We  determine  the  unknowns  ci  and  c2.  To  eliminate  c2,  we  multiply  (10a)  by  y2(xo)  and 
(10b)  by  — y2(xo)  and  add  the  resulting  equations.  This  gives  an  equation  for  C\.  Then  we 
multiply  (10a)  by  — yi(xo)  and  (10b)  by  yi(xo)  and  add  the  resulting  equations.  This  gives 
an  equation  for  c2.  These  new  equations  are  as  follows,  where  we  take  the  values  of 

yi,  yi,  yz,  yz,  Y,  y'  at  x0. 


ci(vLV2  - yzyi)  = ciW(yi,  y2)  = Yy 2 - y2Yr 
Cziytyz  - yzyi)  = c2w(y^  y2)  = y±Y'  - Yy[. 

Since  yq,  y2  is  a basis,  the  Wronskian  W in  these  equations  is  not  0,  and  we  can  solve  for 
Ci  and  c2.  We  call  the  (unique)  solution  ci  = C i,  c2  = C2.  By  substituting  it  into  (9)  we 
obtain  from  (9)  the  particular  solution 

y*(x)  = CjyqCx)  + C2y2(x). 

Now  since  C i,  C2  is  a solution  of  (10),  we  see  from  (10)  that 
y*(x0)  = Y(x0),  y*'(x0)  = Y'(x0). 

From  the  uniqueness  stated  in  Theorem  1 this  implies  that  y*  and  Y must  be  equal 
everywhere  on  /,  and  the  proof  is  complete. 
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Reflecting  on  this  section,  we  note  that  homogeneous  linear  ODEs  with  continuous  variable 
coefficients  have  a conceptually  and  structurally  rather  transparent  existence  and  uniqueness 
theory  of  solutions.  Important  in  itself,  this  theory  will  also  provide  the  foundation  for  our 
study  of  nonhomogeneous  linear  ODEs,  whose  theory  and  engineering  applications  form 
the  content  of  the  remaining  four  sections  of  this  chapter. 


PROBLEM— 


1.  Derive  (6*)  from  (6). 


2-8 


BASIS  OF  SOLUTIONS.  WRONSKIAN 


Find  the  Wronskian.  Show  linear  independence  by  using 
quotients  and  confirm  it  by  Theorem  2. 

4.0a:  —1.5a; 

2.  e ,e 

7 — 0.4a;  —2.6a: 

3.  e , e 


4.  x,  \/x 

5.  x3,  x2 

6.  e~x  cos  (ox , e~x  sin  cox 

7.  cosh  ax,  sinh  ax 

8.  xk  cos  (In  x),  xk  sin  (In  x) 


9-15 


ODE  FOR  GIVEN  BASIS.  WRONSKIAN.  IVP 


(a)  Find  a second-order  homogeneous  linear  ODE  for 
which  the  given  functions  are  solutions,  (b)  Show  linear 
independence  by  the  Wronskian.  (c)  Solve  the  initial  value 
problem. 

9.  cos  5x,  sin  5x,  v(0)  = 3,  y'(O)  = -5 

10.  xm\  x™2,  y(l)  = -2,  y'(l)  = 2m  1 - 4 m2 

11.  e_2'5x  cos  0.3x,  e~2'5x  sin  0.3x,  v(0)  = 3, 
y'(0)  = -7.5 

12.  x2,x2lnx,  y(l)  = 4,  y'(l)  = 6 

13.  1,  e~2x,  y(0)  = 1,  y'(0)  = -1 

14.  e~kx cos  irx,e~kx  simrx,  y(0)  = 1, 
y'(0)  = k 7r 

15.  cosh  1.8x,  sinh  1.8*,  y(0)  = 14.20,  y'(0)  = 16.38 


16.  TEAM  PROJECT.  Consequences  of  the  Present 
Theory.  This  concerns  some  noteworthy  general 
properties  of  solutions.  Assume  that  the  coefficients  p 
and  q of  the  ODE  (1)  are  continuous  on  some  open 
interval  /,  to  which  the  subsequent  statements  refer. 

(a)  Solve  y"  — y = 0 (a)  by  exponential  functions, 

(b)  by  hyperbolic  functions.  How  are  the  constants  in 
the  corresponding  general  solutions  related? 

(b)  Prove  that  the  solutions  of  a basis  cannot  be  0 at 
the  same  point. 

(c)  Prove  that  the  solutions  of  a basis  cannot  have  a 
maximum  or  minimum  at  the  same  point. 

(d)  Why  is  it  likely  that  formulas  of  the  form  (6*) 
should  exist? 

(e)  Sketch  yi(x)  = x3  if  * £ 0 and  0 if  x < 0, 
jy2(x)  = 0 if  x £ 0 and  x3  if  x < 0.  Show  linear 
independence  on  — 1 < * < 1 . What  is  their 
Wronskian?  What  Euler-Cauchy  equation  do  yq,  y2 
satisfy?  Is  there  a contradiction  to  Theorem  2? 

(f)  Prove  Abel’s  formula6 


Wfyi (x),  y2(x))  = c exp 

where  c = W(y i(*o).  ^(^o))-  Apply  it  to  Prob.  6.  Hint: 
Write  (1)  for  y1  and  for  y2.  Eliminate  q algebraically 
from  these  two  ODEs,  obtaining  a first-order  linear 
ODE.  Solve  it. 


- p(t)  dt 


2.1  Nonhomogeneous  ODEs 

We  now  advance  from  homogeneous  to  nonhomogeneous  linear  ODEs. 
Consider  the  second-order  nonhomogeneous  linear  ODE 


(1) 


y"  + p(x)y'  + q(x)y  = r(x) 


where  r(x)  # 0.  We  shall  see  that  a “general  solution”  of  ( 1 ) is  the  sum  of  a general 
solution  of  the  corresponding  homogeneous  ODE 


6NIELS  HENRIK  ABEL  (1802-1829),  Norwegian  mathematician. 
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DEFINITION 


THEOREM  1 


PROOF 


THEOREM  2 


PROOF 


(2)  y"  + p(x)y'  + q(x)y  = 0 

and  a “particular  solution”  of  (1).  These  two  new  terms  “general  solution  of  (1)”  and 
“particular  solution  of  (1)”  are  defined  as  follows. 


General  Solution,  Particular  Solution 

A general  solution  of  the  nonhomogeneous  ODE  (1)  on  an  open  interval  / is  a 
solution  of  the  form 

(3)  y(x)  = yh(x ) + yp(x); 

here,  yh  = cpyi  + c2y2  is  a general  solution  of  the  homogeneous  ODE  (2)  on  I and 
yv  is  any  solution  of  (1)  on  I containing  no  arbitrary  constants. 

A particular  solution  of  (1)  on  / is  a solution  obtained  from  (3)  by  assigning 
specific  values  to  the  arbitrary  constants  cy  and  C2  in  yyL. 


Our  task  is  now  twofold,  first  to  justify  these  definitions  and  then  to  develop  a method 
for  finding  a solution  yp  of  (1). 

Accordingly,  we  first  show  that  a general  solution  as  just  defined  satisfies  (1)  and  that 
the  solutions  of  (1)  and  (2)  are  related  in  a very  simple  way. 


Relations  of  Solutions  of  (1)  to  Those  of  (2) 

(a)  The  sum  of  a solution  y of  ( 1)  on  some  open  interval  I and  a solution  y of 
(2)  on  I is  a solution  of  { 1)  on  I.  In  particular,  (3)  is  a solution  of(  1)  on  I. 

(b)  The  difference  of  two  solutions  0/(1)  on  I is  a solution  of  { 2)  on  I. 


(a)  Let  L[v]  denote  the  left  side  of  (1).  Then  for  any  solutions  y of  (1)  and  y of  (2)  on  /, 

L[y  + y]  = L[y]  + L[y]  = r + 0 = r. 

(b)  For  any  solutions  y and  y * of  (1)  on  I we  have  L[y  — v*]  = L[y\  — L[y*]  = 
r - r = 0. 

Now  for  homogeneous  ODEs  (2)  we  know  that  general  solutions  include  all  solutions. 
We  show  that  the  same  is  true  for  nonhomogeneous  ODEs  (1). 


A General  Solution  of  a Nonhomogeneous  ODE  Includes  All  Solutions 

If  the  coefficients  p(x),  q(x),  and  the  function  r(x)  in  (1)  are  continuous  on  some 
open  interval  I,  then  every  solution  of  ( 1)  on  I is  obtained  by  assigning  suitable 
values  to  the  arbitrary  constants  Cy  and  C2  in  a general  solution  (3)  of  ( 1)  on  I. 


Let  y*  be  any  solution  of  (1)  on  I and  x0  any  x in  I.  Let  (3)  be  any  general  solution  of 
(1)  on  I.  This  solution  exists.  Indeed,  y;,.  = cyyi  + c2y2  exists  by  Theorem  3 in  Sec.  2.6 
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because  of  the  continuity  assumption,  and  yp  exists  according  to  a construction  to  be 
shown  in  Sec.  2.10.  Now,  by  Theorem  1(b)  just  proved,  the  difference  Y = y*  — yv  is  a 
solution  of  (2)  on  I.  At  xq  we  have 

Y(x 0)  = y*(x0)  - yp(x0).  Y'(x0)  = y*\x0)  - yp{x0). 

Theorem  1 in  Sec.  2.6  implies  that  for  these  conditions,  as  for  any  other  initial  conditions 
in  I,  there  exists  a unique  particular  solution  of  (2)  obtained  by  assigning  suitable  values 
to  Ci,  C2  in  yy,.  From  this  and  y*  = Y + yp  the  statement  follows. 

Method  of  Undetermined  Coefficients 

Our  discussion  suggests  the  following.  To  solve  the  nonhomogeneous  ODE  (1 ) or  an  initial 
value  problem  for  (1),  we  have  to  solve  the  homogeneous  ODE  (2)  and  find  any  solution 
yp  of  ( 1),  so  that  we  obtain  a general  solution  (3)  of  (1). 

How  can  we  find  a solution  yp  of  (1)?  One  method  is  the  so-called  method  of 
undetermined  coefficients.  It  is  much  simpler  than  another,  more  general,  method  (given 
in  Sec.  2.10).  Since  it  applies  to  models  of  vibrational  systems  and  electric  circuits  to  be 
shown  in  the  next  two  sections,  it  is  frequently  used  in  engineering. 

More  precisely,  the  method  of  undetermined  coefficients  is  suitable  for  linear  ODEs 
with  constant  coefficients  a and  b 

(4)  y"  + ay'  + by  = r(x) 

when  r(x)  is  an  exponential  function,  a power  of  x,  a cosine  or  sine,  or  sums  or  products 
of  such  functions.  These  functions  have  derivatives  similar  to  r(x)  itself.  This  gives  the 
idea.  We  choose  a form  for  yp  similar  to  r(x),  but  with  unknown  coefficients  to  be 
determined  by  substituting  that  yp  and  its  derivatives  into  the  ODE.  Table  2.1  on  p.  82 
shows  the  choice  of  yp  for  practically  important  forms  of  r (x).  Corresponding  rules  are 
as  follows. 


Choice  Rules  for  the  Method  of  Undetermined  Coefficients 

(a)  Basic  Rule.  If  r{x)  in  (4)  is  one  of  the  functions  in  the  first  column  in 
Table  2.1,  choose  yp  in  the  same  line  and  determine  its  undetermined 
coefficients  by  substituting  yp  and  its  derivatives  into  (4). 

(b)  Modification  Rule.  If  a term  in  your  choice  for  yp  happens  to  be  a 
solution  of  the  homogeneous  ODE  corresponding  to  (4),  multiply  this  term 
by  x ( or  by  xZ  if  this  solution  corresponds  to  a double  root  of  the 
characteristic  equation  of  the  homogeneous  ODE). 

(c)  Sum  Rule.  If  r(x)  is  a sum  of  functions  in  the  first  column  of  Table  2.1, 
choose  for  yp  the  sum  of  the  functions  in  the  corresponding  lines  of  the 
second  column. 


The  Basic  Rule  applies  when  r(x)  is  a single  term.  The  Modification  Rule  helps  in  the 
indicated  case,  and  to  recognize  such  a case,  we  have  to  solve  the  homogeneous  ODE 
first.  The  Sum  Rule  follows  by  noting  that  the  sum  of  two  solutions  of  (1)  with  r = r\ 
and  r = r2  (and  the  same  left  side!)  is  a solution  of  (1)  with  r = rq  + r2.  (Verify!) 
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EXAMPLE  1 


The  method  is  self-correcting.  A false  choice  for  yp  or  one  with  too  few  terms  will  lead 
to  a contradiction.  A choice  with  too  many  terms  will  give  a correct  result,  with  superfluous 
coefficients  coming  out  zero. 

Let  us  illustrate  Rules  (a)-(c)  by  the  typical  Examples  1-3. 


Table  2.1  Method  of  Undetermined  Coefficients 


Term  in  r(x) 

Choice  for  yp{x) 

keyx 

kxn  (n  = 0,  1,  ■ ■ ■ ) 
k cos  cox 
k sin  cox 
keaX  cos  cox 
keaX  sin  cox 

Ceyx 

Knxn  + Kn—-±xn  *■  + •••  + K\X  + Kq 
| K cos  cox  + M sin  cox 
| eaX{K  cos  cox  + M sin  cox) 

Application  of  the  Basic  Rule  (a) 

Solve  the  initial  value  problem 

(5)  y"  + y = O.OOlx2,  y(0)  = 0,  y'(0)  - 1.5. 

Solution.  Step  1.  General  solution  of  the  homogeneous  ODE.  The  ODEy^  + y = 0 has  the  general  solution 

yh  = A cos  x + B sin  x. 

Step  2.  Solution  yp  of  the  nonhomo geneous  ODE.  We  first  try  yp  = Kx2.  Then  yp  = 2 K.  By  substitution, 
2 K + Kx2  — 0.00 lx2.  For  this  to  hold  for  all  x,  the  coefficient  of  each  power  of  x (x2  and  x°)  must  be  the  same 
on  both  sides;  thus  K = 0.001  and  2 K = 0,  a contradiction. 

The  second  line  in  Table  2.1  suggests  the  choice 

yp  = K2x2  + KlX  + Kq.  Then  yp  + yp  = 2 K2  + K2x2  + K±x  + Kq  = 0.00  lx2. 

Equating  the  coefficients  of  x2,  x,  x°  on  both  sides,  we  have  K 2 — 0.001,  if  1 = 0,  2K2  + A^o  — 0.  Hence 
K0  = -2K2  = -0.002.  This  gives  yp  = O.OOlx2  - 0.002,  and 

y = yh  + yp  = A cos  x + B sinx  + O.OOlx2  — 0.002. 

Step  3.  Solution  of  the  initial  value  problem.  Setting  x = 0 and  using  the  first  initial  condition  gives 
y(0)  = A — 0.002  = 0,  hence  A — 0.002.  By  differentiation  and  from  the  second  initial  condition, 

y'  — y'h  + yp  — ~A  sinx  + B cos  x + 0.002x  and  yr(0)  = B = 1.5. 

This  gives  the  answer  (Fig.  50) 

y = 0.002  cosx  + 1.5  sinx  + O.OOlx2  — 0.002. 

Figure  50  shows  y as  well  as  the  quadratic  parabola  yp  about  which  y is  oscillating,  practically  like  a sine  curve 
since  the  cosine  term  is  smaller  by  a factor  of  about  1/1000. 


Fig.  50.  Solution  in  Example  1 
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EXAMPLE  2 


EXAMPLE  3 


Application  of  the  Modification  Rule  (b) 

Solve  the  initial  value  problem 

(6)  y"  + 3/  + 2.25  y = -lOe-1-5*,  y{  0)  = 1,  /( 0)  = 0. 

Solution.  Step  1.  General  solution  of  the  homogeneous  ODE.  The  characteristic  equation  of  the  homogeneous 
ODE  is  A2  + 3 A + 2.25  = (A  + 1.5)2  = 0.  Hence  the  homogeneous  ODE  has  the  general  solution 

yh  = (A  + c2x)e~15x. 


Step  2.  Solution  yp  of  the  nonhomogeneous  ODE.  The  function  e~15x  on  the  right  would  normally  require 
the  choice  Ce~15x.  But  we  see  from  yh  that  this  function  is  a solution  of  the  homogeneous  ODE,  which 
corresponds  to  a double  root  of  the  characteristic  equation.  Hence,  according  to  the  Modification  Rule  we  have 
to  multiply  our  choice  function  by  *.  That  is,  we  choose 

yp  = Cx2e~15x.  Then  y'p  = C(2x  - l.5x2)e~15x,  yp  = C(2  - 3x  - 3x  + 2.25*2)e_1-5x. 

We  substitute  these  expressions  into  the  given  ODE  and  omit  the  factor  e~15x.  This  yields 

C(2  - 6x  + 2.25x2)  + 3C(2x  - l.5x2)  + 2.25C*2  = -10. 

Comparing  the  coefficients  of  x2,  x,  x°  gives  0 = 0,  0 = 0,  2C  = —10,  hence  C = —5.  This  gives  the  solution 
yp  = —5xze~  ' x.  Hence  the  given  ODE  has  the  general  solution 

y — yu  + yp  = (a  + c2x)e  151  — 5*2<?  15x. 

Step  3.  Solution  of  the  initial  value  problem.  Setting  x = 0 in  y and  using  the  first  initial  condition,  we  obtain 
y(0)  = ci  = 1 . Differentiation  of  y gives 

/ = (c2  - 1.5 Ci  - 1.5 C2x)e~15x  - like-1-5*  + 1.5x2e~1Sx. 

From  this  and  the  second  initial  condition  we  have  yr(0)  — c2—  1.5ci  = 0.  Hence  c2  — 1.5ci  = 1.5.  This  gives 
the  answer  (Fig.  51) 


y = (1  + l.5x)e~15x  - 5x2e~15x  = (1  + l.5x  - 5x2)e~15x. 

The  curve  begins  with  a horizontal  tangent,  crosses  the  x-axis  at  x = 0.6217  (where  1 + 1.5*  — 5x2  = 0)  and 
approaches  the  axis  from  below  as  * increases. 


* 


Fig.  51.  Solution  in  Example  2 

Application  of  the  Sum  Rule  (c) 

Solve  the  initial  value  problem 

(7)  y"  + 2y  + 0.75y  = 2 cos*  - 0.25  sin*  + 0.09*,  y(0)  = 2.78,  y'(0)  = -0.43. 

Solution.  Step  1.  General  solution  of  the  homogeneous  ODE.  The  characteristic  equation  of  the  homogeneous 
ODE  is 


A2  + 2A  + 0.75  = (A  + I)  (A  + 1)  = 0 


which  gives  the  general  solution  yy  = c^e  x-'2  + c2e 
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Step  2.  Particular  solution  of  the  nonhomogeneous  ODE.  We  write  yp  = ypi  + yp 2 and,  following  Table  2.1, 
(C)  and  (B), 


ypl  = K cos  x + M sin  x and  yp 2 = K±x  + Kq. 

Differentiation  gives  ypi  = —K  sinx  + Mcosx,ypi  = —Kcosx  — M sin  x and  y'P2  — 1,  yP2  — 0.  Substitution 
of  ypi  into  the  ODE  in  (7)  gives,  by  comparing  the  cosine  and  sine  terms, 

-K+2M  + 0.75A:  = 2,  —M  — 2AT  + 0.75M  = -0.25, 

hence  K = 0 and  M = 1.  Substituting  yp 2 into  the  ODE  in  (7)  and  comparing  the  x-  and  jc°-terms  gives 

0.75^1  = 0.09,  2^i  + 0.75Ko  = 0,  thus  Ki  = 0.12,  = -0.32. 

Hence  a general  solution  of  the  ODE  in  (7)  is 

>>  = c\e~x^  + C2«_3x/2  + sin*  + 0.12*  - 0.32. 

Step  3.  Solution  of  the  initial  value  problem.  From  v,yr  and  the  initial  conditions  we  obtain 

y( 0)  = ci  + c2  - 0.32  = 2.78,  /(0)  = -|c!  - §c2  + 1 + 0.12  = -0.4. 

Hence  c*  = 3.1,  c2  = 0.  This  gives  the  solution  of  the  IVP  (Fig.  52) 

y = 3.1<r*/2  + sin*  + 0.12*  - 0.32. 


Fig.  52.  Solution  in  Example  3 


Stability.  The  following  is  important.  If  (and  only  if)  all  the  roots  of  the  characteristic 
equation  of  the  homogeneous  ODE  y'  + ay  + by  = 0 in  (4)  are  negative,  or  have  a negative 
real  part,  then  a general  solution  yyL  of  this  ODE  goes  to  0 as  x — > oo,  so  that  the  “transient 
solution”  y = yu  + yp  of  (4)  approaches  the  “steady-state  solution”  yp.  In  this  case  the 
nonhomogeneous  ODE  and  the  physical  or  other  system  modeled  by  the  ODE  are  called 
stable;  otherwise  they  are  called  unstable.  For  instance,  the  ODE  in  Example  1 is  unstable. 

Applications  follow  in  the  next  two  sections. 


FR~Q-B^E^M=S^E-T=2^7 
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NONHOMOGENEOUS  LINEAR  ODEs: 
GENERAL  SOLUTION 


Find  a (real)  general  solution.  State  which  rule  you  are 
using.  Show  each  step  of  your  work. 

1.  y"  + 5y'  + 4y  = 10e-3x 


2.  lOv"  + SOv7  + 57.6y  = cos  * 

3.  y"  + 3y'  + 2y  = 12*2 

4.  y"  — 9y  = 18  cos  ttx 

5.  y"  + 4 y'  + 4y  = e-xcos* 

6.  y"  + y'  + (7 r2  + |)y  = e~x ^ sin  tt  x 
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7.  (D2  + 2 D + | I)y  = 3ex  + fx 

8.  (3D2  + 21I)y  = 3 cos  x + cos  3x 

9.  (D2  - 16 1)y  = 9.6e4x  + 30ex 

10.  (D2  + 2D  + I)y  = 2x  sin  x 

NONHOMOGENEOUS  LINEAR 
ODEs:  IVPs 

Solve  the  initial  value  problem.  State  which  rule  you  are 

using.  Show  each  step  of  your  calculation  in  detail. 

11.  y"  + 3y  = 18x2,  y(0)  = -3,  y'(0)  = 0 

12.  y"  + 4y  = -12  sin  2x,  y(0)  = 1.8,  y'(0)  = 5.0 

13.  8y”  — 6y;  + y = 6 cosh  x,  y(0)  = 0.2, 
y'(0)  = 0.05 

14.  y"  + Ay'  + Ay  = e-2*  sin  2x,  v(0)  = 1, 

y'(0)  = -1.5 

15.  (x2D2  - 3 xD  + 3 1)y  = 3 lnx  - 4, 
y(l)  = 0,  y'(l)  =1;  yv  = lnx 

16.  (D2  - 2D)y  = 6e2x  - Ae~Zx,  y(0)  = -1,  y'(0)  = 6 

17.  (D2  + 0.2 D + 0.26 l)y  = l.22eOSx,  y(0)  = 3.5, 
y'(0)  = 0.35 


18.  (D2  + 2 D + 10 1)y  = 17  sinx  - 37  sin  3x, 
y(0)  = 6.6,  y'(0)  = -2.2 

19.  CAS  PROJECT.  Structure  of  Solutions  of  Initial 
Value  Problems.  Using  the  present  method,  find, 
graph,  and  discuss  the  solutions  y of  initial  value 
problems  of  your  own  choice.  Explore  effects  on 
solutions  caused  by  changes  of  initial  conditions. 
Graph  yp,  y,  y — yp  separately,  to  see  the  separate 
effects.  Find  a problem  in  which  (a)  the  part  of  y 
resulting  from  y ^ decreases  to  zero,  (b)  increases, 
(c)  is  not  present  in  the  answer  y.  Study  a problem  with 
y(0)  = 0,  y^O)  = 0.  Consider  a problem  in  which 
you  need  the  Modification  Rule  (a)  for  a simple  root, 
(b)  for  a double  root.  Make  sure  that  your  problems 
cover  all  three  Cases  I,  II,  III  (see  Sec.  2.2). 

20.  TEAM  PROJECT.  Extensions  of  the  Method  of 
Undetermined  Coefficients,  (a)  Extend  the  method 
to  products  of  the  function  in  Table  2.1,  (b)  Extend 
the  method  to  Euler-Cauchy  equations.  Comment  on 
the  practical  significance  of  such  extensions. 


2.8  Modeling:  Forced  Oscillations.  Resonance 

In  Sec.  2.4  we  considered  vertical  motions  of  a mass-spring  system  (vibration  of  a mass 
m on  an  elastic  spring,  as  in  Figs.  33  and  53)  and  modeled  it  by  the  homogeneous  linear 
ODE 


(1)  my"  + cy  + ky  = 0. 

Here  y(t)  as  a function  of  time  t is  the  displacement  of  the  body  of  mass  m from  rest. 

The  mass-spring  system  of  Sec.  2.4  exhibited  only  free  motion.  This  means  no  external 
forces  (outside  forces)  but  only  internal  forces  controlled  the  motion.  The  internal  forces 
are  forces  within  the  system.  They  are  the  force  of  inertia  my' , the  damping  force  cy 
(if  c > 0),  and  the  spring  force  ky,  a restoring  force. 


Fig.  53.  Mass  on  a spring 
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We  now  extend  our  model  by  including  an  additional  force,  that  is,  the  external  force 
r(t),  on  the  right.  Then  we  have 

(2*)  my”  + cy  + ky  = r(t). 

Mechanically  this  means  that  at  each  instant  t the  resultant  of  the  internal  forces  is  in 
equilibrium  with  r(t).  The  resulting  motion  is  called  a forced  motion  with  forcing  function 
r(t).  which  is  also  known  as  input  or  driving  force,  and  the  solution  y(t)  to  be  obtained 
is  called  the  output  or  the  response  of  the  system  to  the  driving  force. 

Of  special  interest  are  periodic  external  forces,  and  we  shall  consider  a driving  force 
of  the  form 


r(f)  = F0  cos  cot  ( F0  > 0,  co  > 0). 

Then  we  have  the  nonhomogeneous  ODE 
(2)  my”  + cy'  + ky  = Fq  cos  cot. 

Its  solution  will  reveal  facts  that  are  fundamental  in  engineering  mathematics  and  allow 
us  to  model  resonance. 


Solving  the  Nonhomogeneous  ODE  (2) 

From  Sec.  2.7  we  know  that  a general  solution  of  (2)  is  the  sum  of  a general  solution 
of  the  homogeneous  ODE  (1)  plus  any  solution  yp  of  (2).  To  find  yp,  we  use  the  method 
of  undetermined  coefficients  (Sec.  2.7),  starting  from 

(3)  yP(t)  = a cos  cot  + b sin  cot. 

By  differentiating  this  function  (chain  rule!)  we  obtain 

y'p  = —a>a  sin  cot  + cob  cos  cot, 
yp  = —co2 a cos  cot  — co2b  sin  cot. 

Substituting  yp,  yp,  and  y"p  into  (2)  and  collecting  the  cosine  and  the  sine  terms,  we  get 

[(£  — mco2)a  + cocb]  cos  cot  + [—coca  + (k  — mco2)b\  sin  cot  = F0  cos  cot. 

The  cosine  terms  on  both  sides  must  be  equal,  and  the  coefficient  of  the  sine  term 
on  the  left  must  be  zero  since  there  is  no  sine  term  on  the  right.  This  gives  the  two 
equations 


( k — mco2)a  + cocb  = F0 
2 

—coca  + (k  — mco  )b  = 0 


(4) 
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for  determining  the  unknown  coefficients  a and  b.  This  is  a linear  system.  We  can  solve 
it  by  elimination.  To  eliminate  b,  multiply  the  first  equation  by  k — mco2  and  the  second 
by  — coc  and  add  the  results,  obtaining 

(, k — mo)2)za  + co2c2a  = Ffk  — mco2). 

Similarly,  to  eliminate  a,  multiply  (the  first  equation  by  coc  and  the  second  by  k — mco2 
and  add  to  get 


co2c2b  + (k  — mco2)2b  = F0coc. 

If  the  factor  ( k — mco2)2  + co2c2  is  not  zero,  we  can  divide  by  this  factor  and  solve  for  a 
and  b, 


a = F0 


k — mco 

/ j 2\2  I 2 2 5 

{k  — JflCO  ) CO  C 


b = F0 


/;  2\2  i 2 2 

(k  — mco  ) + (o  c 


If  we  set  V k/m  = coq  (>  0)  as  in  Sec.  2.4,  then  k = mco o and  we  obtain 


(5) 


a = F, 


m(cOo  co2) 


0 2,2  2,2  , 2 2’ 
m (co o — co  ) + co  c 


b = F, 


0 2,  2 2,2  , 2 2' 
m (co o — co  ) + co  c 


We  thus  obtain  the  general  solution  of  the  nonhomogeneous  ODE  (2)  in  the  form 


(6)  y(t)  = yh(t)  + yp(t). 

Here  y;t  is  a general  solution  of  the  homogeneous  ODE  (1)  and  yp  is  given  by  (3)  with 
coefficients  (5). 

We  shall  now  discuss  the  behavior  of  the  mechanical  system,  distinguishing  between 
the  two  cases  c = 0 (no  damping)  and  c > 0 (damping).  These  cases  will  correspond  to 
two  basically  different  types  of  output. 


Case  1.  Undamped  Forced  Oscillations.  Resonance 

If  the  damping  of  the  physical  system  is  so  small  that  its  effect  can  be  neglected  over  the 
time  interval  considered,  we  can  set  c = 0.  Then  (5)  reduces  to  a = F0/[/7i(<wo  ~ C’J2)] 
and  b = 0.  Hence  (3)  becomes  (use  co02  = k/m) 


(7) 


yP(t) 


F0 

„ „ cos  cot  = 

m(coQ  — co2) 


Fo 

„ cos  cot. 

k[  1 - (co/a)0 )2] 


Here  we  must  assume  that  co2  + oj02;  physically,  the  frequency  co/(2tt)  [cycles/sec]  of 
the  driving  force  is  different  from  the  natural  frequency  co0/(2tt)  of  the  system,  which  is 
the  frequency  of  the  free  undamped  motion  [see  (4)  in  Sec.  2.4].  From  (7)  and  from  (4*) 
in  Sec.  2.4  we  have  the  general  solution  of  the  “undamped  system” 


F0 

(8)  y(t)  = C cos  (&)0f  — 8)  H z — cos  cot. 

m(a>Q  — co2) 

We  see  that  this  output  is  a superposition  of  two  harmonic  oscillations  of  the  frequencies 
just  mentioned. 
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Resonance.  We  discuss  (7).  We  see  that  the  maximum  amplitude  of  yp  is  (put  cos  cot  = 1) 


(9) 


where 


1 


1 - (co/coo) 


2 ■ 


a()  depends  on  at  and  &>o-  If  at  —*  at o,  then  p and  Oq  tend  to  infinity.  This  excitation  of  large 
oscillations  by  matching  input  and  natural  frequencies  (oj  = <u0)  is  called  resonance,  p is 
called  the  resonance  factor  (Fig.  54),  and  from  (9)  we  see  that  p/k  = uq/ Fq  is  the  ratio 
of  the  amplitudes  of  the  particular  solution  yp  and  of  the  input  Fq  cos  cot.  We  shall  see 
later  in  this  section  that  resonance  is  of  basic  importance  in  the  study  of  vibrating  systems. 

In  the  case  of  resonance  the  nonhomogeneous  ODE  (2)  becomes 


(10) 


n . 2 

y + u)0y 


F0 

— COS  COnt. 

vn  ^ 


Then  (7)  is  no  longer  valid,  and,  from  the  Modification  Rule  in  Sec.  2.7,  we  conclude  that 
a particular  solution  of  (10)  is  of  the  form 

yp(t)  = t(a  cos  co0t  + b sin  co0t). 


1 

1 

/ 1 
1 

y i 

i 

, i 

% 

i / 
i 
i 
i 

1 1 

C 0 

Fig.  54.  Resonance  factor  p[w) 


By  substituting  this  into  (10)  we  find  a = 0 and  b = F0/(2mco0).  Hence  (Fig.  55) 


(ID 


yP(t) 


Fq 

2/770)0 


t sin  co0t. 


Fig.  55.  Particular  solution  in  the  case  of  resonance 

We  see  that,  because  of  the  factor  t,  the  amplitude  of  the  vibration  becomes  larger  and 
larger.  Practically  speaking,  systems  with  very  little  damping  may  undergo  large  vibrations 
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that  can  destroy  the  system.  We  shall  return  to  this  practical  aspect  of  resonance  later  in 
this  section. 

Beats.  Another  interesting  and  highly  important  type  of  oscillation  is  obtained  if  co  is 
close  to  coo.  Take,  for  example,  the  particular  solution  [see  (8)] 


(12) 


y(t) 


Fq 

tnio^Q  w2) 


(cos  wt  — cos  wot) 


Using  (12)  in  App.  3.1,  we  may  write  this  as 


y(t) 


2F0 

m{w  o — w2) 


COq  + CO 


t sin 


(w  A w0). 


Since  w is  close  to  w0,  the  difference  w0  — w is  small.  Hence  the  period  of  the  last  sine 
function  is  large,  and  we  obtain  an  oscillation  of  the  type  shown  in  Fig.  56,  the  dashed 
curve  resulting  from  the  first  sine  factor.  This  is  what  musicians  are  listening  to  when 
they  tune  their  instruments. 


Fig.  56.  Forced  undamped  oscillation  when  the  difference  of  the  input 
and  natural  frequencies  is  small  (“beats”) 


Case  2.  Damped  Forced  Oscillations 

If  the  damping  of  the  mass-spring  system  is  not  negligibly  small,  we  have  c > 0 and 
a damping  term  cy  in  (1)  and  (2).  Then  the  general  solution  y h of  the  homogeneous 
ODE  (1)  approaches  zero  as  t goes  to  infinity,  as  we  know  from  Sec.  2.4.  Practically, 
it  is  zero  after  a sufficiently  long  time.  Hence  the  “transient  solution”  (6)  of  (2), 
given  by  y = yn  + vp,  approaches  the  “steady-state  solution”  yp.  This  proves  the 
following. 


THEOREM  1 


Steady-State  Solution 

After  a sufficiently  long  time  the  output  of  a damped  vibrating  system  under  a purely 
sinusoidal  driving  force  [see  (2)]  will  practically  be  a harmonic  oscillation  whose 
frequency  is  that  of  the  input. 
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Amplitude  of  the  Steady-State  Solution.  Practical  Resonance 

Whereas  in  the  undamped  case  the  amplitude  of  yp  approaches  infinity  as  oj  approaches 
co0,  this  will  not  happen  in  the  damped  case.  In  this  case  the  amplitude  will  always  be 
finite.  But  it  may  have  a maximum  for  some  co  depending  on  the  damping  constant  c. 
This  may  be  called  practical  resonance.  It  is  of  great  importance  because  if  c is  not  too 
large,  then  some  input  may  excite  oscillations  large  enough  to  damage  or  even  destroy 
the  system.  Such  cases  happened,  in  particular  in  earlier  times  when  less  was  known  about 
resonance.  Machines,  cars,  ships,  airplanes,  bridges,  and  high-rising  buildings  are  vibrating 
mechanical  systems,  and  it  is  sometimes  rather  difficult  to  find  constructions  that  are 
completely  free  of  undesired  resonance  effects,  caused,  for  instance,  by  an  engine  or  by 
strong  winds. 

To  study  the  amplitude  of  yp  as  a function  of  co,  we  write  (3)  in  the  form 
(13)  yp(t)  = C*  cos  (cot  — 77). 

C*  is  called  the  amplitude  of  yp  and  77  the  phase  angle  or  phase  lag  because  it  measures 
the  lag  of  the  output  behind  the  input.  According  to  (5),  these  quantities  are 


(14) 


C*(«)  = Va2  + b2  = 


Fo 


'\Zm2(co2  — co2)2  + 


2 2’ 
co  c 


, . b wc 

tan  77  (co)  = - = 2 2~  ■ 

a m(co 0 — co  ) 


Let  us  see  whether  C*(co)  has  a maximum  and,  if  so,  find  its  location  and  then  its  size. 
We  denote  the  radicand  in  the  second  root  in  C*  by  R.  Equating  the  derivative  of  C*  to 
zero,  we  obtain 


^ = Fo(-|fl-3/2)[2m2(a>i  - co2)(-2co)  + 2 coc2]. 

The  expression  in  the  brackets  [.  . .]  is  zero  if 

(15)  c2  = 2w72(a>o  — co2)  (ojf)  = k/m). 

By  reshuffling  terms  we  have 

2m2  co2  = 2m2a>o2  — c2  = 2m  k — c2. 

The  right  side  of  this  equation  becomes  negative  if  c2  > 2mk,  so  that  then  (15)  has  no 
real  solution  and  C*  decreases  monotone  as  co  increases,  as  the  lowest  curve  in  Fig.  57 
shows.  If  c is  smaller,  c2  < 2 ink,  then  (15)  has  a real  solution  co  = comax,  where 

c2 

(15*)  wmax  — w0  — ~ 2' 


From  (15*)  we  see  that  this  solution  increases  as  c decreases  and  approaches  coo  as  c 
approaches  zero.  See  also  Fig.  57. 
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The  size  of  C*(wmax)  is  obtained  from  (14),  with  co1 2 *  = <u2lax  given  by  (15*).  For  this 
co2  we  obtain  in  the  second  radicand  in  (14)  from  (15*) 

m\(4  ~ "Lx)2  = 7-7  and  «2ma xc2  = ~ 7-7)  c2. 

4 in  \ 2 m J 

The  sum  of  the  right  sides  of  these  two  formulas  is 

(c4  + 4m 2u>qc2  - 2c4)/ (4m2)  = c2(4m2«o  - c2)/(4m2). 

Substitution  into  (14)  gives 


(16) 


C*Kax) 


2mF0 

A /2  2 2 2 ' 

c V 4m  wq  — c 


We  see  that  C*(wmax)  is  always  finite  when  c > 0.  Furthermore,  since  the  expression 

c24m2wo  — c4  = c2(4mk  — c2) 

in  the  denominator  of  (16)  decreases  monotone  to  zero  as  c2  «2mk)  goes  to  zero,  the  maximum 
amplitude  (16)  increases  monotone  to  infinity,  in  agreement  with  our  result  in  Case  1.  Figure  57 
shows  the  amplification  C*/F0  (ratio  of  the  amplitudes  of  output  and  input)  as  a function  of 
co  for  m = 1 , k = 1 , hence  <u0  = 1 , and  various  values  of  the  damping  constant  c. 

Figure  58  shows  the  phase  angle  (the  lag  of  the  output  behind  the  input),  which  is  less 
than  77/2  when  co  < co0 , and  greater  than  77/2  for  co  > co0. 


Fig.  57.  Amplification  C*/F0  as  a function  of  Fig.  58.  Phase  lag  tj  as  a function  of  co  for 
co  for  m = 1,  k = 1,  and  various  values  of  the  m = 1,  k = 1,  thus  w0  = 1,  and  various  values 
damping  constant  c of  the  damping  constant  c 


1.  WRITING  REPORT.  Free  and  Forced  Vibrations. 

Write  a condensed  report  of  2-3  pages  on  the  most 
important  similarities  and  differences  of  free  and  forced 
vibrations,  with  examples  of  your  own.  No  proofs. 

2.  Which  of  Probs.  1-18  in  Sec.  2.7  (with  x = time  t) 

can  be  models  of  mass-spring  systems  with  a harmonic 

oscillation  as  steady-state  solution? 


3-7 


STEADY-STATE  SOLUTIONS 


Find  the  steady-state  motion  of  the  mass-spring  system 
modeled  by  the  ODE.  Show  the  details  of  your  work. 

3.  y"  + 6y'  + 8y  = 42.5  cos  2t 

4.  y"  + 2.5y,  + 10y  = —13.6  sin  4? 

5.  (D2  + D + 4.25 1)y  = 22.1  cos  4.5f 
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6.  ( D 2 + A D + 3 1)y  = cos  t + \ cos  3 / 

7.  (4D2  + 12D  + 9 1)y  = 225  - 75  sin  3/ 


8-15 


TRANSIENT  SOLUTIONS 


Find  the  transient  motion  of  the  mass-spring  system 
modeled  by  the  ODE.  Show  the  details  of  your  work. 

8.  2y"  + 4y;  + 6.5y  = 4 sin  1.5/ 

9.  y"  + 3y'  + 3.25y  = 3 cos  t — 1.5  sin  / 

10.  y"  + 16y  = 56  cos  At 

11.  ( D 2 + 2 1)y  = cos  V2 / + sinV2/ 

12.  (D2  + 2D  + 5 l)y  = 4 cos  / + 8 sin  / 

13.  ( D 2 + l)y  = cos  to/,  w2  f 1 

14.  (D2  + I )y  = 5c_t  cos  / 

15.  ( D 2 + AD  + 8 1)y  = 2 cos  2/  + sin  2/ 


16-20 


INITIAL  VALUE  PROBLEMS 


Find  the  motion  of  the  mass-spring  system  modeled  by  the 
ODE  and  the  initial  conditions.  Sketch  or  graph  the  solution 
curve.  In  addition,  sketch  or  graph  the  curve  of  y — yp  to 
see  when  the  system  practically  reaches  the  steady  state. 


16.  y"  + 25_v  = 24  sin  /,  v(0)  =1,  y'(0)  = 1 

17.  (D2  + 4/jy  = sin  f + g sin  3/  + § sin  5/, 
v(0)  = 0,  y'(  0)  = i 

18.  (D2  + 8 D + 17/)y  = 474.5  sin  0.5/,  y(0)  = -5.4, 
v'(0)  = 9.4 

19.  (D2  + 2 D + 2 1)y  = e~t/2  sin^/,  y(0)  = 0, 
v'(0)  = 1 

20.  (D2  + 5I)y  — cos  77/  — sin  777,  y(0)  = 0,  y'(0)  = 0 


21.  Beats.  Derive  the  formula  after  (12)  from  (12).  Can 
we  have  beats  in  a damped  system? 

22.  Beats.  Solve  y"  + 25y  = 99  cos  4.9/,  y(0)  = 2, 
)/(0)  = 0.  How  does  the  graph  of  the  solution  change 
if  you  change  (a)  y(0),  (b)  the  frequency  of  the  driving 
force? 


23.  TEAM  EXPERIMENT.  Practical  Resonance. 

(a)  Derive,  in  detail,  the  crucial  formula  (16). 

(b)  By  considering  dC*/dc  show  that  C*(tomax)  in- 
creases as  c (S  V2 mk)  decreases. 

(c)  Illustrate  practical  resonance  with  an  ODE  of  your 
own  in  which  you  vary  c,  and  sketch  or  graph 
corresponding  curves  as  in  Fig.  57. 

(d)  Take  your  ODE  with  c fixed  and  an  input  of  two 
terms,  one  with  frequency  close  to  the  practical 
resonance  frequency  and  the  other  not.  Discuss  and 
sketch  or  graph  the  output. 

(e)  Give  other  applications  (not  in  the  book)  in  which 
resonance  is  important. 


24.  Gun  barrel.  Solve  y"  + y = 1 — /2/7T2  if  0 S 
/ S 77  and  0 if  / — » oo;  here,  y(0)  = 0,  y'(0)  = 0.  This 
models  an  undamped  system  on  which  a force  F acts 
during  some  interval  of  time  (see  Fig.  59),  for  instance, 
the  force  on  a gun  barrel  when  a shell  is  fired,  the  barrel 
being  braked  by  heavy  springs  (and  then  damped  by  a 
dashpot,  which  we  disregard  for  simplicity).  Hint:  At  77 
both  y and  y'  must  be  continuous. 


m - 1 k = 1 

m=JWh 


2 

0 


t 


Fig.  59.  Problem  24 

25.  CAS  EXPERIMENT.  Undamped  Vibrations. 

(a)  Solve  the  initial  value  problem  y"  + y = cos  tot, 
to2  A l,y(0)  = 0,y'(0)  = 0.  Show  that  the  solution 
can  be  written 

2 i , 

y(l)  = ^ srn  |2  (I  + to)/]  srn  [g  (1  - to)/]. 

1 — to 

(b)  Experiment  with  the  solution  by  changing  to  to 
see  the  change  of  the  curves  from  those  for  small 
to  (>0)  to  beats,  to  resonance,  and  to  large  values  of 
to  (see  Fig.  60). 


m = 0.2 


to  = 6 

Fig.  60.  Typical  solution  curves  in  CAS  Experiment  25 
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2.*  Modeling:  Electric  Circuits 

Designing  good  models  is  a task  the  computer  cannot  do.  Hence  setting  up  models  has 
become  an  important  task  in  modern  applied  mathematics.  The  best  way  to  gain  experience 
in  successful  modeling  is  to  carefully  examine  the  modeling  process  in  various  fields  and 
applications.  Accordingly,  modeling  electric  circuits  will  be  profitable  for  all  students, 
not  just  for  electrical  engineers  and  computer  scientists. 

Figure  61  shows  an  A/X-circuit,  as  it  occurs  as  a basic  building  block  of  large  electric 
networks  in  computers  and  elsewhere.  An  ALC-circuit  is  obtained  from  an  AL-circuit  by 
adding  a capacitor.  Recall  Example  2 on  the  AL-circuit  in  Sec.  1.5:  The  model  of  the 
AL-circuit  is  Ll'  + RI  = E(t).  It  was  obtained  by  KVL  (Kirchhoff  s Voltage  Law)7  by 
equating  the  voltage  drops  across  the  resistor  and  the  inductor  to  the  EMF  (electromotive 
force).  Hence  we  obtain  the  model  of  the  ALC-circuit  simply  by  adding  the  voltage  drop 
QfC  across  the  capacitor.  Here,  C F (farads)  is  the  capacitance  of  the  capacitor.  Q coulombs 
is  the  charge  on  the  capacitor,  related  to  the  current  by 


m 


dQ 

dt  ’ 


equivalently 


Q(f ) 


/(?)  dt. 


See  also  Fig.  62.  Assuming  a sinusoidal  EMF  as  in  Fig.  61,  we  thus  have  the  model  of 
the  ALC-circuit 


C 


E(t ) = Eq  sin(»£ 

Fig.  61.  REC-circuit 


Name 

Symbol 

Notation 

Unit 

Voltage  Drop 

Ohm’s  Resistor 

Inductor 

Capacitor 

— )(— 

R 

L 

C 

Ohm’s  Resistance 

Inductance 

Capacitance 

ohms  (fl) 
henrys (H) 
farads  (F) 

RI 
T dl 
L dt 
QIC 

Fig.  62.  Elements  in  an  REC-circuit 


7GUSTAV  ROBERT  KIRCHHOFF  (1824-1887),  German  physicist.  Later  we  shall  also  need  Kirchhoff’s 
Current  Law  (KCL): 

At  any  point  of  a circuit,  the  sum  of  the  itiflowing  currents  is  equal  to  the  sum  of  the  outflowing  currents. 

The  units  of  measurement  of  electrical  quantities  are  named  after  ANDRE  MARIE  AMPERE  (1775-1836), 
French  physicist,  CHARLES  AUGUSTIN  DE  COULOMB  (1736-1806),  French  physicist  and  engineer, 
MICHAEL  FARADAY  (1791-1867),  English  physicist,  JOSEPH  HENRY  (1797-1878),  American  physicist, 
GEORG  SIMON  OHM  (1789-1854),  German  physicist,  and  ALESSANDRO  VOLTA  (1745-1827),  Italian 
physicist. 
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(1') 


LI'  + RI  + 


J_  ' 
C. 


Idt  = E(t) 


Eq  sin  cot. 


This  is  an  “integro-differential  equation.”  To  get  rid  of  the  integral,  we  differentiate  (1 ') 
with  respect  to  t,  obtaining 


(1)  Li"  + Rl'  + ^ / = E'(t ) = E0co  cos  cot. 

This  shows  that  the  current  in  an  A7X’-circuit  is  obtained  as  the  solution  of  this 
nonhomogeneous  second-order  ODE  (1)  with  constant  coefficients. 

In  connection  with  initial  value  problems,  we  shall  occasionally  use 

(1")  LQ"  + RQ"  + = E(t), 

obtained  from  (lr)  and  / = Q' . 


Solving  the  ODE  (1)  for  the  Current  in  an  RLC- Circuit 

A general  solution  of  (1)  is  the  sum  1=1},  + Ip,  where  Iy,  is  a general  solution  of  the 
homogeneous  ODE  corresponding  to  (1)  and  Ip  is  a particular  solution  of  (1).  We  first 
determine  Ip  by  the  method  of  undetermined  coefficients,  proceeding  as  in  the  previous 
section.  We  substitute 

(2)  7p  = a cos  cot  + b sin  cot 

Ip  = co(—a  sin  cot  + b cos  cot) 

Ip  = coz(—a  cos  cot  — b sin  cot ) 

into  (1).  Then  we  collect  the  cosine  terms  and  equate  them  to  E0co  cos  cot  on  the  right, 
and  we  equate  the  sine  terms  to  zero  because  there  is  no  sine  term  on  the  right, 

Leo2 {—a)  + Rcob  + a/ C = E0co  (Cosine  terms) 

Lcoz(—b)  + Rco(—a)  + b/C  = 0 (Sine  terms). 

Before  solving  this  system  for  a and  b,  we  first  introduce  a combination  of  L and  C,  called 

the  reactance 


(3) 


S = 


coL 


1 

coC 


Dividing  the  previous  two  equations  by  co,  ordering  them,  and  substituting  S gives 

— Sa  + Rb  = E0 


—Ra  — Sb  = 0. 
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We  now  eliminate  b by  multiplying  the  first  equation  by  S and  the  second  by  R,  and 
adding.  Then  we  eliminate  a by  multiplying  the  first  equation  by  R and  the  second  by 
—S,  and  adding.  This  gives 

-(S2  + R2)a  = E0S,  (R2  + S2)b  = E0R. 

We  can  solve  for  a and  b, 


(4) 


EpS  _ E0R 

R2  + S2’  R2  + S2' 


Equation  (2)  with  coefficients  a and  b given  by  (4)  is  the  desired  particular  solution  Ip  of 
the  nonhomogeneous  ODE  (1)  governing  the  current  I in  an  ATC-circuit  with  sinusoidal 
electromotive  force. 

Using  (4),  we  can  write  Ip  in  terms  of  “physically  visible”  quantities,  namely,  amplitude 
I0  and  phase  lag  9 of  the  current  behind  the  EMF,  that  is, 


(5)  Ip(t ) = I0  sin  ( cot  — 0) 

where  [see  (14)  in  App.  A3.1] 


Iq  = Va2  + b2  = — , ° , tan  9 = — — = — . 

VR2  + S2  b R 

The  quantity  X'R2  + .S'2  is  called  the  impedance.  Our  formula  shows  that  the  impedance 
equals  the  ratio  Eq/Iq.  This  is  somewhat  analogous  to  E/I  = R (Ohm’s  law)  and,  because 
of  this  analogy,  the  impedance  is  also  known  as  the  apparent  resistance. 

A general  solution  of  the  homogeneous  equation  corresponding  to  (1)  is 

4 = OeA,t  + c2e^ 


where  Ai  and  A2  are  the  roots  of  the  characteristic  equation 


A2  + - A + — = 0. 
L LC 


We  can  write  these  roots  in  the  form  A]  = —a  + fi  and  A2  = —a  — /3,  where 


a = 


R 

2 V 


P 


4L 
C ' 


Now  in  an  actual  circuit,  R is  never  zero  (hence  R > 0).  From  this  it  follows  that  ly, 
approaches  zero,  theoretically  as  t — > 00,  but  practically  after  a relatively  short  time.  Hence 
the  transient  current  / = + Ip  tends  to  the  steady-state  current  Ip,  and  after  some  time 

the  output  will  practically  be  a harmonic  oscillation,  which  is  given  by  (5)  and  whose 
frequency  is  that  of  the  input  (of  the  electromotive  force). 
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EXAMPLE  1 


RLC-  Circuit 

Find  the  current  I(t)  in  an  /?LC-circuit  with  R = 11  Ll  (ohms),  L = 0.1  H (henry),  C = 10-2F  (farad),  which 
is  connected  to  a source  of  EMF  E(t ) = 1 10  sin  (60  • 2777)  = 1 10  sin  377  t (hence  60  Hz  = 60  cycles/  sec,  the 
usual  in  the  U.S.  and  Canada;  in  Europe  it  would  be  220  V and  50  Hz).  Assume  that  current  and  capacitor 
charge  are  0 when  t = 0. 

Solution.  Step  1.  General  solution  of  the  homogeneous  ODE.  Substituting  R,  L , C and  the  derivative  E\t) 
into  (1),  we  obtain 


0.1/"  + 11  /'  + 100/  = 110  • 377  cos  yilt. 

Hence  the  homogeneous  ODE  is  0.1/"  + 11/  + 100/  = 0.  Its  characteristic  equation  is 

0.1  A2  + 11A  + 100  = 0. 

The  roots  are  Ai  = —10  and  A2  = —100.  The  corresponding  general  solution  of  the  homogeneous  ODE  is 

W)  - ci e + c2e 

Step  2.  Particular  solution  Ip  of  (1).  We  calculate  the  reactance  S = 37.7  — 0.3  = 37.4  and  the  steady-state 
current 


/p(0  = a cos  377 1 + b sin  377 1 
with  coefficients  obtained  from  (4)  (and  rounded) 


_ -110  • 37.4  _ 
ll2  + 37.42 


110  ■ 11 
ll2  + 37.42 


0.796. 


Hence  in  our  present  case,  a general  solution  of  the  nonhomogeneous  ODE  (1)  is 

(6)  /(/)  = Cie“10t  + C2e~100t  - 2.71  cos  377 / + 0.796  sin  377/. 

Step  3.  Particular  solution  satisfying  the  initial  conditions.  How  to  use  (7(0)  = 0?  We  finally  determine  C\ 
and  c2  from  the  in  initial  conditions  /(0)  = 0 and  Q( 0)  = 0.  From  the  first  condition  and  (6)  we  have 

(7)  1(0)  = ci  + c2  - 2.71  = 0,  hence  c2  = 2.71  - cv 

We  turn  to  (7(0)  = 0.  The  integral  in  (1  ) equals  J I dt  = (7(0;  see  near  the  beginning  of  this  section.  Hence  for 
t = 0,  Eq.  (1  ) becomes 

Ll'  (0)  + R • 0 = 0,  so  that  /'(0)  = 0. 

Differentiating  (6)  and  setting  / = 0,  we  thus  obtain 

/'( 0)  = -10ci  - 100c2  + 0 + 0.796  • 377  = 0,  hence  by  (7),  -10ci  = 100(2.71  - cf)  - 300.1. 

The  solution  of  this  and  (7)  is  Ci  = —0.323,  c2  = 3.033.  Hence  the  answer  is 

/(f)  = — 0.323e-lot  + 3.033e-loot  - 2.71  cos  377 1 + 0.796  sin  377/. 

You  may  get  slightly  different  values  depending  on  the  rounding.  Figure  63  shows  I(t ) as  well  as  Ip(t),  which 
practically  coincide,  except  for  a very  short  time  near  t = 0 because  the  exponential  terms  go  to  zero  very  rapidly. 
Thus  after  a very  short  time  the  current  will  practically  execute  harmonic  oscillations  of  the  input  frequency 
60  Hz  = 60  cycles/ sec.  Its  maximum  amplitude  and  phase  lag  can  be  seen  from  (5),  which  here  takes  the  form 


Ip(t)  = 2.824  sin  (377 1 - 1.29). 
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Fig.  63.  Transient  (upper  curve)  and  steady-state  currents  in  Example  1 


Analogy  of  Electrical  and  Mechanical  Quantities 

Entirely  different  physical  or  other  systems  may  have  the  same  mathematical  model. 
For  instance,  we  have  seen  this  from  the  various  applications  of  the  ODE  y = ky  in 
Chap.  1 . Another  impressive  demonstration  of  this  unifying  power  of  mathematics  is 
given  by  the  ODE  (1)  for  an  electric  RLC- circuit  and  the  ODE  (2)  in  the  last  section  for 
a mass-spring  system.  Both  equations 

Li"  + Rl'  + —I  = £o<y  cos  cot  and  my"  + cy  + ky  = Focos  cot 

are  of  the  same  form.  Table  2.2  shows  the  analogy  between  the  various  quantities  involved. 
The  inductance  L corresponds  to  the  mass  m and,  indeed,  an  inductor  opposes  a change 
in  current,  having  an  “inertia  effect”  similar  to  that  of  a mass.  The  resistance  R corresponds 
to  the  damping  constant  c,  and  a resistor  causes  loss  of  energy,  just  as  a damping  dashpot 
does.  And  so  on. 

This  analogy  is  strictly  quantitative  in  the  sense  that  to  a given  mechanical  system  we 
can  construct  an  electric  circuit  whose  current  will  give  the  exact  values  of  the  displacement 
in  the  mechanical  system  when  suitable  scale  factors  are  introduced. 

The  practical  importance  of  this  analogy  is  almost  obvious.  The  analogy  may  be  used 
for  constructing  an  “electrical  model”  of  a given  mechanical  model,  resulting  in  substantial 
savings  of  time  and  money  because  electric  circuits  are  easy  to  assemble,  and  electric 
quantities  can  be  measured  much  more  quickly  and  accurately  than  mechanical  ones. 


Table  2.2  Analogy  of  Electrical  and  Mechanical  Quantities 


Electrical  System 

Mechanical  System 

Inductance  L 

Mass  m 

Resistance  R 

Damping  constant  c 

Reciprocal  1/C  of  capacitance 

Spring  modulus  k 

Derivative  E0co  cos  cot  of  | 
electromotive  force  j 

Driving  force  F0  cos  cot 

Current  I{t) 

Displacement  y(t) 
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Related  to  this  analogy  are  transducers,  devices  that  convert  changes  in  a mechanical 
quantity  (for  instance,  in  a displacement)  into  changes  in  an  electrical  quantity  that  can 
be  monitored;  see  Ref.  [GenRef  1 1 ] in  App.  1 . 


PRTrBTrE7^s^E~r~y;9 


RLC- CIRCUITS:  SPECIAL  CASES 

1.  RC-Circuit.  Model  the  RC- circuit  in  Fig.  64.  Find  the 
current  due  to  a constant  E. 

I ^ 

E(t) 


C 

Fig.  64.  RC-c  ircuit 

Current  lit) 

C 


t 

Fig.  65.  Current  1 in  Problem  1 


4.  RL-Circuit.  Solve  Prob.  3 when  E = E0  sin  cot  and  R, 
L,  E0,  and  are  arbitrary.  Sketch  a typical  solution. 


in  Problem  4 

5.  LC-Circuit.  This  is  an  RLC-circuit  with  negligibly 
small  R (analog  of  an  undamped  mass-spring  system). 
Find  the  current  when  L = 0.5  H,  C = 0.005  F,  and 
E = sin  t V,  assuming  zero  initial  current  and  charge. 


2.  RC-Circuit.  Solve  Prob.  1 when  E = Eq  sin  cot  and 
R,  C,  Eq,  and  to  are  arbitrary. 

3.  RL-Circuit.  Model  the  RL-circuit  in  Fig.  66.  Find  a 
general  solution  when  R,  L,  E are  any  constants.  Graph 
or  sketch  solutions  when  L = 0.25  FI,  R = 10  fi,  and 
£ = 48  V. 


R 


Fig.  66.  RL-circuit 


E(t) 

Fig.  69.  LC-circuit 


6.  LC-Gircuit.  Find  the  current  when  L = 0.5  H, 
C = 0.005  F,  E = 2 f 2 V,  and  initial  current  and  charge 
zero. 


7-18 


GENERAL  RLC-CIRCUITS 


7.  Tuning.  In  tuning  a stereo  system  to  a radio  station, 
we  adjust  the  tuning  control  (turn  a knob)  that  changes 
C (or  perhaps  L)  in  an  RLC-circuit  so  that  the  amplitude 
of  the  steady-state  current  (5)  becomes  maximum.  For 
what  C will  this  happen? 


8-14  Find  the  steady-state  current  in  the  RLC-circuit 
in  Fig.  61  for  the  given  data.  Show  the  details  of  your  work. 


8.  R = 4fl,L=0.5  H,C  = 0.1  F,  E = 500sin2fV 

9.  R = 4fl,L  = 0.1  H,  C = 0.05  F,  E = 110  V 
10.  R = 2 fi,  L = 1H,C  = ^F,£  = 157  sin  3?  V 
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11.  R = 12  n,  L = 0.4  H,  C = B5  F, 

E = 220  sin  10 /V 

12.  R = 0.2  fl,  L = 0.1  H,  C = 2 F,  E = 220  sin  314r  V 

13.  R = 12,  L = 1.2  H,  C = f • 10-3  F, 

E = 12,000  sin  25r  V 

14.  Prove  the  claim  in  the  text  that  if  R + 0 (hence  R > 0), 
then  the  transient  current  approaches  Ip  as  t — * °°. 

15.  Cases  of  damping.  What  are  the  conditions  for  an 
ULC-circuit  to  be  (I)  overdamped,  (II)  critically  damped, 
(III)  underdamped?  What  is  the  critical  resistance  f?cl-it 
(the  analog  of  the  critical  damping  constant  2 V/nL)? 

Solve  the  initial  value  problem  for  the  RLC- 
circuit  in  Fig.  61  with  the  given  data,  assuming  zero  initial 
current  and  charge.  Graph  or  sketch  the  solution.  Show  the 
details  of  your  work. 


16.  R = 8 H,  Z.  = 0.2  H,  C = 12.5  • 10-3  F, 

E = 100  sin  10/ V 

17.  R = 6 H,  L = 1 H,  C = 0.04  F, 

E = 600  (cos  / + 4 sin  f)  V 

18.  R = 18  12,  L = 1 H,  C = 12.5  • 10-3  F, 

E = 820  cos  10/  V 

19.  WRITING  REPORT.  Mechanic-Electric  Analogy. 

Explain  Table  2.2  in  a 1-2  page  report  with  examples, 
e.g.,  the  analog  (with  L = 1 H)  of  a mass-spring  system 
of  mass  5 kg,  damping  constant  10  kg/ sec,  spring  constant 
60  kg/ sec2,  and  driving  force  220  cos  10/  kg/sec. 

20.  Complex  Solution  Method.  Solve  Ll"  + rT'  + 
1/C  = Eoela>t,  i = V— 1,  by  substituting  Ip  = Kelalt 
( K unknown)  and  its  derivatives  and  taking  the  real 
part  Ip  of  the  solution  Ip.  Show  agreement  with  (2),  (4). 
Hint:  Use  (11)  ela>t  = cos  cot  + i sin  cot,  cf.  Sec.  2.2, 
and  i2  = — 1. 


2.10  Solution  by  Variation  of  Parameters 

We  continue  our  discussion  of  nonhomogeneous  linear  ODEs,  that  is 
(1)  y"  + p(x)y  + q(x)y  = r(x). 


In  Sec.  2.6  we  have  seen  that  a general  solution  of  (1)  is  the  sum  of  a general  solution  \yL 
of  the  corresponding  homogeneous  ODE  and  any  particular  solution  yp  of  (1).  To  obtain  yp 
when  r(x)  is  not  too  complicated,  we  can  often  use  the  method  of  undetermined  coefficients, 
as  we  have  shown  in  Sec.  2.7  and  applied  to  basic  engineering  models  in  Secs.  2.8  and  2.9. 

However,  since  this  method  is  restricted  to  functions  r(x ) whose  derivatives  are  of  a form 
similar  to  r(x)  itself  (powers,  exponential  functions,  etc.),  it  is  desirable  to  have  a method  valid 
for  more  general  ODEs  (1),  which  we  shall  now  develop.  It  is  called  the  method  of  variation 
of  parameters  and  is  credited  to  Lagrange  (Sec.  2.1).  Here  p,  q,  r in  (1)  may  be  variable 
(given  functions  of  x),  but  we  assume  that  they  are  continuous  on  some  open  interval  I. 

Lagrange’s  method  gives  a particular  solution  yp  of  (1)  on  / in  the  form 


(2) 


yP(x)  = -yi 


y2'\t  4- 

— ax  + y2 
W 


y-^dx 

W 


where  yi,  y2  form  a basis  of  solutions  of  the  corresponding  homogeneous  ODE 

(3)  y"  + p{x)y'  + q(x)y  = 0 
on  /,  and  W is  the  Wronskian  of  yi,  >’2, 

(4)  W = yiy2  - y2yi  (see  Sec.  2.6). 

CAUTION!  The  solution  formula  (2)  is  obtained  under  the  assumption  that  the  ODE 
is  written  in  standard  form,  with  y as  the  first  term  as  shown  in  (1).  If  it  starts  with 
f(x)y",  divide  first  by  f(x). 
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EXAMPLE  1 


The  integration  in  (2)  may  often  cause  difficulties,  and  so  may  the  determination  of 
y i , y 2 if  (1)  has  variable  coefficients.  If  you  have  a choice,  use  the  previous  method.  It  is 
simpler.  Before  deriving  (2)  let  us  work  an  example  for  which  you  do  need  the  new 
method.  (Try  otherwise.) 

Method  of  Variation  of  Parameters 

Solve  the  nonhomogeneous  ODE 


Solution.  A basis  of  solutions  of  the  homogeneous  ODE  on  any  interval  is  yi  = cos  x,  y%  = sin  x.  This  gives 
the  Wronskian 


W(yi,yz)  — cos  a cos  x — sinx(— sinx)  = 1. 

From  (2),  choosing  zero  constants  of  integration,  we  get  the  particular  solution  of  the  given  ODE 

yp  = —cos  A'| sin  x sec  xdx  + sin  x| cos  x sec  x dx 

= cos  x In  | cos  x\  + x sin  x (Fig.  70) 

Figure  70  shows  yp  and  its  first  term,  which  is  small,  so  that  x sin  x essentially  determines  the  shape  of  the  curve 
of  yp.  (Recall  from  Sec.  2.8  that  we  have  seen  x sin  x in  connection  with  resonance,  except  for  notation.)  From 
yp  and  the  general  solution  y h = Ciyi  + c^2  of  the  homogeneous  ODE  we  obtain  the  answer 

y = yh  + yp  — (ci  + In  |cos  jc|)  cos  x + (t£  + x)  sinx. 

Had  we  included  integration  constants  ~C\,  c 2 in  (2),  then  (2)  would  have  given  the  additional 
ci  cos  x + C2  sin  x = Ciyi  + that  is,  a general  solution  of  the  given  ODE  directly  from  (2).  This  will 

always  be  the  case. 


y 

10  - 


Fig.  70.  Particular  solution  yp  and  its  first  term  in  Example  1 


Idea  of  the  Method.  Derivation  of  (2) 

What  idea  did  Lagrange  have?  What  gave  the  method  the  name?  Where  do  we  use  the 
continuity  assumptions? 

The  idea  is  to  start  from  a general  solution 


yh(x)  = CiyyCx)  + c2y2W 
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of  the  homogeneous  ODE  (3)  on  an  open  interval  I and  to  replace  the  constants  (“the 
parameters”)  cq  and  c2  by  functions  u(x)  and  v(x);  this  suggests  the  name  of  the  method. 
We  shall  determine  it  and  v so  that  the  resulting  function 

(5)  yp(x)  = u(x)y1(x)  + v(x)y2(x) 

is  a particular  solution  of  the  nonhomogeneous  ODE  (1).  Note  that  y ^ exists  by  Theorem 
3 in  Sec.  2.6  because  of  the  continuity  of  p and  q on  I.  (The  continuity  of  r will  be  used 
later.) 

We  determine  u and  v by  substituting  (5)  and  its  derivatives  into  (1).  Differentiating  (5), 
we  obtain 


y'p  = m'.Vi  + uy[  + v'y2  + vy'2. 

Now  yp  must  satisfy  (1).  This  is  one  condition  for  two  functions  u and  v.  It  seems  plausible 
that  we  may  impose  a second  condition.  Indeed,  our  calculation  will  show  that  we  can 
determine  u and  v such  that  yp  satisfies  (1)  and  u and  v satisfy  as  a second  condition  the 
equation 

(6)  u'y1  + v'y2  = 0. 

This  reduces  the  first  derivative  yp  to  the  simpler  form 

(7)  yp  = uy  i + vy2. 

Differentiating  (7),  we  obtain 

(8)  yp  = u yx  + uyx  + v y2  + vy2 . 

We  now  substitute  yp  and  its  derivatives  according  to  (5),  (7),  (8)  into  (1).  Collecting 
terms  in  u and  terms  in  v,  we  obtain 

u(y"  + py'i  + qyi)  + V(y2  + py2  + qy2)  + u'y[  + v'y2  = r. 

Since  Vi  and  y2  are  solutions  of  the  homogeneous  ODE  (3),  this  reduces  to 

(9a)  u'y[  + v'y2  = r. 

Equation  (6)  is 

(9b)  u'y1  + v'y2  = 0. 

This  is  a linear  system  of  two  algebraic  equations  for  the  unknown  functions  it  and  v' . 
We  can  solve  it  by  elimination  as  follows  (or  by  Cramer’s  rule  in  Sec.  7.6).  To  eliminate 
v , we  multiply  (9a)  by  —y2  and  (9b)  by  y2  and  add,  obtaining 

u'(yiy2  - y2y{)  = “W,  thus  u'w  = -y2r. 

Here,  W is  the  Wronskian  (4)  of  _y  j , y2.  To  eliminate  u'  we  multiply  (9a)  by  yq,  and  (9b) 
by  — vi  and  add,  obtaining 
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v'(yiy2  ~ = ~yi r,  thus  v'W  = y± r. 

Since  yi,  v 2 form  a basis,  we  have  W A 0 (by  Theorem  2 in  Sec.  2.6)  and  can  divide  by  W, 


(10) 

f 

u 

1 

11 

! 

V = 

yir 

w ' 

By  integration, 

u = — 

. 

'yzr  , 

— dx, 
W 

V = 

. 

f ™dx. 
w 

These  integrals  exist  because  r(x)  is  continuous.  Inserting  them  into  (5)  gives  (2)  and 
completes  the  derivation. 


PR  OB  L EMSET  2 . 1 0 


GENERAL  SOLUTION 

Solve  the  given  nonhomogeneous  linear  ODE  by  variation 
of  parameters  or  undetermined  coefficients.  Show  the 
details  of  your  work. 

1.  y"  + 9y  = sec  3x 

2.  y"  + 9y  — esc  3x 

3.  x1 2y"  — 2 xy'  + 2y  = x3  sinre 

4.  y"  — Ay'  + 5y  = e2xcscx 

5.  y"  + y = cos  x — sin  x 

6.  (D2  + 6 D + 9 1)y  = \6e~3x/{x2  + 1) 

7.  ( D 2 - AD  + 41  )y  = 6e2x/x 4 

8.  ( D 2 + 41  )y  = cosh  2x 

9.  (D2  - 2D  + I)y  = 35x:t/2ex 
10.  ( D 2 + 2D  + 2/)_y  = 4e~xsecax 


11.  (x2D2  - AxD  + 6/)y  = 21x-4 

12.  (D2  - /)>■  = 1/cosh  jc 

13.  (x2D2  + xD  ~9I)y  = 48x5 6 

14.  TEAM  PROJECT.  Comparison  of  Methods.  Inven- 
tion. The  undetermined-coefficient  method  should  be 
used  whenever  possible  because  it  is  simpler.  Compare 
it  with  the  present  method  as  follows. 

(a)  Solve  y"  + Ay  + 3y  = 65  cos  2x  by  both  methods, 
showing  all  details,  and  compare. 

(b)  Solve  y"  — 2y'  + y = rj  + r2,  rj  = 35 x3^2exrz  — 
x2  by  applying  each  method  to  a suitable  function  on 
the  right. 

(c)  Experiment  to  invent  an  undetermined-coefficient 
method  for  nonhomogeneous  Euler-Cauchy  equations. 


CTTlflrPTE~R~2^R~EVTEW~QLJ  E S T I O N S AND  PROBLEMS 


1.  Why  are  linear  ODEs  preferable  to  nonlinear  ones  in 
modeling? 

2.  What  does  an  initial  value  problem  of  a second-order 
ODE  look  like?  Why  must  you  have  a general  solution 
to  solve  it? 

3.  By  what  methods  can  you  get  a general  solution  of  a 
nonhomogeneous  ODE  from  a general  solution  of  a 
homogeneous  one? 

4.  Describe  applications  of  ODEs  in  mechanical  systems. 
What  are  the  electrical  analogs  of  the  latter? 

5.  What  is  resonance?  How  can  you  remove  undesirable 
resonance  of  a construction,  such  as  a bridge,  a ship, 
or  a machine? 

6.  What  do  you  know  about  existence  and  uniqueness  of 
solutions  of  linear  second-order  ODEs? 


7-18 


GENERAL  SOLUTION 


Find  a general  solution.  Show  the  details  of  your  calculation. 

7.  Ay"  + 32/  + 63y  = 0 

8.  y"  + y — \2y  = 0 

9.  y"  + 6/  + 34y  = 0 

10.  y"  + 0.20/  + 0. 17y  = 0 

11.  (100D2  - 160D  + 64/ )y  = 0 

12.  (D2  + AttD  + 4T72/)y  = 0 

13.  {x2D2  + 2 xD  - 12 1)y  = 0 

14.  (x2D2  + xD  - 9 1)y  = 0 

15.  (2D2  - 3D  - 2 1)y  = 13  - 2x2 

16.  (D2  + 2D  + 2 1)y  = 3e_xcos  2x 

17.  (4D2  - 12D  + 9 1)y  = 2e15x 

18.  yy"  = 2y'2 


Summary  of  Chapter  2 
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INITIAL  VALUE  PROBLEMS 

Solve  the  problem,  showing  the  details  of  your  work. 

Sketch  or  graph  the  solution. 

19.  y"  + 16y  = \lex,  v(0)  = 6,  y'(0)  = -2 

20.  y"  — 3y'  + 2y  = lOsinx,  y(0)  = 1,  y'(0)  = —6 

21.  (x2D2  + xD  - I)y  = 16x3,  v(l)  = -1,  y'(l)  = 1 

22.  (x2D2  + 15 xD  + 49/  )y  = 0,  y(l)  = 2, 
v'(l)  = -11 

APPLICATIONS 

23.  Find  the  steady-state  current  in  the  RLC-circuit  in  Fig.  71 
when  R = 2 kft  (2000  ft),  L = 1 H,  C = 4 • 10-3  F,  and 
E = 110  sin  415?  V (66  cycles/sec). 

24.  Find  a general  solution  of  the  homogeneous  linear 
ODE  corresponding  to  the  ODE  in  Prob.  23. 

25.  Find  the  steady-state  current  in  the  RLC- circuit 
in  Fig.  71  when  R = 50  ft,  L = 30  H,  C = 0.025  F, 
E = 200  sin  4?  V. 


C 


E(t ) 

Fig.  71.  R/C-circuit 


26.  Find  the  current  in  the  RLC-circuit  in  Fig.  71 
when  R = 40  ft,  L = 0.4  H,  C = 10-4  F,  E = 
220  sin  314?  V (50  cycles/sec). 


27.  Find  an  electrical  analog  of  the  mass-spring  system 
with  mass  4 kg,  spring  constant  10  kg/ sec2,  damping 
constant  20  kg/sec,  and  driving  force  100  sin  4?  nt. 

28.  Find  the  motion  of  the  mass-spring  system  in  Fig.  72 
with  mass  0.125  kg,  damping  0,  spring  constant 
1.125  kg/sec2,  and  driving  force  cos  ? — 4 sin  ? nt,  ass- 
uming zero  initial  displacement  and  velocity.  For  what 
frequency  of  the  driving  force  would  you  get  resonance? 


Fig.  72.  Mass-spring  system 

29.  Show  that  the  system  in  Fig.  72  with  m = 4,  c = 0, 
k = 36,  and  driving  force  61  cos  3.1?  exhibits  beats. 
Hint:  Choose  zero  initial  conditions. 

30.  In  Fig.  72,  let  m = 1 kg,  c = 4kg/sec,  k = 24kg/sec2, 
and  r(t)  =10  cos  cot  nt.  Determine  w such  that  you 
get  the  steady-state  vibration  of  maximum  possible 
amplitude.  Determine  this  amplitude.  Then  find  the 
general  solution  with  this  co  and  check  whether  the  results 
are  in  agreement. 


SUMMA-RY  OT  CH  APTTR 

Second-Order  Linear  ODEs 


Second-order  linear  ODEs  are  particularly  important  in  applications,  for  instance, 
in  mechanics  (Secs.  2.4,  2.8)  and  electrical  engineering  (Sec.  2.9).  A second-order 
ODE  is  called  linear  if  it  can  be  written 

(1)  y"  + p(x)y  + q(x)y  = r(x ) (Sec.  2.1). 

(If  the  first  term  is,  say  ,f(x)y",  divide  by  fix)  to  get  the  “standard  form”  (1)  with 
y"  as  the  first  term.)  Equation  (1)  is  called  homogeneous  if  r(x)  is  zero  for  all  x 
considered,  usually  in  some  open  interval;  this  is  written  r(x)  = 0.  Then 

(2)  y"  + p(x)y'  + q(x)y  = 0. 

Equation  (1)  is  called  nonhomogeneous  if  r(x)  # 0 (meaning  r(x)  is  not  zero  for 
some  x considered). 
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For  the  homogeneous  ODE  (2)  we  have  the  important  superposition  principle  (Sec. 
2.1)  that  a linear  combination  y = kyi  + Zy2  of  two  solutions  yy,  y2  is  again  a solution. 

Two  linearly  independent  solutions  y1;  y2  of  (2)  on  an  open  interval  I form  a basis 
(or  fundamental  system)  of  solutions  on  I.  and  y = cqyi  + C2y2  with  arbitrary 
constants  ci,  c2  a general  solution  of  (2)  on  I.  From  it  we  obtain  a particular 
solution  if  we  specify  numeric  values  (numbers)  for  cy  and  c2,  usually  by  prescribing 
two  initial  conditions 

(3)  y(x0)  = K0,  y^xo)  = K1  (x0,  K0,  K1  given  numbers;  Sec.  2.1). 

(2)  and  (3)  together  form  an  initial  value  problem.  Similarly  for  (1)  and  (3). 

For  a nonhomogeneous  ODE  (1)  a general  solution  is  of  the  form 

(4)  y = yh  + yP  (Sec.  2.7). 

Flere  yy  is  a general  solution  of  (2)  and  yp  is  a particular  solution  of  (1).  Such  a yp 
can  be  determined  by  a general  method  ( variation  of  parameters.  Sec.  2.10)  or  in 
many  practical  cases  by  the  method  of  undetermined  coefficients . The  latter  applies 
when  (1)  has  constant  coefficients  p and  q,  and  r (x)  is  a power  of  x,  sine,  cosine, 
etc.  (Sec.  2.7).  Then  we  write  (1)  as 

(5)  y"  + ay'  + by  = r(x)  (Sec.  2.7). 

The  corresponding  homogeneous  ODE  y'  + ay'  + by  = 0 has  solutions  y = eAx, 
where  A is  a root  of 

(6)  A2  + aA  + b = 0. 


Flence  there  are  three  cases  (Sec.  2.2): 


Case 

Type  of  Roots 

General  Solution 

I 

Distinct  real  A1;  A2 

y = CleAlA'  + c2eA2A 

II 

Double  — g <7 

y = (ci  + c2x)e~ax'2 

III 

Complex  — 2a  ± ia>* 

y = e~ax^2(A  cos  <w*x  + B sin  «*x) 

Here  <w*  is  used  since  co  is  needed  in  driving  forces. 

Important  applications  of  (5)  in  mechanical  and  electrical  engineering  in  connection 
with  vibrations  and  resonance  are  discussed  in  Secs.  2.4,  2.7,  and  2.8. 

Another  large  class  of  ODEs  solvable  “algebraically”  consists  of  the  Euler-Cauchy 
equations 

(7)  x2y"  + axy'  + by  = 0 (Sec.  2.5). 

These  have  solutions  of  the  form  y = xm,  where  m is  a solution  of  the  auxiliary  equation 

(8)  m 2 + (a  — l)m  + b = 0. 

Existence  and  uniqueness  of  solutions  of  (1)  and  (2)  is  discussed  in  Secs.  2.6 
and  2.7,  and  reduction  of  order  in  Sec.  2.1. 


CHAPTER  3 


Higher  Order  Linear  ODEs 


The  concepts  and  methods  of  solving  linear  ODEs  of  order  n = 2 extend  nicely  to  linear 
ODEs  of  higher  order  n , that  is,  n = 3,  4,  etc.  This  shows  that  the  theory  explained  in 
Chap.  2 for  second-order  linear  ODEs  is  attractive,  since  it  can  be  extended  in  a 
straightforward  way  to  arbitrary  n.  We  do  so  in  this  chapter  and  notice  that  the  formulas 
become  more  involved,  the  variety  of  roots  of  the  characteristic  equation  (in  Sec.  3.2) 
becomes  much  larger  with  increasing  n,  and  the  Wronskian  plays  a more  prominent  role. 

The  concepts  and  methods  of  solving  second-order  linear  ODEs  extend  readily  to  linear 
ODEs  of  higher  order. 

This  chapter  follows  Chap.  2 naturally,  since  the  results  of  Chap.  2 can  be  readily 
extended  to  that  of  Chap.  3. 

Prerequisite:  Secs.  2.1,  2.2,  2.6,  2.7,  2.10. 

References  and  Answers  to  Problems:  App.  1 Part  A,  and  App.  2. 

3.1  Homogeneous  Linear  ODEs 

Recall  from  Sec.  1.1  that  an  ODE  is  of  nth  order  if  the  nth  derivative  y n>  = dny/dxn  of 
the  unknown  function  y{x)  is  the  highest  occurring  derivative.  Thus  the  ODE  is  of  the  form 

F(x,y,y',---,y^)  = 0 


where  lower  order  derivatives  and  y itself  may  or  may  not  occur.  Such  an  ODE  is  called 
linear  if  it  can  be  written 

(1)  y(n)  + /7n_1(x)y(n_1)  + • • • + pi(x)y'  + p0(x)y  = r(x). 

(For  n = 2 this  is  (1)  in  Sec.  2.1  with P\  = p and po  — q.)  The  coefficients  p0,  • • • , pn-\ 
and  the  function  r on  the  right  are  any  given  functions  of  x,  and  y is  unknown.  y(n>  has 
coefficient  1.  We  call  this  the  standard  form.  (If  you  have  pn(x)y(n>,  divide  by  pn(x) 
to  get  this  form.)  An  nth-order  ODE  that  cannot  be  written  in  the  form  (1)  is  called 

nonlinear. 

If  r(x)  is  identically  zero,  r(x)  = 0 (zero  for  all  x considered,  usually  in  some  open 
interval  I),  then  (1)  becomes 

(2)  y<n>  + p„_i(x)y(n_1)  + • • • + pi(x)y'  + p0(x)y  = 0 
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and  is  called  homogeneous.  If  r(x ) is  not  identically  zero,  then  the  ODE  is  called 
nonhomogeneous.  This  is  as  in  Sec.  2.1. 

A solution  of  an  nth-order  (linear  or  nonlinear)  ODE  on  some  open  interval  I is  a 
function  y = h(x)  that  is  defined  and  n times  differentiable  on  I and  is  such  that  the  ODE 
becomes  an  identity  if  we  replace  the  unknown  function  y and  its  derivatives  by  h and  its 
corresponding  derivatives. 

Sections  3. 1-3.2  will  be  devoted  to  homogeneous  linear  ODEs  and  Section  3.3  to 
nonhomogeneous  linear  ODEs. 

Homogeneous  Linear  ODE:  Superposition  Principle, 
General  Solution 

The  basic  superposition  or  linearity  principle  of  Sec.  2.1  extends  to  nth  order 
homogeneous  linear  ODEs  as  follows. 


THEOREM  1 


Fundamental  Theorem  for  the  Homogeneous  Linear  ODE  (2) 

For  a homogeneous  linear  ODE  (2),  sums  and  constant  multiples  of  solutions  on 
some  open  interval  I are  again  solutions  on  I.  (This  does  not  hold  for  a 
nonhomogeneous  or  nonlinear  ODE!) 


The  proof  is  a simple  generalization  of  that  in  Sec.  2.1  and  we  leave  it  to  the  student. 

Our  further  discussion  parallels  and  extends  that  for  second-order  ODEs  in  Sec.  2.1. 
So  we  next  define  a general  solution  of  (2),  which  will  require  an  extension  of  linear 
independence  from  2 to  n functions. 


DEFINITION 


General  Solution,  Basis,  Particular  Solution 

A general  solution  of  (2)  on  an  open  interval  I is  a solution  of  (2)  on  I of  the  form 

(3)  y(x)  = cqyiXx)  + ■ ■ ■ + cnyn{x)  (c1;  ■ ■ • , cn  arbitrary) 

where  vi,  • • • , yn  is  a hasis  (or  fundamental  system)  of  solutions  of  (2)  on  I;  that 
is,  these  solutions  are  linearly  independent  on  /,  as  defined  below. 

A particular  solution  of  (2)  on  I is  obtained  if  we  assign  specific  values  to  the 
n constants  c1;  • • • , cn  in  (3). 


DEFINITION 


Linear  Independence  and  Dependence 

Consider  n functions  yi(x),  • • • , yn(x)  defined  on  some  interval  I. 

These  functions  are  called  linearly  independent  on  I if  the  equation 

(4)  tiyiW  + + knyn(x)  = 0 on/ 

implies  that  all  k\,  ■ ■ ■ , kn  are  zero.  These  functions  are  called  linearly  dependent 
on  I if  this  equation  also  holds  on  I for  some  k\,  ■ ■ • , kn  not  all  zero. 


SEC.  3.1  Homogeneous  Linear  ODEs 
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EXAMPLE  2 


EXAMPLE  3 


If  and  only  if  y1;  • ■ ■ , yn  are  linearly  dependent  on  /,  we  can  express  (at  least)  one  of 
these  functions  on  I as  a “linear  combination”  of  the  other  n — 1 functions,  that  is,  as  a 
sum  of  those  functions,  each  multiplied  by  a constant  (zero  or  not).  This  motivates  the 
term  “linearly  dependent.”  For  instance,  if  (4)  holds  with  k i A 0,  we  can  divide  by  k t 
and  express  as  the  linear  combination 

yi  = -j-(k2y2  + •••  + knyn). 
k i 

Note  that  when  n = 2,  these  concepts  reduce  to  those  defined  in  Sec.  2.1. 

Linear  Dependence 

Show  that  the  functions  yi  = x2,y  2 — 5x,  y%  = 2x  are  linearly  dependent  on  any  interval. 

Solution.  y2  — Oyi  + 2.5y3.  This  proves  linear  dependence  on  any  interval. 

Linear  Independence 

Show  that  yi  — x,  y2  — x2,  y%  = x 3 are  linearly  independent  on  any  interval,  for  instance,  on  —1  ^ x ^ 2. 
Solution.  Equation  (4)  is  /cijc  + k2X2  + k&c3  = 0.  Taking  (a)  x = — 1,  (b)  x = 1,  (c)  x = 2,  we  get 

(a)  -ki  + k2  ~ k3  = 0,  (b)  kx  + k2  + k3  = 0,  (c)  2kx  + 4 k2  + 8 k3  = 0. 

^2  — 0 from  (a)  + (b).  Then  k 3 = 0 from  (c)  —2(b).  Then  k 1 = 0 from  (b).  This  proves  linear  independence. 
A better  method  for  testing  linear  independence  of  solutions  of  ODEs  will  soon  be  explained. 

General  Solution.  Basis 

Solve  the  fourth-order  ODE 

yiv  - 5 y"  + 4y  = 0 (where  yiv  = d*y/cbc 4). 

Solution.  As  in  Sec.  2.2  we  substitute  y = eXx . Omitting  the  common  factor  eAx,  we  obtain  the  characteristic 
equation 

A4  — 5 A2  + 4 = 0. 


This  is  a quadratic  equation  in  fi  = A2,  namely, 

/x2  - 5/x  + 4 = (/x  - 1 )(jl  ~ 4)  = 0. 

The  roots  are  /jl  = 1 and  4.  Hence  A = —2,  —1,  1,  2.  This  gives  four  solutions.  A general  solution  on  any 
interval  is 


„ —2x  1 —x  1 x 1 2x 

y — C\C  + C2<z  ' c%e  + c^e 


provided  those  four  solutions  are  linearly  independent.  This  is  true  but  will  be  shown  later. 


Initial  Value  Problem.  Existence  and  Uniqueness 

An  initial  value  problem  for  the  ODE  (2)  consists  of  (2)  and  n initial  conditions 

(5)  y(x0)  = K0,  y\x0)  = K1,  y("_1)(x0)  = ^n-i 


with  given  x0  in  the  open  interval  I considered,  and  given  K0,  • • ■ , Kn_  \ . 
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THEOREM  2 


EXAMPLE  4 


In  extension  of  the  existence  and  uniqueness  theorem  in  Sec.  2.6  we  now  have  the 
following. 


Existence  and  Uniqueness  Theorem  for  Initial  Value  Problems 

If  the  coefficients  Po(pc),  ■ • ■ ,pn- i(x)  of  (2)  are  continuous  on  some  open  interval  I 
and  x0  is  in  I,  then  the  initial  value  problem  (2),  (5)  has  a unique  solution  v(x)  on  I. 


Existence  is  proved  in  Ref.  [All]  in  App.  1.  Uniqueness  can  be  proved  by  a slight 
generalization  of  the  uniqueness  proof  at  the  beginning  of  App.  4. 

Initial  Value  Problem  for  a Third-Order  Euler-Cauchy  Equation 

Solve  the  following  initial  value  problem  on  any  open  interval  I on  the  positive  x-axis  containing  x = 1 . 

x3ym  — 3 x2y"  + 6 xy'  — 6y  = 0,  y(l)  = 2,  )/(l)  — 1,  y\  1)  = —4. 

Solution.  Step  1.  General  solution.  As  in  Sec.  2.5  we  try  y = xm.  By  differentiation  and  substitution, 

m(m  — 1 )(m  — 2)xm  — 3 m(m  — l)xm  + 6 mxm  — 6xm  = 0. 

Dropping  xm  and  ordering  gives  m3  — 6 m2  + 1 \m  — 6 = 0.  If  we  can  guess  the  root  m = 1.  We  can  divide 
by  m — 1 and  find  the  other  roots  2 and  3,  thus  obtaining  the  solutions  x,  x , xd,  which  are  linearly  independent 
on  I (see  Example  2).  [In  general  one  shall  need  a root-finding  method,  such  as  Newton’s  (Sec.  19.2),  also 
available  in  a CAS  (Computer  Algebra  System).]  Hence  a general  solution  is 

y = c\x  + C2X2  + c3x3 

valid  on  any  interval  /,  even  when  it  includes  x = 0 where  the  coefficients  of  the  ODE  divided  by  x3  (to  have 
the  standard  form)  are  not  continuous. 

Step  2.  Particular  solution.  The  derivatives  are  yr  = C\  + 2c2*  + Ic^x2  and  y"  = 2^2  + 6c%x.  From  this,  and 
y and  the  initial  conditions,  we  get  by  setting  x = 1 

(a)  y(l)  = ci  + c2  + c3  = 2 

(b)  /( 1)  = d + 2c2  + 3c3=  1 

(c)  y"{  1)  = 2c2  + 6c3  = -4. 

This  is  solved  by  Cramer’s  rule  (Sec.  7.6),  or  by  elimination,  which  is  simple,  as  follows,  (b)  — (a)  gives 

(d)  c2  + 2 c3  = — 1.  Then  (c)  — 2(d)  gives  c3  = —1.  Then  (c)  gives  C2  = 1.  Finally  C\  = 2 from  (a). 

Answer:  y = 2x  + x2  — x3. 

Linear  Independence  of  Solutions.  Wronskian 

Linear  independence  of  solutions  is  crucial  for  obtaining  general  solutions.  Although  it  can 
often  be  seen  by  inspection,  it  would  be  good  to  have  a criterion  for  it.  Now  Theorem  2 
in  Sec.  2.6  extends  from  order  n = 2 to  any  n.  This  extended  criterion  uses  the  Wronskian 
W of  n solutions  yq,  • ■ • , yn  defined  as  the  nth-order  determinant 


(6) 


W(y  i,  • • ■ , yn) 


y i 

y 2 

yn 

f 

y i 

t 

y-i 

yn 

(n— D 

(n—l) 

_,(n— 

yi 

y 2 

yn 
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PROOF 


EXAMPLE  5 


Note  that  W depends  on  x since  Ji,  • ■ ■ , yn  do.  The  criterion  states  that  these  solutions 
form  a basis  if  and  only  if  W is  not  zero;  more  precisely: 


Linear  Dependence  and  Independence  of  Solutions 

Let  the  ODE  (2)  have  continuous  coefficients  Po(x),  ■ ■ ■ , pn-i(x)  on  an  open  interval 
I.  Then  n solutions  y\,  • • • , yn  of  (2)  on  I are  linearly  dependent  on  I if  and  only  if  their 
Wronskian  is  zero  for  some  x = xq  in  I.  Furthermore,  ifW  is  zero  for  x = Xq,  then  W 
is  identically  zero  on  I.  Hence  if  there  is  an  x i in  I at  which  W is  not  zero,  then  yi,  ■ ■ • , yn 
are  linearly  independent  on  I,  so  that  they  form  a basis  of  solutions  of  (2)  on  I. 


(a)  Let  yi,  ■ ,yn  be  linearly  dependent  solutions  of  (2)  on  I.  Then,  by  definition,  there 
are  constants  k\,  ■ ■ ■ , kn  trot  all  zero,  such  that  for  all  x in  I, 


(7)  ^i>’i  + • • • + knyn  = 0. 

By  n — I differentiations  of  (7)  we  obtain  for  all  x in  I 

kiy[  + • • • + kny'n  = 0 

(8)  : 


*i.vin_1)  + 


+ kny(n  1:)  - 0. 


(7),  (8)  is  a homogeneous  linear  system  of  algebraic  equations  with  a nontrivial  solution 
ki,---,  kn.  Hence  its  coefficient  determinant  must  be  zero  for  every  x on  I,  by  Cramer’s 
theorem  (Sec.  7.7).  But  that  determinant  is  the  Wronskian  W,  as  we  see  from  (6).  Hence 
W is  zero  for  every  x on  I. 

(b)  Conversely,  if  W is  zero  at  an  xq  in  I,  then  the  system  (7),  (8)  with  x = xo  has  a 
solution  k*,  • • • , not  all  zero,  by  the  same  theorem.  With  these  constants  we  define 
the  solution  y*  = k*y i + • ■ ■ + kn\n  of  (2)  on  I.  By  (7),  (8)  this  solution  satisfies  the 
initial  conditions  y*(jto)  = 0,  ■ ■ • , y*(m_ 1 Vxo)  = 0.  But  another  solution  satisfying  the 
same  conditions  is  y = 0.  Hence  y*  = y by  Theorem  2,  which  applies  since  the  coefficients 
of  (2)  are  continuous.  Together,  y*  = k*yi  + • ■ ■ + kfLyn  = 0 on  I.  This  means  linear 
dependence  of  yi,  ■ • • , yn  on  I. 

(c)  If  W is  zero  at  an  jc0  in  I,  we  have  linear  dependence  by  (b)  and  then  W = 0 by  (a). 
Hence  if  W is  not  zero  at  an  .r  j in  I,  the  solutions  yi,  ■ ■ • , yn  must  be  linearly  independent 
on  I. 


Basis,  Wronskian 

We  can  now  prove  that  in  Example  3 we  do  have  a basis.  In  evaluating  W,  pull  out  the  exponential  functions 
columnwise.  In  the  result,  subtract  Column  1 from  Columns  2,  3,  4 (without  changing  Column  1 ).  Then  expand  by 
Row  1.  In  the  resulting  third-order  determinant,  subtract  Column  1 from  Column  2 and  expand  the  result  by  Row  2: 


W = 


£-2x 

e“x 

ex 

e2* 

1 

1 

1 

1 

— 2e-2x 

—e~x 

ex 

le2* 

-2 

-1 

1 

2 

4e~2x 

e~x 

ex 

4e2x 

4 

1 

1 

4 

(N 

1 

OO 

1 

-e~x 

ex 

OO 

-8 

-1 

1 

8 

1 3 4 

-3  -3  0 

7 9 16 


= 72. 


no 
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THEOREM  4 


PROOF 


THEOREM  5 


PROOF 


A General  Solution  of  (2)  Includes  All  Solutions 

Let  us  first  show  that  general  solutions  always  exist.  Indeed,  Theorem  3 in  Sec.  2.6  extends 
as  follows. 


Existence  of  a General  Solution 

If  the  coefficients  Po(x),  • • • , pn_i(x)  of  ( 2)  are  continuous  on  some  open  interval  I, 
then  (2)  has  a general  solution  on  I. 

We  choose  any  fixed  x0  in  /.  By  Theorem  2 the  ODE  (2)  has  n solutions  yi,  • • ■ , yn,  where 
yj  satisfies  initial  conditions  (5)  with  Kj-i  = 1 and  all  other  K’s  equal  to  zero.  Their 
Wronskian  at  Xq  equals  1.  For  instance,  when  n = 3,  then  v i Cx f) J = l,y20o)  = 1> 
y30ro)  = and  the  other  initial  values  are  zero.  Thus,  as  claimed. 


yi(xo) 

>’20*  o) 

>3(*o) 

1 

0 

0 

W(yi(x0),  y2(xQ),  y3(x0))  = 

yitxo) 

>’20t  o) 

>30  o) 

= 

0 

1 

0 

TiOo) 

>2  (*  o) 

>3  Oo) 

0 

0 

1 

Hence  for  any  n those  solutions  , yn  are  linearly  independent  on  /,  by  Theorem  3. 

They  form  a basis  on  /,  and  y = < i Vi  + • • • + cnyn  is  a general  solution  of  (2)  on  I. 

We  can  now  prove  the  basic  property  that,  from  a general  solution  of  (2),  every  solution 
of  (2)  can  be  obtained  by  choosing  suitable  values  of  the  arbitrary  constants.  Hence  an 
nth-order  linear  ODE  has  no  singular  solutions,  that  is,  solutions  that  cannot  be  obtained 
from  a general  solution. 

General  Solution  Includes  All  Solutions 

If  the  ODE  (2)  has  continuous  coefficients  po(x),  ■ ■ • , pn-i(x)  on  some  open  interval 
I,  then  every  solution  y = Y(x)  of  (2)  on  I is  of  the  form 

(9)  Y{x)  = Ci>’i(x)  + ■ • • + Cnyn(x) 

where  yi,  • ,yn  is  a basis  of  solutions  of  (2)  on  land  C\,  ■ • ■ , Cn  are  suitable  constants. 


Let  Y be  a given  solution  and  y = cqyq  + • • • + cnyn  a general  solution  of  (2)  on  I.  We 
choose  any  fixed  x0  in  I and  show  that  we  can  find  constants  C\,  ■ ■ ■ , cn  for  which  y and 
its  first  n — 1 derivatives  agree  with  Y and  its  corresponding  derivatives  at  x0.  That  is, 
we  should  have  at  x = Xq 


Cl>l  + • ' 

+ 

Cnyn 

= Y 

ci>i  + ■ ' 

+ 

i 

Cnyn 

= Y 

c1/in“1)  + 


+ cny^  = ^_1)- 


But  this  is  a linear  system  of  equations  in  the  unknowns  Ci,  • ■ • , cn.  Its  coefficient 
determinant  is  the  Wronskian  W of  y1;  • • • , yn  at  x0.  Since  yi,  ■ ■ • , yn  form  a basis,  they 
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are  linearly  independent,  so  that  W is  not  zero  by  Theorem  3.  Hence  (10)  has  a unique 
solution  ci  = Ci,  • ■ ■ , cn  = Cn  (by  Cramer’s  theorem  in  Sec.  7.7).  With  these  values  we 
obtain  the  particular  solution 

y*{x)  = Ci>’i(.r)  + ■ ■ • + Cnyn(x) 

on  I.  Equation  (10)  shows  that  y*  and  its  first  n — 1 derivatives  agree  at  x0  with  Y and 
its  corresponding  derivatives.  That  is,  y*  and  Y satisfy,  at  x0,  the  same  initial  conditions. 
The  uniqueness  theorem  (Theorem  2)  now  implies  that  y*  = Y on  I.  This  proves  the 
theorem.  ■ 

This  completes  our  theory  of  the  homogeneous  linear  ODE  (2).  Note  that  for  n = 2 it  is 
identical  with  that  in  Sec.  2.6.  This  had  to  be  expected. 


BASES:  TYPICAL  EXAMPLES 

To  get  a feel  for  higher  order  ODEs,  show  that  the  given 
functions  are  solutions  and  form  a basis  on  any  interval. 
Use  Wronskians.  In  Prob.  6,  x > 0, 

1.  l,x,x2,x3,  yiv  = 0 

2.  ex,  e~x , eZx,  y'"  - 2y"  - y'  + 2y  = 0 

3.  cos  x,  sin  x,  x cos  x,  x sin  x,  vlv  + 2y"  + y = 0 

4.  e-4x,  xc_4x,  x2c_4x,  y"  + 12y"  + 48/  + 64y  = 0 

5.  1,  e~x  cos  2x,  e~x  sin  2x,  y"  + 2y"  + 5y'  = 0 
. 1,  x , x , x y — 3 xy  + 3y  =0 

7.  TEAM  PROJECT.  General  Properties  of  Solutions 
of  Linear  ODEs.  These  properties  are  important  in 
obtaining  new  solutions  from  given  ones.  Therefore 
extend  Team  Project  38  in  Sec.  2.2  to  nth-order  ODEs. 
Explore  statements  on  sums  and  multiples  of  solutions 
of  (1)  and  (2)  systematically  and  with  proofs. 
Recognize  clearly  that  no  new  ideas  are  needed  in  this 
extension  from  n — 2 to  general  n. 

LINEAR  INDEPENDENCE 

Are  the  given  functions  linearly  independent  or  dependent 
on  the  half-axis  x £ 0?  Give  reason. 

8.  x2,  1/x2,  0 9.  tan  x,  cot  x,  1 


-t  a 2x  2x  2 2x  -*  -t  x x • x 

10.  e , xe  , x e 11.  e cosx,  e sin*,  e 

12.  sin2  x,  cos2  x,  cos  2x  13.  sin  x,  cos  x,  sin  2x 

14.  cos2  x,  sin2  x,  27 r 15.  cosh  2x,  sinh  2x,  e2x 

16.  TEAM  PROJECT.  Linear  Independence  and 
Dependence,  (a)  Investigate  the  given  question  about 
a set  S of  functions  on  an  interval  I.  Give  an  example. 
Prove  your  answer. 

(1)  If  S contains  the  zero  function,  can  S be  linearly 
independent? 

(2)  If  S is  linearly  independent  on  a subinterval  J of  /, 
is  it  linearly  independent  on  11 

(3)  If  S is  linearly  dependent  on  a subinterval  J of  /, 
is  it  linearly  dependent  on  II 

(4)  If  S is  linearly  independent  on  /,  is  it  linearly 
independent  on  a subinterval  J1 

(5)  If  S is  linearly  dependent  on  /,  is  it  linearly 
independent  on  a subinterval  J1 

(6)  If  S is  linearly  dependent  on  /,  and  if  T contains  S, 
is  T linearly  dependent  on  II 

(b)  In  what  cases  can  you  use  the  Wronskian  for 
testing  linear  independence?  By  what  other  means  can 
you  perform  such  a test? 


Homogeneous  Linear  ODEs 
with  Constant  Coefficients 


We  proceed  along  the  lines  of  Sec.  2.2,  and  generalize  the  results  from  n = 2 to  arbitrary  n. 
We  want  to  solve  an  nth-order  homogeneous  linear  ODE  with  constant  coefficients, 
written  as 


(ri)  i in—  1)  i i t i r\ 

y + cin-iy  +■■■+  my  + a0y  = 0 


(1) 
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where  v n)  = dny/dxn,  etc.  As  in  Sec.  2.2,  we  substitute  y = eAX  to  obtain  the  characteristic 
equation 

(2) 


A(n)  + an-  + 


+ a\ A + cioy  = 0 


of  (1).  If  A is  a root  of  (2),  then  y = eAx  is  a solution  of  (1).  To  find  these  roots,  you  may 
need  a numeric  method,  such  as  Newton’s  in  Sec.  19.2,  also  available  on  the  usual  CASs. 
For  general  n there  are  more  cases  than  for  n = 2.  We  can  have  distinct  real  roots,  simple 
complex  roots,  multiple  roots,  and  multiple  complex  roots,  respectively.  This  will  be  shown 
next  and  illustrated  by  examples. 

Distinct  Real  Roots 

If  all  the  n roots  Ai,  • • • , \n  of  (2)  are  real  and  different,  then  the  n solutions 


(3) 


yi  = eAlX, 


yn  = 


constitute  a basis  for  all  x.  The  corresponding  general  solution  of  (1)  is 


(4) 


y = CleAlX  + 


“h  Cn6 


Indeed,  the  solutions  in  (3)  are  linearly  independent,  as  we  shall  see  after  the  example. 

Distinct  Real  Roots 

Solve  the  ODE  ym  — 2 y"  — y'  + 2y  = 0. 

Solution.  The  characteristic  equation  is  A3  — 2A2  — A + 2 = 0.  It  has  the  roots  —1,  1,2;  if  you  find  one 
of  them  by  inspection,  you  can  obtain  the  other  two  roots  by  solving  a quadratic  equation  (explain!).  The 
corresponding  general  solution  (4)  is  y = C\e~x  + c^ex  + c^e2x. 

Linear  Independence  of  (3).  Students  familiar  with  nth-order  determinants  may  verify 
that,  by  pulling  out  all  exponential  functions  from  the  columns  and  denoting  their  product 
by  E — exp  [Ai  + ■ • • + An)x],  the  Wronskian  of  the  solutions  in  (3)  becomes 


(5) 


W = 


AieAlX 

A?eAlX 


\ n—  1 

Ai  e 1 


= E 


Af 


\U—  1 

Ai 


„a2x 


A2eA2X 

A|eA2X 


Arvax 

1 

A2 

A 1 ••• 


\ n—  1 

A2 


„Anx 


A n.e 


\ 2 Anx 


\ n—  1 „ArtX 
An.  € 


\ n—  1 
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THEOREM  1 


THEOREM  2 


EXAMPLE  2 


The  exponential  function  E is  never  zero.  Hence  W = 0 if  and  only  if  the  determinant  on 
the  right  is  zero.  This  is  a so-called  Vandermonde  or  Cauchy  determinant.1  It  can  be 
shown  that  it  equals 

(6)  ( — !)«(«- l)/2y 

where  V is  the  product  of  all  factors  A — A/.  with  j < k (s=  «);  for  instance,  when  n = 3 
we  get  — V = —(Aj  — A2)(A1  — A3)(A2  — A3).  This  shows  that  the  Wronskian  is  not  zero 
if  and  only  if  all  the  n roots  of  (2)  are  different  and  thus  gives  the  following. 


Basis 

Solutions  yi  = eAlX,  ■ ■ ■ , yn  = eA"x  of  (1)  ( with  any  real  or  complex  A j’s)  form  a 
basis  of  solutions  of  { 1)  on  any  open  interval  if  and  only  if  all  n roots  of  { 2)  are 
different. 


Actually,  Theorem  1 is  an  important  special  case  of  our  more  general  result  obtained 
from  (5)  and  (6): 


Linear  Independence 

Any  number  of  solutions  of  { 1)  of  the  form  eAx  are  linearly  independent  on  an  open 
interval  I if  and  only  if  the  corresponding  A are  all  different. 


Simple  Complex  Roots 

If  complex  roots  occur,  they  must  occur  in  conjugate  pairs  since  the  coefficients  of  (1) 
are  real.  Thus,  if  A = y + ia>  is  a simple  root  of  (2),  so  is  the  conjugate  A = y — ico,  and 
two  corresponding  linearly  independent  solutions  are  (as  in  Sec.  2.2,  except  for  notation) 

yi  = eJX  cos  (ox,  y2  = eyx  sin  wx. 


Simple  Complex  Roots.  Initial  Value  Problem 

Solve  the  initial  value  problem 

y"  - y"  + 100)/  - 100 y = 0,  y(0)  = 4,  /( 0)  = 11,  /(0)  = -299. 

Solution.  The  characteristic  equation  is  A3  — A2  + 100 A — 100  = 0.  It  has  the  root  1,  as  can  perhaps  be 
seen  by  inspection.  Then  division  by  A — 1 shows  that  the  other  roots  are  ±10/.  Hence  a general  solution  and 
its  derivatives  (obtained  by  differentiation)  are 

y = c\ex  + A cos  lOx  + B sin  lOx, 
y'  = Clex  — 10A  sin  10x  + 105  cos  lOx, 
y"  = Clex  — 100A  cos  10x  — 1005  sin  10*. 


!ALEXANDRE  THEOPHILE  VANDERMONDE  (1735-1796),  French  mathematician,  who  worked  on 
solution  of  equations  by  determinants.  For  CAUCHY  see  footnote  4,  in  Sec.  2.5. 
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From  this  and  the  initial  conditions  we  obtain,  by  setting  x = 0, 

(a)  cj  + A = 4,  (b)  ci  + 108  =11.  (c)  cx  - 100A  = -299. 

We  solve  this  system  for  the  unknowns  A,  B,  C\.  Equation  (a)  minus  Equation  (c)  gives  101A  = 303,  A = 3. 
Then  Ci  = 1 from  (a)  and  B = 1 from  (b).  The  solution  is  (Fig.  73) 

y = ex  + 3 cos  1 Or  + sin  I O r. 

This  gives  the  solution  curve,  which  oscillates  about  ex  (dashed  in  Fig.  73). 

y 

20 

10 

4 
0 

Fig.  73.  Solution  in  Example  2 


Multiple  Real  Roots 

If  a real  double  root  occurs,  say,  Aj  = A2,  then  >q  = y2  in  (3),  and  we  take  yq  and  xyq  as 
corresponding  linearly  independent  solutions.  This  is  as  in  Sec.  2.2. 

More  generally,  if  A is  a real  root  of  order  m,  then  m corresponding  linearly  independent 
solutions  are 


(7) 


xe 


2 Ax 

x e , 


m— 1 


Ax 


We  derive  these  solutions  after  the  next  example  and  indicate  how  to  prove  their  linear 
independence. 


Real  Double  and  Triple  Roots 

Solve  the  ODE  y°  - 3yiv  + 3y"'  - y"  = 0. 

Solution.  The  characteristic  equation  A5  — 3A4  + 3A3  — A2  = 0 has  the  roots  A;  = A2  = 0,  and  A;.  = A4  = 
A5  = 1 , and  the  answer  is 

(8)  y = Ci  + C2X  + (C3  + C4X  4-  c5xz)ex.  H 

Derivation  of( 7).  We  write  the  left  side  of  (1)  as 

£[y]  = yCn)  + fln-i  yCn~v  + • ■ • + ao.v- 

Let  y = eAx.  Then  by  performing  the  differentiations  we  have 


L[eAx]  — (A11  + an—i  A™  1 ■ + cio)eAx. 
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Now  let  A]  be  a root  of  /77th  order  of  the  polynomial  on  the  right,  where  m = n.  For  m < n 
let  Am+i,  • • ■ , Am  be  the  other  roots,  all  different  from  A] . Writing  the  polynomial  in 
product  form,  we  then  have 


Lle^]  = (A  - A1)m/z(A)eAx 

with  h( A)  = 1 if  m = n,  and  h( A)  = (A  — Am+1)  ■ • • (A  — An)  if  in  < n.  Now  comes  the 
key  idea:  We  differentiate  on  both  sides  with  respect  to  A, 

(9)  — L[eAx]  = m(A  - X1)m~1h(X)eAx  + (A  - \1)m  — [h(\)eAx]. 


The  differentiations  with  respect  to  x and  A are  independent  and  the  resulting  derivatives 
are  continuous,  so  that  we  can  interchange  their  order  on  the  left: 


(10) 


^rL[eAx]  = L 
o A 


Ax 


dA 


L[xeXx], 


The  right  side  of  (9)  is  zero  for  A = Ai  because  of  the  factors  A — Ai  (and  in  =g  2 since 
we  have  a multiple  root!).  Hence  L[xeAlX]  = 0 by  (9)  and  (10).  This  proves  that  xeA,x  is 
a solution  of  (1). 

We  can  repeat  this  step  and  produce  x2eAlX,  ■ ■ ■ , xm~1eAlX  by  another  m — 2 such 
differentiations  with  respect  to  A.  Going  one  step  further  would  no  longer  give  zero  on  the 
right  because  the  lowest  power  of  A — Ai  would  then  be  (A  — Ax)0,  multiplied  by  m'.h(k) 
and  h(\i)  i=  0 because  h( A)  has  no  factors  A — Ap  so  we  get  precisely  the  solutions  in  (7). 

We  finally  show  that  the  solutions  (7)  are  linearly  independent.  For  a specific  n this 
can  be  seen  by  calculating  their  Wronskian,  which  turns  out  to  be  nonzero.  For  arbitrary 
m we  can  pull  out  the  exponential  functions  from  the  Wronskian.  This  gives  (eAx)m  = eAmx 
times  a determinant  which  by  “row  operations”  can  be  reduced  to  the  Wronskian  of  1, 
x,  ■ ■ ■ , xm~ . The  latter  is  constant  and  different  from  zero  (equal  to  1!2!  • • • (m  — 1)!). 
These  functions  are  solutions  of  the  ODE  y<m)  = 0,  so  that  linear  independence  follows 
from  Theroem  3 in  Sec.  3.1. 


Multiple  Complex  Roots 

In  this  case,  real  solutions  are  obtained  as  for  complex  simple  roots  above.  Consequently, 
if  A = y + ico  is  a complex  double  root,  so  is  the  conjugate  A = y — ioj.  Corresponding 
linearly  independent  solutions  are 

(11)  eyx  cos  wx,  eyx  sin  wx,  xeyx  cos  wx,  xeyx  sin  ojx. 


The  first  two  of  these  result  from  eAx  and  eAx  as  before,  and  the  second  two  from  xeAx 
and  xeAx  in  the  same  fashion.  Obviously,  the  corresponding  general  solution  is 

(12)  y = eyx[(A1  + A2x ) cos  cox  + ( Bi  + Bzx)  sin  <uv]. 

For  complex  triple  roots  (which  hardly  ever  occur  in  applications),  one  would  obtain 
two  more  solutions  x2eyx  cos  cox,  x2eyx  sin  cox,  and  so  on. 
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GENERAL  SOLUTION 


Solve  the  given  ODE.  Show  the  details  of  your  work. 


1.  /"  + 25/  = 0 

2.  yiv  + 2 y"  + y = 0 

3.  yiv  + Ay"  = 0 

4.  (D3  - D2  - D + /)y  = 0 

5.  (D4  + 10D2  + 9I)y  = 0 

6.  (D5  + 8D3  + 16 D)y  = 0 


7-13 


INITIAL  VALUE  PROBLEM 


Solve  the  I VP  by  a CAS,  giving  a general  solution  and  the 
particular  solution  and  its  graph. 


7.  /"  + 3.2y"  + 4.81/  = 0,  v(0)  = 3.4,  /(0)  = -4.6, 
y"(0)  = 9.91 


8.  y"  + 7.5/'  + 14.25/  - 9.125y  = 0,  y(0)  = 10.05, 
/(0)  = -54.975,  y"(0)  = 257.5125 

9.  Ay"  + 8y"  +41/  + 37y  = 0,  y(0)  = 9, 
y'(0)  = -6.5,  y"(0)  = -39.75 

10.  yiv  + 4y  = 0,  y(0)  = J,  /(0)=-|  /'(0)  = |, 

m 7 


11.  yiv  - 9y"  - 400v  = 0,  y(0)  = 0,  /(O)  = 0, 
/'(0)  = 41,  /"(0)  = 0 

12.  yv  - 5y"'  + Ay'  = 0,  v(0)  = 3,  /(0)  = -5, 
y"(0)  = 11,  y'"(0)  = -23,  yiv(0)  = 47 


13.  yiv+  0.45y"'  - 0.165y"  + 0.0045y'  - 0.001 75y  = 0, 
y(0)  = 17.4,  y'(0)  = -2.82,  /'(0)  = 2.0485, 
/"(0)  = -1.458675 

14.  PROJECT.  Reduction  of  Order.  This  is  of  practical 
interest  since  a single  solution  of  an  ODE  can  often  be 
guessed.  For  second  order,  see  Example  7 in  Sec.  2.1. 

(a)  How  could  you  reduce  the  order  of  a linear 
constant-coefficient  ODE  if  a solution  is  known? 

(b)  Extend  the  method  to  a variable-coefficient  ODE 

/"  + Pz(x)y"  + pi{x)y'  + p0(x)y  = 0. 

Assuming  a solution  y1  to  be  known,  show  that  another 
solution  is  y2{x)  = u(x)y\{x)  with  u(x)  = f z(x)  dx  and 
Z obtained  by  solving 

yiz"  + (3/  + p2y  i)z'  + (3>’i  + 2 p2y'i  + piyi)z  = 0. 

(c)  Reduce 

x3y"'  — 3x2y"  + (6  — x2)xy'  — (6  — x2)y  = 0, 
using  yi  = x (perhaps  obtainable  by  inspection). 

15.  CAS  EXPERIMENT.  Reduction  of  Order.  Starting 
with  a basis,  find  third-order  linear  ODEs  with  variable 
coefficients  for  which  the  reduction  to  second  order 
turns  out  to  be  relatively  simple. 


33  Nonhomogeneous  Linear  ODEs 

We  now  turn  from  homogeneous  to  nonhomogeneous  linear  ODEs  of  nth  order.  We  write 
them  in  standard  form 

(1)  y(n)  + pn_1(x)y(n~1)  + • • • + Pi(x)y'  + p0(x)y  = r(x ) 

with  y(n)  = dny/dxn  as  the  first  term,  and  r(x)  # 0.  As  for  second-order  ODEs,  a general 
solution  of  (1)  on  an  open  interval  I of  the  x-axis  is  of  the  form 

(2)  y(x)  = yh(x)  + yp(x). 

Here  yh(x)  = t'lViU')  + • ■ ■ + cnyn(x ) is  a general  solution  of  the  corresponding 
homogeneous  ODE 

(3)  /m)  + pn-1(x)y(n~Vl  + • ■ • + pi{x)y'  + p0(x)y  = 0 

on  I.  Also,  yp  is  any  solution  of  (1)  on  I containing  no  arbitrary  constants.  If  (1)  has 
continuous  coefficients  and  a continuous  r(x)  on  /,  then  a general  solution  of  (1)  exists 
and  includes  all  solutions.  Thus  (1)  has  no  singular  solutions. 
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EXAMPLE  1 


An  initial  value  problem  for  (1)  consists  of  (1)  and  n initial  conditions 

(4)  y(*o)  = ^o,  y'(x0)  = K1,  y(n_1)(x0)  = Kn^ 

with  x0  in  I.  Under  those  continuity  assumptions  it  has  a unique  solution.  The  ideas  of 
proof  are  the  same  as  those  for  n = 2 in  Sec.  2.7. 


Method  of  Undetermined  Coefficients 

Equation  (2)  shows  that  for  solving  (1)  we  have  to  determine  a particular  solution  of  (1). 
For  a constant-coefficient  equation 

(5)  y(n>  + an_i.y(n_1)  + • ■ ■ + a±y'  + a0y  = r(x) 

(a0,  ■ ■ ■ , an_1  constant)  and  special  r (x)  as  in  Sec.  2.7,  such  a yp(x)  can  be  determined  by 
the  method  of  undetermined  coefficients,  as  in  Sec.  2.7,  using  the  following  rules. 


(A)  Basic  Rule  as  in  Sec.  2.7. 

(B)  Modification  Rule.  If  a term  in  your  choice  for  yv(x)  is  a solution  of  the 
homogeneous  equation  (3),  then  multiply  this  term  by  xk,  where  k is  the  smallest 
positive  integer  such  that  this  term  times  x is  not  a solution  of  { 3). 

(C)  Sum  Rule  as  in  Sec.  2.7. 


The  practical  application  of  the  method  is  the  same  as  that  in  Sec.  2.7.  It  suffices  to 
illustrate  the  typical  steps  of  solving  an  initial  value  problem  and,  in  particular,  the  new 
Modification  Rule,  which  includes  the  old  Modification  Rule  as  a particular  case  (with 
k = 1 or  2).  We  shall  see  that  the  technicalities  are  the  same  as  for  n = 2,  except  perhaps 
for  the  more  involved  determination  of  the  constants. 


Initial  Value  Problem.  Modification  Rule 

Solve  the  initial  value  problem 

(6)  /"  + 3/  + 3/  + y = 30e~x,  y(  0)  = 3,  /( 0)  = -3,  /( 0)  = -47. 

Solution.  Step  1.  The  characteristic  equation  is  A3  + 3 A2  + 3 A + 1 = (A  + l)3  = 0.  It  has  the  triple  root 
A = — 1 . Hence  a general  solution  of  the  homogeneous  ODE  is 

yh  = C]e~x  + c2xe~x  + c3x2e~x 
= (Cj  + C2x  + c3x2)e~x. 

Step  2.  If  we  try  yp  = Ce~x,  we  get  — C + 3C  — 3C  + C = 30,  which  has  no  solution.  Try  Cxe~x  and 
Cx2e~x.  The  Modification  Rule  calls  for 

yp  = Cx3e~x. 
yp  = C(3x2  - x3)e~x, 
y'p  = C(6x  - 6x2  + x3)e~x , 
yp  = C( 6-  lSx  + 9x2  - x3)e~x. 


Then 
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Substitution  of  these  expressions  into  (6)  and  omission  of  the  common  factor  e x gives 

C( 6 - l&t  + 9x2  - x 3)  + 3C(6x  - 6x2  + x3)  + 3C(3x2  - x3)  + Cx3  = 30. 

The  linear,  quadratic,  and  cubic  terms  drop  out,  and  6 C = 30.  Hence  C — 5.  This  gives  yp  = 5x3e~x. 

Step  3.  We  now  write  down  y = yh  + yp,  the  general  solution  of  the  given  ODE.  From  it  we  find  Ci  by  the 
first  initial  condition.  We  insert  the  value,  differentiate,  and  determine  c 2 from  the  second  initial  condition,  insert 
the  value,  and  finally  determine  C3  from  yn(0)  and  the  third  initial  condition: 

y = yh  + yv  = (ci  + c2x  + Csx2)e~x  + 5x3e~x,  y(0)  = c,  = 3 

y'  = [-3  + c2  + (~c2  + 2 c3)x  + (15  - c3)x2  - 5x3]e~x,  y'(0)  = -3  + c2  = -3,  c2  = 0 

y"  = [3  + 2c3  + (30  - 4c3)x  + (-30  + c3)x2  + 5x3]e~x,  y"(0)  = 3 + 2 c3  = -47,  c3  = -25. 

Hence  the  answer  to  our  problem  is  (Fig.  73) 

y = (3  — 25x2)e~x  + 5x3e~x. 

The  curve  of  y begins  at  (0,  3)  with  a negative  slope,  as  expected  from  the  initial  values,  and  approaches  zero 
as  oo.  The  dashed  curve  in  Fig.  74  is  yp. 


y 

5 - 


i — 4—  r — -i — i 

10 


x 


Fig.  74.  y and  yp  (dashed)  in  Example  1 


Method  of  Variation  of  Parameters 

The  method  of  variation  of  parameters  (see  Sec.  2.10)  also  extends  to  arbitrary  order  n. 
It  gives  a particular  solution  yp  for  the  nonhomogeneous  equation  (1)  (in  standard  form 
with  yCn)  as  the  first  term!)  by  the  formula 


(7) 


n 

yP(x)  = 2 »•(+) 


k=  1 


' Wk(x) 
. W(x) 


r(x ) dx 


= y\(x) 


'Wyjx) 
. W(x) 


r(x)  dx  + ■ ■ ■ + yn(x ) 


' Wn(x) 
. W(x) 


r(x)  dx 


on  an  open  interval  I on  which  the  coefficients  of  (1)  and  r(x)  are  continuous.  In  (7)  the 
functions  y1?  ■ ■ • , yn  form  a basis  of  the  homogeneous  ODE  (3),  with  Wronskian  W,  and 
Wj(j  = 1 ,••■,«)  is  obtained  from  W by  replacing  the  /th  column  of  W by  the  column 
[0  0 ■ • • 0 1]T.  Thus,  when  n = 2,  this  becomes  identical  with  (2)  in  Sec.  2.10, 


yi 

f 

y2 

f 

, Wy  = 

0 

J2 

f 

= 

>’i 

f 

0 

1 

= -yz. 

l 

yi 

J2 

yi 

W = 


= y i- 
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EXAMPLE  2 


The  proof  of  (7)  uses  an  extension  of  the  idea  of  the  proof  of  (2)  in  Sec.  2.10  and  can 
be  found  in  Ref  [All]  listed  in  App.  1. 

Variation  of  Parameters.  Nonhomogeneous  Euler-Cauchy  Equation 

Solve  the  nonhomogeneous  Euler-Cauchy  equation 

x3y"'  — 3x2y"  + 6 xy'  — 6 y = x 4 In  x (x  > 0). 

Solution.  Step  1.  General  solution  of  the  homogeneous  ODE.  Substitution  of  y = xm  and  the  derivatives 
into  the  homogeneous  ODE  and  deletion  of  the  factor  xm  gives 

m(m  — 1 ){m  — 2)  — 3 m(m  — 1)  + 6m  — 6 = 0. 

The  roots  are  1,  2,  3 and  give  as  a basis 

>'i  =X,  y2  = Xz,  y3  = X3. 

Hence  the  corresponding  general  solution  of  the  homogeneous  ODE  is 

yh  = cj_x  + c2x2  + c3x3. 


Step  2.  Determinants  needed  in  (7).  These  are 


W = 


1 2x  3x 
0 2 6x 


= 2x3 


Wi  = 


0 2x  3x" 


6x 


W2  = 


x 0 x3 
1 0 3x2 

0 1 6x 


= 2xd 


W3  = 


0 

2x  0 
2 1 


Step  3.  Integration.  In  (7)  we  also  need  the  right  side  r(x)  of  our  ODE  in  standard  form,  obtained  by  division 
of  the  given  equation  by  the  coefficient  x 6 of  ym\  thus,  r(;t)  = (x4  In  x)/xd  = x In  x.  In  (7)  we  have  the  simple 
quotients  W\/W  = x/2,  W2/W  = — 1,  VE3/W  = l/(2x).  Hence  (7)  becomes 


yP 


X 0 I o I 1 

x In  x dx  — x | x In  x dx  + x \ — x In  x dx 


= — ( — In  x — — ] — x [ — In  x — ] + — (x  In  x — x) 


Simplification  gives  yp  = gi4  (In  x — ^).  Hence  the  answer  is 


y = yh  + yp  = CiX  + C2x2  + c3x3  + gr4  (\nx  - 


Figure  75  shows  yp.  Can  you  explain  the  shape  of  this  curve?  Its  behavior  near  a = 0?  The  occurrence  of  a minimum? 
Its  rapid  increase?  Why  would  the  method  of  undetermined  coefficients  not  have  given  the  solution? 
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EXAMPLE  3 


Fig.  75.  Particular  solution  yp  of  the  nonhomogeneous 
Euler-Cauchy  equation  in  Example  2 


Application:  Elastic  Beams 

Whereas  second-order  ODEs  have  various  applications,  of  which  we  have  discussed  some 
of  the  more  important  ones,  higher  order  ODEs  have  much  fewer  engineering  applications. 
An  important  fourth-order  ODE  governs  the  bending  of  elastic  beams,  such  as  wooden  or 
iron  girders  in  a building  or  a bridge. 

A related  application  of  vibration  of  beams  does  not  fit  in  here  since  it  leads  to  PDEs 
and  will  therefore  be  discussed  in  Sec.  12.3. 


Bending  of  an  Elastic  Beam  under  a Load 

We  consider  a beam  B of  length  L and  constant  (e.g.,  rectangular)  cross  section  and  homogeneous  elastic 
material  (e.g.,  steel);  see  Fig.  76.  We  assume  that  under  its  own  weight  the  beam  is  bent  so  little  that  it  is 
practically  straight.  If  we  apply  a load  to  B in  a vertical  plane  through  the  axis  of  symmetry  (the  x-axis  in 
Fig.  76),  B is  bent.  Its  axis  is  curved  into  the  so-called  elastic  curve  C (or  deflection  curve).  It  is  shown  in 
elasticity  theory  that  the  bending  moment  M{x ) is  proportional  to  the  curvature  k{x ) of  C.  We  assume  the  bending 
to  be  small,  so  that  the  deflection  y(x)  and  its  derivative  y'(x)  (determining  the  tangent  direction  of  C)  are  small. 
Then,  by  calculus,  k = y"/(l  + y2)3/2  y" . Hence 

M{x)  = Ely"  (x). 

El  is  the  constant  of  proportionality.  E is  Young ’s  modulus  of  elasticity  of  the  material  of  the  beam.  I is  the 
moment  of  inertia  of  the  cross  section  about  the  (horizontal)  z-axis  in  Fig.  76. 

Elasticity  theory  shows  further  that  M"(x)  = f(x),  where /(x)  is  the  load  per  unit  length.  Together, 

(8)  Ely™  = f(x). 


Fig.  76.  Elastic  beam 
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In  applications  the  most  important  supports  and  corresponding  boundary  conditions  are  as  follows  and  shown 
in  Fig.  77. 

(A)  Simply  supported  y = y"  = 0 at  x = 0 and  L 

(B)  Clamped  at  both  ends  y = y'  = 0 at  x = 0 and  L 

(C)  Clamped  at  x = 0,  free  at  x = L y(0)  = yr( 0)  = 0,  y"(L)  = y'n  (L ) = 0. 

The  boundary  condition  y = 0 means  no  displacement  at  that  point,  y — 0 means  a horizontal  tangent,  y"  = 0 

means  no  bending  moment,  and  ym  = 0 means  no  shear  force. 

Let  us  apply  this  to  the  uniformly  loaded  simply  supported  beam  in  Fig.  76.  The  load  is  /( x)  — fo  = const. 
Then  (8)  is 

(9)  ylv  = k,  k = ^r 

El 

This  can  be  solved  simply  by  calculus.  Two  integrations  give 

»»  k p 

y = ~xz  + c\x  + C£. 

yn(0)  = 0 gives  c 2 = 0.  Then  yn(L)  = L^kL  + ci)  = 0,  Ci  = —kL/2  (since  L A 0).  Hence 

tt  k ( 2 r \ 

y = - (x  -Lx). 

Integrating  this  twice,  we  obtain 

kf  1 4 L 3 \ 

y=2{nX  ~6X  +C3*  + C4J 


with  c4  = 0 from  y(0)  = 0. 


Inserting  the  expression  for 


Then 


kL  ( I?  L3 
y L 2 V 12  6 + Cs)  ~ 

k , we  obtain  as  our  solution 

y = — — (x4  — 2 Lx3 
7 24EI v 


0,  c3  = 


+ l?x). 


12' 


Since  the  boundary  conditions  at  both  ends  are  the  same,  we  expect  the  deflection  y(x)  to  be  “symmetric”  with 
respect  to  L/2,  that  is,  y(x)  — y(L  — x).  Verify  this  directly  or  set  x = u + L/2  and  show  that  y becomes  an 
even  function  of  u, 


From  this  we  can  see  that  the  maximum  deflection  in  the  middle  at  u = 0(x  = L/2)  is  5/oL4/(16  • 2AEI).  Recall 
that  the  positive  direction  points  downward. 


x = 0 


x = L 


(A)  Simply  supported 


(. B)  Clamped  at  both 
ends 


(C)  Clamped  at  the  left 
end,  free  at  the 
right  end 


Fig.  77.  Supports  of  a beam 
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GENERAL  SOLUTION 

Solve  the  following  ODEs,  showing  the  details  of  your 

work. 

1.  y"  + 3y"  + 3/  + y = ex  - x - 1 

2.  /"  + 2v"  v'  2y  = I - 4x3 

3.  (D4  + 10D2  + 9 l)y  = 6.5  sinh  2x 

4.  (D3  + 3D2  - 5D  - 39/)y  = -300  cosx 

5.  (x3D3  + x2D2  - 2 xD  + 21  )y  = x-2 

6.  (D3  + 4 D)y  = sin  x 

7.  (D3  - 9D2  + 21 D - 21I)y  = 27  sin  3x 

INITIAL  VALUE  PROBLEM 

Solve  the  given  IVP,  showing  the  details  of  your  work. 

8.  yiv  - 5y"  + 4y  = 10e_3x,  y(0)  = 1,  y'(0)  = 0, 
y"(0)  = 0,  y'"(0)  = 0 

9.  yiv  + 5y"  + 4y  = 90  sin  4x,  y(0)  = 1,  y'(0)  = 2, 
y"(0)  = -1,  y"'(0)  = -32 

10.  x3y’"  + xy'  — y = x2,  y(l)  = 1,  /(l)  = 3, 

/'(l)  = 14 

11.  (D3  - 2D2  - 3£>)y  = 74e_3xsinx,  y(0)  = -1.4, 
y'(0)  = 3.2,  y"(0)  = -5.2 

12.  (D3  - 2D2  -9 D + 1 8/)y  = e2x,  y(0)  = 4.5, 
y'(0)  = 8.8,  y"(0)  = 17.2 


13.  (D3  — 4 D)y  = 10  cos  x + 5 sin  x,  y(0)  = 3, 

y'(0)  = -2,  y"(0)  = -1 

14.  CAS  EXPERIMENT.  Undetermined  Coefficients. 

Since  variation  of  parameters  is  generally  complicated, 
it  seems  worthwhile  to  try  to  extend  the  other  method. 
Find  out  experimentally  for  what  ODEs  this  is  possible 
and  for  what  not.  Hint:  Work  backward,  solving  ODEs 
with  a CAS  and  then  looking  whether  the  solution 
could  be  obtained  by  undetermined  coefficients.  For 
example,  consider 

y — 3y  + 3y  - y = x ' e 

and 

x v + x y — 2xy  + 2v  = x In  x. 

15.  WRITING  REPORT.  Comparison  of  Methods.  Write 
a report  on  the  method  of  undetermined  coefficients  and 
the  method  of  variation  of  parameters,  discussing  and 
comparing  the  advantages  and  disadvantages  of  each 
method.  Illustrate  your  findings  with  typical  examples. 
Try  to  show  that  the  method  of  undetermined  coefficients, 
say,  for  a third-order  ODE  with  constant  coefficients  and 
an  exponential  function  on  the  right,  can  be  derived  from 
the  method  of  variation  of  parameters. 


T I O N S AND  PROBLEMS 


1.  What  is  the  superposition  or  linearity  principle?  For 
what  nth-order  ODEs  does  it  hold? 

2.  List  some  other  basic  theorems  that  extend  from 
second-order  to  nth-order  ODEs. 

3.  If  you  know  a general  solution  of  a homogeneous  linear 
ODE,  what  do  you  need  to  obtain  from  it  a general 
solution  of  a corresponding  nonhomogeneous  linear 
ODE? 

4.  What  form  does  an  initial  value  problem  for  an  nth- 
order  linear  ODE  have? 

5.  What  is  the  Wronskian?  What  is  it  used  for? 
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GENERAL  SOLUTION 


Solve  the  given  ODE.  Show  the  details  of  your  work. 


6.  yiv  - 3y"  - 4y  = 0 

7.  y"  + Ay"  + 13/  = 0 

8.  y"  - Ay"  - y'  + 4y  = 30e2x 

9.  (D4  - 16 1)y  = -15  coshx 
10.  x2y"'  + 3xy"  - 2y  = 0 


11.  y"  + 4.5y"  + 6.75/  + 3.375y  = 0 

12.  (D3  - D)y  = sinh  0.8x 

13.  (D3  + 6 D2  + 12 D + 8 1)y  = 8x2 

14.  (D4  - 13D2  + 36 1)y  = 12ex 

15.  4x3/”  + 3xy'  — 3y  = 10 
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INITIAL  VALUE  PROBLEM 


Solve  the  IVP.  Show  the  details  of  your  work. 

16.  (D3  - D2  - D + l)y  = 0,  y(0)  = 0,  Dy(0)  = 1, 
D2y(0)  = 0 

17.  /"  + 5y"  + 24/  + 20v  = x,  y(0)  = 1.94, 

/(0)  = -3.95,  y"  = -24 

18.  (D4  - 26 D2  + 25 1)y  = 50(x  + l)2,  v(0)  = 12.16, 
Dy(  0)  = -6,  D2y(  0)  = 34,  D3v(0)  = -130 

19.  (D3  + 9D2  + 23 D + 15 l)y  = 12exp(-4x), 
y(0)  = 9,  Dy(0)  = -41,  D2y(  0)  = 189 

20.  (D3  + 3D2  + 3D  + I)y  = 8 sinx,  y(0)  = -1, 
y'(0)  = -3,  /'(0)  = 5 


Summary  of  Chapter  3 
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SUMMARY  OF  CH  APTER  3 

Higher  Order  Linear  ODEs 


Compare  with  the  similar  Summary  of  Chap.  2 (the  case  n =2). 

Chapter  3 extends  Chap.  2 from  order  n = 2 to  arbitrary  order  n.  An  nth-order 
linear  ODE  is  an  ODE  that  can  be  written 

(1)  y(n)  + pn-l(x)y(n~X)  + • • • + pY(x)y'  + p0(x)y  = r(x) 

with  y(n>  = dny/dxn  as  the  first  term;  we  again  call  this  the  standard  form.  Equation 

(1)  is  called  homogeneous  if  r(x)  = 0 on  a given  open  interval  I considered, 
nonhomogeneous  if  r (x)  # 0 on  I.  For  the  homogeneous  ODE 

(2)  y(TO)  + pn_1(x)y(?l_1)  + • • • + pi(x)y'  + Po(x)y  = 0 

the  superposition  principle  (Sec.  3.1)  holds,  just  as  in  the  case  n = 2.  A basis  or 
fundamental  system  of  solutions  of  (2)  on  I consists  of  n linearly  independent 
solutions  yi,  • ■ • , yn  of  (2)  on  I.  A general  solution  of  (2)  on  1 is  a linear  combination 
of  these, 

(3)  y = cyvi  + • • • + cn  yn  (ci,  • ■ • , cn  arbitrary  constants). 

A general  solution  of  the  nonhomogeneous  ODE  (1)  on  1 is  of  the  form 

(4)  y = yh  + yp  (Sec.  3.3). 

Here,  yp  is  a particular  solution  of  (1)  and  is  obtained  by  two  methods  ( undetermined 
coefficients  or  variation  of  parameters)  explained  in  Sec.  3.3. 

An  initial  value  problem  for  (1)  or  (2)  consists  of  one  of  these  ODEs  and  n 
initial  conditions  (Secs.  3.1,  3.3) 

(5)  y(x0)  = K0,  y'(x0)  = K1,  •••,  y(m_1)(x0)  = Kn_x 

with  given  x0  in  I and  given  K0,  • • • , Kn_x  If  p0,  • • • , pn_  x , r are  continuous  on  I, 
then  general  solutions  of  (1)  and  (2)  on  I exist,  and  initial  value  problems  (1),  (5) 
or  (2),  (5)  have  a unique  solution. 


CHAPTER 


Systems  of  ODEs.  Phase  Plane. 
Qualitative  Methods 


Tying  in  with  Chap.  3,  we  present  another  method  of  solving  higher  order  ODEs  in 
Sec.  4.1.  This  converts  any  nth-order  ODE  into  a system  of  n first-order  ODEs.  We  also 
show  some  applications.  Moreover,  in  the  same  section  we  solve  systems  of  first-order 
ODEs  that  occur  directly  in  applications,  that  is,  not  derived  from  an  nth-order  ODE  but 
dictated  by  the  application  such  as  two  tanks  in  mixing  problems  and  two  circuits  in 
electrical  networks.  (The  elementary  aspects  of  vectors  and  matrices  needed  in  this  chapter 
are  reviewed  in  Sec.  4.0  and  are  probably  familiar  to  most  students.) 

In  Sec.  4.3  we  introduce  a totally  different  way  of  looking  at  systems  of  ODEs.  The 
method  consists  of  examining  the  general  behavior  of  whole  families  of  solutions  of  ODEs 
in  the  phase  plane , and  aptly  is  called  the  phase  plane  method.  It  gives  information  on  the 
stability  of  solutions.  ( Stability  of  a physical  system  is  desirable  and  means  roughly  that  a 
small  change  at  some  instant  causes  only  a small  change  in  the  behavior  of  the  system  at 
later  times.)  This  approach  to  systems  of  ODEs  is  a qualitative  method  because  it  depends 
only  on  the  nature  of  the  ODEs  and  does  not  require  the  actual  solutions.  This  can  be  very 
useful  because  it  is  often  difficult  or  even  impossible  to  solve  systems  of  ODEs.  In  contrast, 
the  approach  of  actually  solving  a system  is  known  as  a quantitative  method. 

The  phase  plane  method  has  many  applications  in  control  theory,  circuit  theory, 
population  dynamics  and  so  on.  Its  use  in  linear  systems  is  discussed  in  Secs.  4.3,  4.4, 
and  4.6  and  its  even  more  important  use  in  nonlinear  systems  is  discussed  in  Sec.  4.5  with 
applications  to  the  pendulum  equation  and  the  Lokta-Vol terra  population  model.  The 
chapter  closes  with  a discussion  of  nonhomogeneous  linear  systems  of  ODEs. 

NOTATION.  We  continue  to  denote  unknown  functions  by  y;  thus,  y\(t),  y2(t) — 
analogous  to  Chaps.  1-3.  (Note  that  some  authors  use  x for  functions,  Xi(f)>  x2(f)  when 
dealing  with  systems  of  ODEs.) 

Prerequisite:  Chap.  2. 

References  and  Answers  to  Problems:  App.  1 Part  A,  and  App.  2. 

4.0  For  Reference: 

Basics  of  Matrices  and  Vectors 
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For  clarity  and  simplicity  of  notation,  we  use  matrices  and  vectors  in  our  discussion 
of  linear  systems  of  ODEs.  We  need  only  a few  elementary  facts  (and  not  the  bulk  of 
the  material  of  Chaps.  7 and  8).  Most  students  will  very  likely  be  already  familiar 
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with  these  facts.  Thus  this  section  is  for  reference  only.  Begin  with  Sec.  4.1  and  consult 
4.0  as  needed. 

Most  of  our  linear  systems  will  consist  of  two  linear  ODEs  in  two  unknown  functions 
y\(t),  yz(t). 


y'l  = aiD'i  + a12y2,  y'i  = -5>'i  + 2 y2 

(1)  for  example, 

j4  = «2i'’i  + a22y2,  y'z  = 13yi  + \y2 

(perhaps  with  additional  given  functions  gi(t),  g2(t)  on  the  right  in  the  two  ODEs). 

Similarly,  a linear  system  of  n first-order  ODEs  in  n unknown  functions  yi(f),  ■ ■ • , yn(f) 
is  of  the  form 


(2) 


yi  - anyi  + fli2.v2  + 


“f  a\ nSn 


y'z  = fl2iVi  + a22y2  + 


T a2nyn 


yn  an  1 V 1 “f  Clnzyz  + ' ' " + Url/ny  rl 


(perhaps  with  an  additional  given  function  on  the  right  in  each  ODE). 

Some  Definitions  and  Terms 

Matrices.  In  (1)  the  (constant  or  variable)  coefficients  form  a 2 X 2 matrix  A,  that  is, 
an  array 


(3) 

II 

23 

II 

< 

an  ai2 

for  example, 

A 

. a21  a22 

Similarly,  the  coefficients  in  (2)  form  an  n 

x n matrix 

An 

ai2 

^1  n 

(4) 

A 

\fijk] 

a21 

a22 

a2n 

Ani 

an2 

dnn 

The  an,  ci\2,  ■ ■ ■ are  called  entries,  the  horizontal  lines  rows,  and  the  vertical  lines  columns. 
Thus,  in  (3)  the  first  row  is  [«n  ciy2\,  the  second  row  is  [a21  a22],  and  the  first  and 

second  columns  are 


an 

«12 

and 

«21 

«22 

In  the  “ double  subscript  notation ” for  entries,  the  first  subscript  denotes  the  row  and  the 
second  the  column  in  which  the  entry  stands.  Similarly  in  (4).  The  main  diagonal  is  the 
diagonal  flu  a22  ■ ■ ■ ann  in  (4).  hence  flu  a22  in  (3). 
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We  shall  need  only  square  matrices,  that  is,  matrices  with  the  same  number  of  rows 
and  columns,  as  in  (3)  and  (4). 

Vectors.  A column  vector  x with  n components  x1;  ■ ■ ■ , xn  is  of  the  form 


Xi 

*2 


thus  if  n = 2, 


x = 


Xi 

*2 


Xn 

Similarly,  a row  vector  v is  of  the  form 

v = [Ui  • • • vn],  thus  if  n = 2,  then  v = [Vi  u2]- 


Calculations  with  Matrices  and  Vectors 

Equality.  Two  n X n matrices  are  equal  if  and  only  if  corresponding  entries  are  equal. 
Thus  for  n = 2,  let 


A = 

flu 

fl12 

and 

B = 

bu 

^12 

fl21 

a22 

^21 

^22 

Then  A = B if  and  only  if 

all  = ^ll>  fl12  = ^12 

fl2i  = a22  = t>22- 

Two  column  vectors  (or  two  row  vectors)  are  equal  if  and  only  if  they  both  have  n 
components  and  corresponding  components  are  equal.  Thus,  let 


V = 

Vl 

and  x = 

*1 

Then 

Vl  = X! 

v = x if  and  only  if 

V2 

*2 

v2  = x2 

Addition  is  performed  by  adding  corresponding  entries  (or  components);  here,  matrices 
must  both  be  n X n,  and  vectors  must  both  have  the  same  number  of  components.  Thus 
for  n = 2, 


«n  + ^11 

«12  + bi2 

Vl  + Xi 

(5) 

A + B = 

«21  + ^21 

«22  + b22 

, V + X = 

v2  + X2 

Scalar  multiplication  (multiplication  by  a number  c)  is  performed  by  multiplying  each 
entry  (or  component)  by  c.  For  example,  if 


9 

3' 

'-63 

-2l" 

A = 

-2 

0 

, then 

— 7A  = 

14 

0 
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If 


" 0.4  " 

4 

V = 

-13 

, then 

lOv  = 

-130 

Matrix  Multiplication.  The  product  C = AB  (in  this  order)  of  two  n X n matrices 
A = [ajk]  and  B = f bjk]  is  the  n X n matrix  C = [c]k]  with  entries 


(6) 


cjk 


QjmPmk 


m=  1 


j = 1 , ■ ■ • , « 

k = 1 , • • • , n. 


that  is,  multiply  each  entry  in  the  /th  row  of  A by  the  corresponding  entry  in  the  Ath  column 
of  B and  then  add  these  n products.  One  says  briefly  that  this  is  a “multiplication  of  rows 
into  columns.”  For  example, 


9 3" 

"l  -4 

9 • 1 + 3 • 2 

9 

(-4)  + 3-5' 

-2  0 

2 5 

-2 • 1 + 0 • 2 

(-2) 

(-4)  + 0-5 

15  -21 

-2  8 


CAUTION!  Matrix  multiplication  is  not  commutative,  AB  # BA  in  general.  In  our 
example, 


'l  -4 

9 3' 

1 • 9 + (-4)  • (-2) 

1 • 3 + (-4)  • 0" 

2 5 

-2  0 

2 • 9 + 5 • (-2) 

2 • 3 + 5 • 0 

17  3 

8 6 


Multiplication  of  an  n X n matrix  A by  a vector  x with  n components  is  defined  by  the 
same  rule:  v = Ax  is  the  vector  with  the  n components 

n 

Vj  ajmxm  j 1>  * * * > 71* 

m=  1 

For  example, 


' 12  i 

Xl 

\2xi  + 7x2 

co 

00 

1 

1 

.X2. 

— 8^1  + 3.r2 

Systems  of  ODEs  as  Vector  Equations 

Differentiation.  The  derivative  of  a matrix  (or  vector)  with  variable  entries  (or 
components)  is  obtained  by  differentiating  each  entry  (or  component).  Thus,  if 


y(t) 


yi(t) 

'e-2t' 

Mt) 

sin  t 

T— 1 

1 

' —2e~zt' 

ykO. 

cos  t 

then  y'(t) 
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Using  matrix  multiplication  and  differentiation,  we  can  now  write  (1)  as 


f ' 

y i 

An 

«12 

yi 

t 

y = 

r 

y2 

= Ay  = 

a21 

a22 

y2 

’-5 

2 

yi 

13 

1 

2 _ 

.>'2. 

Similarly  for  (2)  by  means  of  an  n X n matrix  A and  a column  vector  y with  n components, 
namely,  y = Ay.  The  vector  equation  (7)  is  equivalent  to  two  equations  for  the 
components,  and  these  are  precisely  the  two  ODEs  in  (1). 

Some  Further  Operations  and  Terms 

Transposition  is  the  operation  of  writing  columns  as  rows  and  conversely  and  is  indicated 
by  T.  Thus  the  transpose  AT  of  the  2X2  matrix 


All 

a12 

'-5  2 

an 

fl21 

’-5 

13" 

A = 

fl21 

a22 

— 

i 

tOlM 

1 

is  At  = 

a12 

a22 

— 

2 

1 

2 

The  transpose  of  a column  vector,  say. 


v = 


Vi 

V2 


is  a row  vector, 


vT  = [ui  inl- 


and conversely. 

Inverse  of  a Matrix.  The  n X n unit  matrix  I is  the  n X n matrix  with  main  diagonal 
1,  1,  • • • , 1 and  all  other  entries  zero.  If,  for  a given  n X n matrix  A,  there  is  an  n X n 
matrix  B such  that  AB  = BA  = I,  then  A is  called  nonsingular  and  B is  called  the  inverse 
of  A and  is  denoted  by  A-1;  thus 

(8)  A A-1  = A-1  A = I. 

The  inverse  exists  if  the  determinant  det  A of  A is  not  zero. 

If  A has  no  inverse,  it  is  called  singular.  For  n = 2, 

fl22  ~fl12 
— fl21  all 


(9) 


A'1  = 


1 

det  A 


where  the  determinant  of  A is 


(10) 


det  A = 


an 


a21 


a\2 

a22 


~ alla22  a12a21- 


(For  general  n,  see  Sec.  7.7,  but  this  will  not  be  needed  in  this  chapter.) 

Linear  Independence,  r given  vectors  v(1),  • • • , \<r>  with  n components  are  called  a 
linearly  independent  set  or,  more  briefly,  linearly  independent,  if 


(11) 


Clv(1)  + • • • + crv(r)  = 0 
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implies  that  all  scalars  Ci,  • ■ ■ , cr  must  be  zero;  here,  0 denotes  the  zero  vector,  whose  n 
components  are  all  zero.  If  (11)  also  holds  for  scalars  not  all  zero  (so  that  at  least  one  of 
these  scalars  is  not  zero),  then  these  vectors  are  called  a linearly  dependent  set  or,  briefly, 
linearly  dependent,  because  then  at  least  one  of  them  can  be  expressed  as  a linear 
combination  of  the  others;  that  is,  if,  for  instance,  ci  =£  0 in  (11),  then  we  can  obtain 

vC1)=-^(c2v(2>+  ...  + crvw). 


Eigenvalues,  Eigenvectors 

Eigenvalues  and  eigenvectors  will  be  very  important  in  this  chapter  (and,  as  a matter  of 
fact,  throughout  mathematics). 

Let  A = [ajk]  be  an  n X n matrix.  Consider  the  equation 

(12)  Ax  = Ax 


where  A is  a scalar  (a  real  or  complex  number)  to  be  determined  and  x is  a vector  to  be 
determined.  Now,  for  every  A,  a solution  is  x = 0.  A scalar  A such  that  (12)  holds  for 
some  vector  x 0 is  called  an  eigenvalue  of  A,  and  this  vector  is  called  an  eigenvector 
of  A corresponding  to  this  eigenvalue  A. 

We  can  write  (12)  as  Ax  — Ax  = 0 or 

(13)  (A  - AI)x  = 0. 

These  are  n linear  algebraic  equations  in  the  n unknowns  x i , ■ ■ • , xn  (the  components 
of  x).  For  these  equations  to  have  a solution  x 0,  the  determinant  of  the  coefficient 
matrix  A — AI  must  be  zero.  This  is  proved  as  a basic  fact  in  linear  algebra  (Theorem  4 
in  Sec.  7.7).  In  this  chapter  we  need  this  only  for  n = 2.  Then  (13)  is 


(14) 


in  components. 


(14*) 


on  A 

fl12 

Xi 

"o' 

o2i 

a22  ~ A 

.*2. 

0 

(an  - A)*!  + ai2*2  = 0 

«21*1  + («22  - A)x2  = 0. 


Now  A — AI  is  singular  if  and  only  if  its  determinant  det  (A  — AI),  called  the  characteristic 
determinant  of  A (also  for  general  n),  is  zero.  This  gives 


det  (A  - AI) 


An  A a12 
^21  fl22  — A 


— (an  — A)(a22  — A)  — fli2fl2i 

= A2  — (flu  + «22)A  + flllfl22  — Oi2fl21  = 0. 


(15) 
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This  quadratic  equation  in  A is  called  the  characteristic  equation  of  A.  Its  solutions  are 
the  eigenvalues  A]  and  A2  of  A.  First  determine  these.  Then  use  (14*)  with  A = A!  to 
determine  an  eigenvector  x(1)  of  A corresponding  to  A] . Finally  use  (14*)  with  A = A2 
to  find  an  eigenvector  x<2>  of  A corresponding  to  A2.  Note  that  if  x is  an  eigenvector  of 
A,  so  is  kx  with  any  k ¥=  0. 


EXAMPLE  Eigenvalue  Problem 

Find  the  eigenvalues  and  eigenvectors  of  the  matrix 


(16) 


-4.0 

4.0 

-1.6 

1.2 

Solution.  The  characteristic  equation  is  the  quadratic  equation 


det  | A — Al| 


-4  - A 
-1.6 


= A2  + 2.8A  + 1.6  = 0. 


It  has  the  solutions  Ai  = —2  and  A2  = —0.8.  These  are  the  eigenvalues  of  A. 

Eigenvectors  are  obtained  from  (14*).  For  A = Ai  = —2  we  have  from  (14*) 

(-4.0  + 2.0)*  1 + 4.0x  2 = 0 

— 1.6*1  + (1.2  + 2.0)*  2 = 0. 

A solution  of  the  first  equation  is  *1  = 2,  *2  = 1.  This  also  satisfies  the  second  equation.  (Why?)  Hence  an 
eigenvector  of  A corresponding  to  Ai  = —2.0  is 


'2 

1 ' 

(17) 

x(1,= 

1 

Similarly, 

x(2)  = 

0.8 

is  an  eigenvector  of  A corresponding  to  A2  = —0.8,  as  obtained  from  (14*)  with  A = A2.  Verify  this. 


4.  Systems  of  ODEs  as  Models 
in  Engineering  Applications 

We  show  how  systems  of  ODEs  are  of  practical  importance  as  follows.  We  first  illustrate 
how  systems  of  ODEs  can  serve  as  models  in  various  applications.  Then  we  show  how  a 
higher  order  ODE  (with  the  highest  derivative  standing  alone  on  one  side)  can  be  reduced 
to  a first-order  system. 

Mixing  Problem  Involving  Two  Tanks 

A mixing  problem  involving  a single  tank  is  modeled  by  a single  ODE,  and  you  may  first  review  the 
corresponding  Example  3 in  Sec.  1.3  because  the  principle  of  modeling  will  be  the  same  for  two  tanks.  The 
model  will  be  a system  of  two  first-order  ODEs. 

Tank  7i  and  T2  in  Fig.  78  contain  initially  100  gal  of  water  each.  In  7f  the  water  is  pure,  whereas  150  lb  of 
fertilizer  are  dissolved  in  7^.  By  circulating  liquid  at  a rate  of  2 gal/min  and  stirring  (to  keep  the  mixture  uniform) 
the  amounts  of  fertilizer  yi(r)  in  T\  and  y2ti)  in  T2  change  with  time  t.  How  long  should  we  let  the  liquid  circulate 
so  that  7i  will  contain  at  least  half  as  much  fertilizer  as  there  will  be  left  in  72? 
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System  of  tanks 

Fig.  78.  Fertilizer  content  in  Tanks  7",  (lower  curve)  and  T2 


Solution.  Step  1.  Setting  up  the  model.  As  for  a single  tank,  the  time  rate  of  change  yj(f)  of  \-\it)  equals 
inflow  minus  outflow.  Similarly  for  tank  T2.  From  Fig.  78  we  see  that 


yi  = Inflow/min  — Outflow/min  = y2 yi 

100  100 


(Tank  71) 


y2  = Inflow/min  — Outflow/min  = y-i y2 

100 ' 100 


(Tank  T2). 


Hence  the  mathematical  model  of  our  mixture  problem  is  the  system  of  first-order  ODEs 

(Tank  71) 
(Tank  T2). 


y'\  = —0.02}’!  + 0.02y2 
y2  = 0.02 y1  — 0.02y2 


As  a vector  equation  with  column  vector  y 


yi 

>2 


and  matrix  A this  becomes 


y'  = Ay. 


where 


A = 


-0.02 

0.02 


0.02 

-0.02 


Step  2.  General  solution.  As  for  a single  equation,  we  try  an  exponential  function  of  t. 

(1)  y = xeM.  Then  y'  = AxeAt  = Axeu. 

Dividing  the  last  equation  AxcAt  = AxeA(  by  eAt  and  interchanging  the  left  and  right  sides,  we  obtain 


Ax  = Ax. 


We  need  nontrivial  solutions  (solutions  that  are  not  identically  zero).  Hence  we  have  to  look  for  eigenvalues 
and  eigenvectors  of  A.  The  eigenvalues  are  the  solutions  of  the  characteristic  equation 


(2) 


det  (A  — AI)  = 


-0.02  - A 
0.02 


0.02 

-0.02  - A 


= (-0.02  - A)2  - 0.022  = A(A 


0.04)  = 0. 


We  see  that  Ai  = 0 (which  can  very  well  happen — don’t  get  mixed  up — it  is  eigen  vectors  that  must  not  be  zero) 
and  A2  = —0.04.  Eigenvectors  are  obtained  from  (14*)  in  Sec.  4.0  with  A = 0 and  A = —0.04.  For  our  present 
A this  gives  [we  need  only  the  first  equation  in  (14*)] 


-0.02.ri  + 0.02*2  — 0 and  (—0.02  + 0.04)*  1 + 0.02*2  = 0, 
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respectively.  Hence  xi  = X2  and  Xi  = —X2,  respectively,  and  we  can  take  *i  — X2  — 1 and  Xi  = —X2  — 1. 
This  gives  two  eigenvectors  corresponding  to  Aj.  = 0 and  A2  = —0.04,  respectively,  namely, 


1 

f 

X<D  = 

1 

and 

X®  = 

-1 

From  (1)  and  the  superposition  principle  (which  continues  to  hold  for  systems  of  homogeneous  linear  ODEs) 
we  thus  obtain  a solution 


1 

1 

(3) 

y = Clx(1)cAlt  + c2x™ex*  = Cl 

1 

+ c2 

-1 

where  ci  and  C2  are  arbitrary  constants.  Later  we  shall  call  this  a general  solution. 

Step  3.  Use  of  initial  conditions.  The  initial  conditions  are  yi(0)  = 0 (no  fertilizer  in  tank  7i)  and  y2(0)  — 150. 
From  this  and  (3)  with  t = 0 we  obtain 


1 

1 

Cl  + 

C2 

0 

+ £2 

= 

= 

1 

-1 

. C1  " 

£2. 

1 5° 

In  components  this  is  C\  + C2  — 0,  Ci  — C2  = 150.  The  solution  is  C\  = 75,  C2  = —75.  This  gives  the  answer 


1 

1 

y = 75x(1)  - 75x(2)c_004t  = 75 

1 

- 75 

-1 

In  components, 

yi  = 75  — 15e~0  04t  (Tank  7i,  lower  curve) 

y2  = 75  + 15e~omt  (Tank  T2,  upper  curve). 


Figure  78  shows  the  exponential  increase  of  yi  and  the  exponential  decrease  of  y2  to  the  common  limit  75  lb. 
Did  you  expect  this  for  physical  reasons?  Can  you  physically  explain  why  the  curves  look  “symmetric”?  Would 
the  limit  change  if  7i  initially  contained  100  lb  of  fertilizer  and  T2  contained  50  lb? 

Step  4.  Answer.  T\  contains  half  the  fertilizer  amount  of  T2  if  it  contains  1/3  of  the  total  amount,  that  is, 
50  lb.  Thus 


yi  = 75  - 75<T004t  = 50,  e~0Mt  = g,  t = (In  3)/0.04  = 27.5. 

Hence  the  fluid  should  circulate  for  at  least  about  half  an  hour. 

Electrical  Network 

Find  the  currents  Ii(t)  and  I^if)  in  the  network  in  Fig.  79.  Assume  all  currents  and  charges  to  be  zero  at  t = 0, 
the  instant  when  the  switch  is  closed. 


L = 1 henry  C = 0.25  farad 


Fig.  79.  Electrical  network  in  Example  2 

Solution.  Step  1.  Setting  up  the  mathematical  model.  The  model  of  this  network  is  obtained  from 
Kirchhoff’s  Voltage  Law,  as  in  Sec.  2.9  (where  we  considered  single  circuits).  Let  Ii(t)  and  I2 (t)  be  the  currents 
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in  the  left  and  right  loops,  respectively.  In  the  left  loop,  the  voltage  drops  are  L/l  = /J  [V]  over  the  inductor 
and  Ri(I\  — I2 ) = 4(/i  — I2 ) [V]  over  the  resistor,  the  difference  because  I\  and  I2  flow  through  the  resistor  in 
opposite  directions.  By  Kirchhoff  s Voltage  Law  the  sum  of  these  drops  equals  the  voltage  of  the  battery;  that 
is,  l[  + 4(7 1 — I2 ) — 12,  hence 

(4a)  l[  = -4/i  + 4/2  + 12. 

In  the  right  loop,  the  voltage  drops  are  ^2/2  = 6/2  [V]  and  R1V2  ~ h)  = MI2  ~ h)  [V]  over  the  resistors  and 
(7/C)J  ^dt  = 4J  ^dt  [V]  over  the  capacitor,  and  their  sum  is  zero, 

6/2  + 4(/2  - /1)  + 4 \ I2  dt  = 0 or  10/2  - 4/x  + 4 I2dt  = 0. 


Division  by  10  and  differentiation  gives  I2  ~ 0.4/1  + O.4/2  — 0. 

To  simplify  the  solution  process,  we  first  get  rid  of  0.4/1,  which  by  (4a)  equals  0.4(— 4/i  + 4/2  + 12). 
Substitution  into  the  present  ODE  gives 

I2  = 0.4/1  - 0.4/2  = 0.4(— 4/i  + 4/2  + 12)  - 0.4/2 


and  by  simplification 

(4b)  I2  = -I.6/1  + 1.2/2  + 4.8. 

In  matrix  form,  (4)  is  (we  write  J since  I is  the  unit  matrix) 


h 

‘ -4.0 

4.0 

12.0 

(5) 

J'  = AJ  + g, 

where 

j = 

I 2 

, A = 

-1.6 

1.2 

• g = 

4.8 

Step  2.  Solving  (5).  Because  of  the  vector  g this  is  a nonhomo  gene  ous  system,  and  we  try  to  proceed  as  for  a 
single  ODE,  solving  first  the  homogeneous  system  = AJ  (thus  j'  — AJ  = 0)  by  substituting  J = xext.  This 
gives 


Jr  = \xeM  = Ax£At,  hence  Ax  = Ax. 

Hence,  to  obtain  a nontrivial  solution,  we  again  need  the  eigenvalues  and  eigenvectors.  For  the  present  matrix 
A they  are  derived  in  Example  1 in  Sec.  4.0: 


'2 

x(2)  = 

1 ’ 

; A2  = -0.8, 

1 

0.8 

Hence  a “general  solution”  of  the  homogeneous  system  is 

= clX(1)^-2t  + c2xC2)e-°-8t. 

For  a particular  solution  of  the  nonhomogeneous  system  (5),  since  g is  constant,  we  try  a constant  column 
vector  Jp  = a with  components  a±,  «2-  Then  Jp  = 0,  and  substitution  into  (5)  gives  Aa  + g = 0;  in  components, 


— 4.0^1  + 4.0fl2  "f  12.0  — 0 
— 1.6^1  + 1.2«2  + 4.8  = 0. 


The  solution  is  ci\  = 3,  02  — 0;  thus  a = 


3 


0 


. Hence 


(6)  J = Jh  + Jp  = Cixa)<r2‘  + c2x<2)e-°-8t  + a; 

in  components, 

h = 2cie_2t  + c2e~08t  + 3 

h = c^2t  + 0.8  c2e~08t. 
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The  initial  conditions  give 


/i(0)  — 2c\  + c2  + 3 — 0 
/2(0)  = d + 0.8c2  = 0. 

Hence  Ci  = — 4 and  c2  = 5.  As  the  solution  of  our  problem  we  thus  obtain 
(7)  J = -4xa)e~2t  + 5x(2)£-0'8t  + a. 

In  components  (Fig.  80b), 

A = Se~2t  + 5e~08t  + 3 
/2  = -4e~2t  + 4e~08t. 


Now  comes  an  important  idea,  on  which  we  shall  elaborate  further,  beginning  in  Sec.  4.3.  Figure  80a  shows 
Ii(t)  and  I2(t)  as  two  separate  curves.  Figure  80b  shows  these  two  currents  as  a single  curve  Ui(t),  /2(f)]  in  the 
/i/2-plane.  This  is  a parametric  representation  with  time  t as  the  parameter.  It  is  often  important  to  know  in 
which  sense  such  a curve  is  traced.  This  can  be  indicated  by  an  arrow  in  the  sense  of  increasing  t,  as  is  shown. 
The  /i/2-plane  is  called  the  phase  plane  of  our  system  (5),  and  the  curve  in  Fig.  80b  is  called  a trajectory.  We 
shall  see  that  such  “phase  plane  representations”  are  far  more  important  than  graphs  as  in  Fig.  80a  because 
they  will  give  a much  better  qualitative  overall  impression  of  the  general  behavior  of  whole  families  of  solutions, 
not  merely  of  one  solution  as  in  the  present  case. 


1.5  - 

i - J 

0.5  - 

ol I I L_ I L 

0 1 2 3 4 5 


(a)  Currents  I 
(upper  curve) 
and  I2 


(b)  Trajectory  I2(t)f 
in  the  II- plane 
(the  "phase  plane”) 


Fig.  80.  Currents  in  Example  2 


I 


1 


Remark.  In  both  examples,  by  growing  the  dimension  of  the  problem  (from  one  tank  to 
two  tanks  or  one  circuit  to  two  circuits)  we  also  increased  the  number  of  ODEs  (from  one 
ODE  to  two  ODEs).  This  “growth”  in  the  problem  being  reflected  by  an  “increase”  in  the 
mathematical  model  is  attractive  and  affirms  the  quality  of  our  mathematical  modeling  and 
theory. 


Conversion  of  an  nth-Order  ODE  to  a System 

We  show  that  an  nth-order  ODE  of  the  general  form  (8)  (see  Theorem  1)  can  be  converted 
to  a system  of  n first-order  ODEs.  This  is  practically  and  theoretically  important — 
practically  because  it  permits  the  study  and  solution  of  single  ODEs  by  methods  for 
systems,  and  theoretically  because  it  opens  a way  of  including  the  theory  of  higher  order 
ODEs  into  that  of  first-order  systems.  This  conversion  is  another  reason  for  the  importance 
of  systems,  in  addition  to  their  use  as  models  in  various  basic  applications.  The  idea  of 
the  conversion  is  simple  and  straightforward,  as  follows. 
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THEOREM  1 


PROOF 
EXAMPLE  3 


Conversion  of  an  ODE 

An  nth-order  ODE 

(8) 

can  be  converted  to  a system  of  n first-order  ODEs  by  setting 

II 

t-H 

w- 

f ft  (n—  1) 

y2  = y , y3  = y ,■■■  ,yn  = y 

This  system  is  of  the  form 

<N 

II 

rH 

(10) 

CO 

II  ••• 

- <N 

t _ 

yn—i  yn 

y'n  = F(t,  yi,  yz,  • ■ ■ , yn)- 

The  first  n — 1 of  these  n ODEs  follows  immediately  from  (9)  by  differentiation.  Also, 
y'n  = y(n>  by  (9),  so  that  the  last  equation  in  (10)  results  from  the  given  ODE  (8). 

Mass  on  a Spring 

To  gain  confidence  in  the  conversion  method,  let  us  apply  it  to  an  old  friend  of  ours,  modeling  the  free  motions 
of  a mass  on  a spring  (see  Sec.  2.4) 


my"  + cy'  + ky  = 0 or  y" 


k 

— y. 
m 


For  this  ODE  (8)  the  system  (10)  is  linear  and  homogeneous, 

y'l  = y2 


Setting  y = 


, we  get  in  matrix  form 


y'  = Ay  = 


k_ 

m m 


y i 

V2 


The  characteristic  equation  is 

det  (A  — AI)  = 


-A  1 

k c 

- — A 

m m 


= A 


= 0. 
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It  agrees  with  that  in  Sec.  2.4.  For  an  illustrative  computation,  let  m = l,c  = 2,  and  k = 0.75.  Then 
A2  + 2A  + 0.75  = (A  + 0.5)(A  + 1.5)  = 0. 


This  gives  the  eigenvalues  Ai  = — 0.5  and  A2  = —1.5.  Eigenvectors  follow  from  the  first  equation  in  A — AI  = 0, 
which  is  — Ajci  + JC2  — 0.  For  Ai  this  gives  0.5*  1 + *2  — 0,  say,  x\  = 2,  *2  = — 1-  For  A2  = —1.5  it  gives 
1.5*1  + *2  = 0,  say,  *1  = 1,  *2  = — 1.5.  These  eigenvectors 


2 

1 

2 

1 

, x<2,= 

give  y = Ci 

<G°-5t  + c2 

-1 

— l-5 

-1 

— 1.5 

This  vector  solution  has  the  first  component 


o —0.5 1 1 — 1.5t 

y = y 1 = 2ci«  + c2e 


which  is  the  expected  solution.  The  second  component  is  its  derivative 


y2=y'1=y'  = - 1 .5^ “ 15t. 


FROBITE^SF-T=4^1 


1-6 


MIXING  PROBLEMS 


1.  Find  out,  without  calculation,  whether  doubling  the 
flow  rate  in  Example  1 has  the  same  effect  as  halfing 
the  tank  sizes.  (Give  a reason.) 


2.  What  happens  in  Example  1 if  we  replace  by  a tank 
containing  200  gal  of  water  and  150  lb  of  fertilizer 
dissolved  in  it? 

3.  Derive  the  eigenvectors  in  Example  1 without  consulting 
this  book. 


4.  In  Example  1 find  a “general  solution”  for  any  ratio 
a = (flow  rate)/(tank  size),  tank  sizes  being  equal. 
Comment  on  the  result. 


5.  If  you  extend  Example  1 by  a tank  of  the  same  size 
as  the  others  and  connected  to  T2  by  two  tubes  with 
flow  rates  as  between  7(  and  T2.  what  system  of  ODEs 
will  you  get? 

6.  Find  a “general  solution”  of  the  system  in  Prob.  5. 


7-9 


ELECTRICAL  NETWORK 


In  Example  2 find  the  currents: 


7.  If  the  initial  currents  are  0 A and  —3  A (minus  meaning 
that  /2(0)  flows  against  the  direction  of  the  arrow). 

8.  If  the  capacitance  is  changed  to  C = 5/27  F.  (General 
solution  only.) 


9.  If  the  initial  currents  in  Example  2 are  28  A and  14  A. 


10-13 


CONVERSION  TO  SYSTEMS 


Find  a general  solution  of  the  given  ODE  (a)  by  first  converting 
it  to  a system,  (b),  as  given.  Show  the  details  of  your  work. 
10.  y"  + 3y'  + 2y  = 0 11.  4y"  - 15/  - 4y  = 0 

12.  /"  + 2y"  - y - 2y  = 0 

13.  y"  + 2/  - 24y  = 0 


14.  TEAM  PROJECT.  Two  Masses  on  Springs,  (a)  Set 

up  the  model  for  the  (undamped)  system  in  Fig.  81. 

(b)  Solve  the  system  of  ODEs  obtained.  Hint.  Try 
y = xewt  and  set  co2  = A.  Proceed  as  in  Example  1 or 
2.  (c)  Describe  the  influence  of  initial  conditions  on  the 
possible  kind  of  motions. 


System  in 
static 

equilibrium 


System  in 
motion 


Fig.  81.  Mechanical  system  in  Team  Project 


15.  CAS  EXPERIMENT.  Electrical  Network,  (a)  In 

Example  2 choose  a sequence  of  values  of  C that 
increases  beyond  bound,  and  compare  the  corresponding 
sequences  of  eigenvalues  of  A.  What  limits  of  these 
sequences  do  your  numeric  values  (approximately) 
suggest? 

(b)  Find  these  limits  analytically. 

(c)  Explain  your  result  physically. 

(d)  Below  what  value  (approximately)  must  you  decrease 
C to  get  vibrations? 
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4.2  Basic  Theory  of  Systems  of  ODEs. 
Wronskian 


In  this  section  we  discuss  some  basic  concepts  and  facts  about  system  of  ODEs  that  are 
quite  similar  to  those  for  single  ODEs. 

The  first-order  systems  in  the  last  section  were  special  cases  of  the  more  general  system 

>’i  = flit,  ylt  ■ ■ ■ , yn) 

y'z  = fiit,  Ti.  • ■ ■ , yn) 

(l) 


y'n  = fn(t » Tl.  • • ■ . yn)- 

We  can  write  the  system  (1)  as  a vector  equation  by  introducing  the  column  vectors 
y — [ y i • • • yn]T  and  f = [/i  ■ • • fn]T  (where  T means  transposition  and  saves  us 

the  space  that  would  be  needed  for  writing  y and  f as  columns).  This  gives 

(1)  y'  = f it,  y). 

This  system  (1)  includes  almost  all  cases  of  practical  interest.  For  n = 1 it  becomes 
y i = hit,  vi)  or,  simply,  y = fit,  y),  well  known  to  us  from  Chap.  1. 

A solution  of  (1)  on  some  interval  a < t < b is  a set  of  n differentiable  functions 


yi  = hit),  •••,  yn  = hn(t) 


on  a < t < b that  satisfy  (1)  throughout  this  interval.  In  vector  from,  introducing  the 
“ solution  vector ” h = [h\  • • ■ hn]J  (a  column  vector!)  we  can  write 

y = h(t). 

An  initial  value  problem  for  (1)  consists  of  (1)  and  n given  initial  conditions 

(2)  yi(fo)  = Ki,  y2ito)  = k2,  ■■■,  ynito)  = Kn, 

in  vector  form,  y(r())  = K,  where  1 o is  a specified  value  of  t in  the  interval  considered  and 
the  components  of  K = \K\  ■ ■ ■ Kn]J  are  given  numbers.  Sufficient  conditions  for  the 

existence  and  uniqueness  of  a solution  of  an  initial  value  problem  (1),  (2)  are  stated  in 
the  following  theorem,  which  extends  the  theorems  in  Sec.  1.7  for  a single  equation.  (For 
a proof,  see  Ref.  [A7].) 


THEOREM  1 


Existence  and  Uniqueness  Theorem 

Let  /i,  • ■ • ,fn  in  (1)  be  continuous  functions  having  continuous  partial  derivatives 
dfi/dyi,  ■■■,  Bh/dyn,  • • • , dfn/dyn  in  some  domain  R of  tyiy2  ■ ■ • yn-space 
containing  the  point  (to,  K\,  ■ ■ • , Kn).  Then  (1)  has  a solution  on  some  interval 
tg  — a < t < to  + a satisfying  (2),  and  this  solution  is  unique. 
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THEOREM  2 


THEOREM  3 


PROOF 


Linear  Systems 

Extending  the  notion  of  a linear  ODE,  we  call  (1)  a linear  system  if  it  is  linear  in 
yi,  • • • , yn;  that  is,  if  it  can  be  written 


y'l  = «n(0yi  + ■ • • + aln(t)yn  + gi  (t) 


(3) 


y'n  = Onl(t)yi  + • • • + ann(t)yn  + gn(t). 


As  a vector  equation  this  becomes 

(3)  y'  = Ay  + g 


an 

ain 

yi 

8i 

where 

A = 

. y = 

. g = 

_a.nl 

ann_ 

J/n_ 

_Sn_ 

This  system  is  called  homogeneous  if  g = 0,  so  that  it  is 
(4)  y'  = Ay. 

If  g A 0,  then  (3)  is  called  nonhomogeneous.  For  example,  the  systems  in  Examples  1 and 
3 of  Sec.  4. 1 are  homogeneous.  The  system  in  Example  2 of  that  section  is  nonhomogeneous. 

For  a linear  system  (3)  we  have  5/i/dyi  = fln(r),  • • • , dfn/dyn  = ann(t)  in  Theorem  1. 
Hence  for  a linear  system  we  simply  obtain  the  following. 


Existence  and  Uniqueness  in  the  Linear  Case 

Let  the  Oj^’s  and  gj’s  in  (3)  be  continuous  functions  of  t on  an  open  interval 
a < t < (3  containing  the  point  t = to.  Then  (3)  has  a solution  y (t)  on  this  interval 
satisfying  (2),  and  this  solution  is  unique. 


As  for  a single  homogeneous  linear  ODE  we  have 

Superposition  Principle  or  Linearity  Principle 

If  y(1>  and  y<2)  are  solutions  of  the  homogeneous  linear  system  (4)  on  some  interval, 
so  is  any  linear  combination  y = C\  y(1)  + c\  y<2'1. 

Differentiating  and  using  (4),  we  obtain 

y'  = [ciy(1)  + o y(2)] ' 

= ciy(1)'  + c2y(2)' 

= c1Ay(1)  + c2Ay(2) 

= A(Clycl)  + c2y(2))  = Ay. 
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The  general  theory  of  linear  systems  of  ODEs  is  quite  similar  to  that  of  a single  linear 
ODE  in  Secs.  2.6  and  2.7.  To  see  this,  we  explain  the  most  basic  concepts  and  facts.  For 
proofs  we  refer  to  more  advanced  texts,  such  as  [A7J. 

Basis.  General  Solution.  Wronskian 

By  a basis  or  a fundamental  system  of  solutions  of  the  homogeneous  system  (4)  on  some 
interval  J we  mean  a linearly  independent  set  of  n solutions  y(1),  • • ■ , \<n>  of  (4)  on  that 
interval.  (We  write  J because  we  need  I to  denote  the  unit  matrix.)  We  call  a corresponding 
linear  combination 

(5)  y = C!y(1)  • • • + creyCn)  (c1;  • • • , cn  arbitrary) 


a general  solution  of  (4)  on  J.  It  can  be  shown  that  if  the  aMf)  in  (4)  are  continuous  on 
J,  then  (4)  has  a basis  of  solutions  on  J,  hence  a general  solution,  which  includes  every 
solution  of  (4)  on  J. 

We  can  write  n solutions  ya:>,  • • • , y(n)  of  (4)  on  some  interval  J as  columns  of  an  n X n 
matrix 


(6) 


Y = [y(1) 


y(m)]- 


The  determinant  of  Y is  called  the  Wronskian  of  ycl),  • • ■ , y<'n),  written 


(7) 


My1 


a) 


•,y(w))  = 


y(iD  yi2) 

y2°  y® 


y(in) 

y2° 


y™  y™  ■■■  yT 


The  columns  are  these  solutions,  each  in  terms  of  components.  These  solutions  form  a 
basis  on  J if  and  only  if  W is  not  zero  at  any  1\  in  this  interval.  W is  either  identically 
zero  or  nowhere  zero  in  J.  (This  is  similar  to  Secs.  2.6  and  3.1.) 

If  the  solutions  ya),  • • • , yCn)  in  (5)  form  a basis  (a  fundamental  system),  then  (6)  is 
often  called  a fundamental  matrix.  Introducing  a column  vector  c = [q  C2  • ■ ■ cn]T, 
we  can  now  write  (5)  simply  as 


(8) 


y = Yc. 


Furthermore,  we  can  relate  (7)  to  Sec.  2.6,  as  follows.  If  y and  z are  solutions  of  a 
second-order  homogeneous  linear  ODE,  their  Wronskian  is 


W(y,  z) 


y 

I 

y 


z 


t 


z 


To  write  this  ODE  as  a system,  we  have  to  set  y = y i , y'  = y'\  = y2  and  similarly  for  z 
(see  Sec.  4.1).  But  then  W(y,  z)  becomes  (7),  except  for  notation. 
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4.:  Constant-Coefficient  Systems. 

Phase  Plane  Method 

Continuing,  we  now  assume  that  our  homogeneous  linear  system 

(1)  y'  = Ay 

under  discussion  has  constant  coefficients , so  that  the  n X n matrix  A = [a:ji-\  has  entries 
not  depending  on  t.  We  want  to  solve  (1).  Now  a single  ODE  y = ky  has  the  solution 
y = Cekt.  So  let  us  try 

(2)  y = xeAt. 

Substitution  into  (1)  gives  y*  = Axe'u  = Ay  = Axc/U.  Dividing  by  eAt,  we  obtain  the 

eigenvalue  problem 

(3)  Ax  = Ax. 


Thus  the  nontrivial  solutions  of  (1)  (solutions  that  are  not  zero  vectors)  are  of  the  form 
(2),  where  A is  an  eigenvalue  of  A and  x is  a corresponding  eigenvector. 

We  assume  that  A has  a linearly  independent  set  of  n eigenvectors.  This  holds  in  most 
applications,  in  particular  if  A is  symmetric  (a^j  = ajk)  or  skew-symmetric  (ci^j  = — a jjf) 
or  has  n different  eigenvalues. 

Let  those  eigenvectors  be  x(1),  • ■ • , x(n)  and  let  them  correspond  to  eigenvalues 
Ai,  • • ■ , An  (which  may  be  all  different,  or  some — or  even  all — may  be  equal).  Then  the 
corresponding  solutions  (2)  are 


Their  Wronskian  W = W(ya),  • ■ • , y(n))  [(7)  in  Sec.  4.2]  is  given  by 


*F>eAit  • 

■ x^eKt 

x?  • 

r(n) 

Xl 

W = (y(1),  • 

• • , y(m))  = 

x$>ex*  ■ 

Ait  + ■ 

= e 1 

■ + A nt 

X?  ■ 

v(n) 
X 2 

(1)  Ait 

Xyi  6 

(n)  A nt 

• xn  e n 

X?  • 

roo 

On  the  right,  the  exponential  function  is  never  zero,  and  the  determinant  is  not  zero  either 
because  its  columns  are  the  n linearly  independent  eigenvectors.  This  proves  the  following 
theorem,  whose  assumption  is  true  if  the  matrix  A is  symmetric  or  skew-symmetric,  or  if 
the  n eigenvalues  of  A are  all  different. 
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THEOREM  1 


EXAMPLE  1 


General  Solution 

If  the  constant  matrix  A in  the  system  (1)  has  a linearly  independent  set  of  n 
eigenvectors,  then  the  corresponding  solutions  ya),  • • • , y(n)  in  (4)  form  a basis  of 
solutions  of  ( 1),  and  the  corresponding  general  solution  is 

(5)  y = ciX(1)eAlt  + ■ ■ • + cmx(n)eA"t. 


How  to  Graph  Solutions  in  the  Phase  Plane 

We  shall  now  concentrate  on  systems  (1)  with  constant  coefficients  consisting  of  two 
ODEs 

y'\  = «ll.Vl  + fll2>’2 

(6)  y'  = Ay;  in  components, 

>’2  = a2iyi  + a22  y2- 


Of  course,  we  can  graph  solutions  of  (6), 


(7) 


y(0 


A2(0  J 


as  two  curves  over  the  f-axis,  one  for  each  component  of  y(f).  (Figure  80a  in  Sec.  4.1  shows 
an  example. ) But  we  can  also  graph  (7)  as  a single  curve  in  the  y-y  v'2-plane.  This  is  a parametric 
representation  (parametric  equation ) with  parameter  t.  (See  Fig.  80b  for  an  example.  Many 
more  follow.  Parametric  equations  also  occur  in  calculus.)  Such  a curve  is  called  a trajectory 
(or  sometimes  an  orbit  or  path ) of  (6).  The  y\y2- plane  is  called  the  phase  plane.1  If  we  fill 
the  phase  plane  with  trajectories  of  (6),  we  obtain  the  so-called  phase  portrait  of  (6). 

Studies  of  solutions  in  the  phase  plane  have  become  quite  important,  along  with 
advances  in  computer  graphics,  because  a phase  portrait  gives  a good  general  qualitative 
impression  of  the  entire  family  of  solutions.  Consider  the  following  example,  in  which 
we  develop  such  a phase  portrait. 


Trajectories  in  the  Phase  Plane  (Phase  Portrait) 

Find  and  graph  solutions  of  the  system. 

In  order  to  see  what  is  going  on,  let  us  find  and  graph  solutions  of  the  system 


(8) 


thus 


y'l  = -3yi  + y2 
yz=  yi~  3 y2- 


JA  name  that  comes  from  physics,  where  it  is  the  y-(mu)-plane,  used  to  plot  a motion  in  terms  of  position  y 
and  velocity  y'  = v (m  = mass);  but  the  name  is  now  used  quite  generally  for  the  y1y2"Plane- 
The  use  of  the  phase  plane  is  a qualitative  method,  a method  of  obtaining  general  qualitative  information 
on  solutions  without  actually  solving  an  ODE  or  a system.  This  method  was  created  by  HENRI  POINCARE 
(1854-1912),  a great  French  mathematician,  whose  work  was  also  fundamental  in  complex  analysis,  divergent 
series,  topology,  and  astronomy. 
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EXAMPLE  1 


Solution.  By  substituting  y = xeAt  and  y'  = Xxeu  and  dropping  the  exponential  function  we  get  Ax  = Ax. 
The  characteristic  equation  is 


det  (A  — AI)  = 


= A2  + 6A  + 8 = 0. 


This  gives  the  eigenvalues  Ai  = — 2 and  A2  = —4.  Eigenvectors  are  then  obtained  from 


(—3  — A)xi  + *2  = 0. 

ForAi  = —2  this  is  —X\  + x^  = 0.  Hence  we  can  take  x(1)  = [1  1]T.  For  A2  = —4  this  becomes  a'i  + X2  — 0, 

and  an  eigenvector  is  x(2)  = [1  — 1]T.  This  gives  the  general  solution 


y = 


= <ry(1)  + c2  y(2)  = Cl 


1 

f 

1 

+ 

1 

-1 

Figure  82  shows  a phase  portrait  of  some  of  the  trajectories  (to  which  more  trajectories  could  be  added  if  so 
desired).  The  two  straight  trajectories  correspond  to  ci  = 0 and  = 0 and  the  others  to  other  choices  of 
ci,  c2. 


The  method  of  the  phase  plane  is  particularly  valuable  in  the  frequent  cases  when  solving 
an  ODE  or  a system  is  inconvenient  of  impossible. 


Critical  Points  of  the  System  (6) 

The  point  y = 0 in  Fig.  82  seems  to  be  a common  point  of  all  trajectories,  and  we  want 
to  explore  the  reason  for  this  remarkable  observation.  The  answer  will  follow  by  calculus. 
Indeed,  from  (6)  we  obtain 

^ dy2  _ )’2  dt  _ >4  _ <221  >’l  + «22>’2 

dy  1 y[  dt  y[  flnyi  + a12y2' 

This  associates  with  every  point  P:  ( y i , >’2)  a unique  tangent  direction  dy2/dy\  of  the 
trajectory  passing  through  P,  except  for  the  point  P = Po  '.  (0,  0),  where  the  right  side  of  (9) 
becomes  0/0.  This  point  Pq,  at  which  dy2/dy\  becomes  undetermined,  is  called  a critical 
point  of  (6). 


Five  Types  of  Critical  Points 

There  are  five  types  of  critical  points  depending  on  the  geometric  shape  of  the  trajectories 
near  them.  They  are  called  improper  nodes,  proper  nodes,  saddle  points,  centers,  and 
spiral  points.  We  define  and  illustrate  them  in  Examples  1-5. 


( Continued ) Improper  Node  (Fig.  82) 

An  improper  node  is  a critical  point  Pq  at  which  all  the  trajectories,  except  for  two  of  them,  have  the  same 
limiting  direction  of  the  tangent.  The  two  exceptional  trajectories  also  have  a limiting  direction  of  the  tangent 
at  Pq  which,  however,  is  different. 

The  system  (8)  has  an  improper  node  at  0,  as  its  phase  portrait  Fig.  82  shows.  The  common  limiting  direction 
at  0 is  that  of  the  eigenvector  x(1)  = [1  1]T  because  e~4t  goes  to  zero  faster  than  e~2t  as  t increases.  The  two 

exceptional  limiting  tangent  directions  are  those  of  x(2)  = [1  — 1]T  and  — x<2)  = [—  1 1]T. 
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EXAMPLE  2 


EXAMPLE  3 


Proper  Node  (Fig.  83) 

A proper  node  is  a critical  point  Pq  at  which  every  trajectory  has  a definite  limiting  direction  and  for  any  given 
direction  d at  Pq  there  is  a trajectory  having  d as  its  limiting  direction. 

The  system 


(10) 


r 

y 


i 

o 


thus 


y'i  = yi 
y'2.  = y2 


has  a proper  node  at  the  origin  (see  Fig.  83).  Indeed,  the  matrix  is  the  unit  matrix.  Its  characteristic  equation 
(1  — A)2  = 0 has  the  root  A = 1.  Any  x # 0 is  an  eigenvector,  and  we  can  take  [1  0]T  and  [0  1]T.  Hence 

a general  solution  is 


1 

o' 

y = ci 

0 

e*  + c2 

1 

yi  = 

or  or 

y2  = c2e‘ 


C1.V2  = c2yi- 


Fig.  82.  Trajectories  of  the  system  (8) 
(Improper  node) 


Fig.  83.  Trajectories  of  the  system  (10) 
(Proper  node) 


Saddle  Point  (Fig.  84) 

A saddle  point  is  a critical  point  P0  at  which  there  are  two  incoming  trajectories,  two  outgoing  trajectories,  and 
all  the  other  trajectories  in  a neighborhood  of  /}>  bypass  P0. 

The  system 


(11) 


thus 


y'i  = yi 
y'i  = -yi 


has  a saddle  point  at  the  origin.  Its  characteristic  equation  (1  — A)(— 1 — A)  = 0 has  the  roots  Aj  = 1 and 
A2  = — 1 . For  A = 1 an  eigenvector  [1  0]T  is  obtained  from  the  second  row  of  (A  — AI)x  = 0,  that  is. 

Ox -]  + (—1  — l)x2  = 0.  For  A2  = — 1 the  first  row  gives  [0  1]T.  Hence  a general  solution  is 


i 

o' 

y = ci 

o_ 

e*  + c2 

1 

Vi  = Cje* 

or 

y2  = c2e~ 


or  ^1^2  = const. 


This  is  a family  of  hyperbolas  (and  the  coordinate  axes);  see  Fig.  84. 
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EXAMPLE  4 


EXAMPLE  5 


Center  (Fig.  85) 

A center  is  a critical  point  that  is  enclosed  by  infinitely  many  closed  trajectories. 
The  system 


(12) 


y' 


o 

-4 


(a)  y'i  = y2 

thus 

(b)  y'%  = -4yi 


has  a center  at  the  origin.  The  characteristic  equation  A2  + 4 = 0 gives  the  eigenvalues  2 i and  —2 i.  For  2 i an 
eigenvector  follows  from  the  first  equation  —2ix±  + X2  — 0 of  (A  — AI)x  = 0,  say,  [1  2 i]T.  For  A = —2 i that 

equation  is  —(—2i)xi  + X2  = 0 and  gives,  say,  [1  — 2*]T.  Hence  a complex  general  solution  is 


l 

e2it  + c2 

1 

(12*) 

y = ci 

2 i 

—2/ 

— _2it  i — 2it 

yi  = 

thus 

y2  = 2(c1r2“  - 2 /c2e“2!t. 


A real  solution  is  obtained  from  (12*)  by  the  Euler  formula  or  directly  from  (12)  by  a trick.  (Remember  the 
trick  and  call  it  a method  when  you  apply  it  again.)  Namely,  the  left  side  of  (a)  times  the  right  side  of  (b)  is 
— 4yiyJ.  This  must  equal  the  left  side  of  (b)  times  the  right  side  of  (a).  Thus, 

-4vi>’;  = y2y'z-  By  integration,  2 y\  + \y2  = const. 


This  is  a family  of  ellipses  (see  Fig.  85)  enclosing  the  center  at  the  origin. 


Fig.  84.  Trajectories  of  the  system  (11) 
(Saddle  point) 


Fig.  85.  Trajectories  of  the  system  (12) 
(Center) 


Spiral  Point  (Fig.  86) 

A spiral  point  is  a critical  point  P0  about  which  the  trajectories  spiral,  approaching  /q  as  I — * (or  tracing  these 

spirals  in  the  opposite  sense,  away  from  P0). 

The  system 


(13) 


y' 


thus 


y'i  = ~yi  + y2 
y'z  = ~yi  ~ yz 


has  a spiral  point  at  the  origin,  as  we  shall  see.  The  characteristic  equation  is  A2  + 2A  + 2 = 0.  It  gives  the 
eigenvalues  — 1 + i and  — 1 — i.  Corresponding  eigenvectors  are  obtained  from  (— 1 — A)a'  | + x2  = 0.  For 
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EXAMPLE  6 


A = — 1 + /'  this  becomes  —ix  i + *2  — 0 and  we  can  take  [1  i]T  as  an  eigenvector.  Similarly,  an  eigenvector 
corresponding  to  — 1 — / is  [ 1 — i]T.  This  gives  the  complex  general  solution 


£ 

II 

1 

e(-i+«  + C2 

f 

i 

— i 

The  next  step  would  be  the  transformation  of  this  complex  solution  to  a real  general  solution  by  the  Euler 
formula.  But,  as  in  the  last  example,  we  just  wanted  to  see  what  eigenvalues  to  expect  in  the  case  of  a spiral 
point.  Accordingly,  we  start  again  from  the  beginning  and  instead  of  that  rather  lengthy  systematic  calculation 
we  use  a shortcut.  We  multiply  the  first  equation  in  (13)  by  yi,  the  second  by  y 2,  and  add,  obtaining 

yiy'i  + ^2)4  = ~(yi  + yi)- 

We  now  introduce  polar  coordinates  r,  t,  where  r2  = yf  + yi-  Differentiating  this  with  respect  to  t gives 
2 rr  = 2yiyi  + 2y2y2-  Hence  the  previous  equation  can  be  written 

rr  = ~r2,  Thus,  r = — r,  dr/r  = —dt,  In  | r|  = — t + c*,  r = ce~t. 

For  each  real  c this  is  a spiral,  as  claimed  (see  Fig.  86). 


No  Basis  of  Eigenvectors  Available.  Degenerate  Node  (Fig.  87) 

This  cannot  happen  if  A in  (1)  is  symmetric  (a^j  = as  in  Examples  1-3)  or  skew-symmetric  (a^j  = —Ojki 
thus  cijj  = 0).  And  it  does  not  happen  in  many  other  cases  (see  Examples  4 and  5).  Hence  it  suffices  to  explain 
the  method  to  be  used  by  an  example. 

Find  and  graph  a general  solution  of 


(14) 


1 


2 


y- 


Solution.  A is  not  skew-symmetric!  Its  characteristic  equation  is 


det  (A  — AI)  = 


4 - A 
-1 


1 


2 - A 


= A2  - 6A  + 9 = (A  - 3)2  = 0. 
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It  has  a double  root  A = 3.  Hence  eigenvectors  are  obtained  from  (4  — A)jci  + x2  — 0,  thus  from  *i  + x2  = 0, 
say,  x(1)  = [1  — 1]T  and  nonzero  multiples  of  it  (which  do  not  help).  The  method  now  is  to  substitute 


y 


(2) 


xteAt  + ueAt 


with  constant  u = [u\  u2]J  into  (14).  (The  xf-term  alone,  the  analog  of  what  we  did  in  Sec.  2.2  in  the  case 
of  a double  root,  would  not  be  enough.  Try  it.)  This  gives 

y(2)'  = xeAf  + Xxtext  + AueAt  = Ay(2)  = A xte*  + AueAt. 

On  the  right.  Ax  = Ax.  Hence  the  terms  A xtext  cancel,  and  then  division  by  gives 


x + Au  = Au, 


thus  (A  — AI)u  = x. 


Here  A = 3 and  x = [1  — 1]T,  so  that 


(A  - 3I)u  = 


4-3  1 

1 

u = 

-1  2-3 

-1 

thus 


U 1 + ll2  = 1 

~u1  - u2=  -1. 


A solution,  linearly  independent  of  x = [1  — 1],  is  u = [0  1]T.  This  yields  the  answer  (Fig.  87) 

(\ 


y = c1ytl)  + c2  y(2)  = Cl 


1 

-1 


3 1 , 

e + c2 


1 

-1 


1 + 


The  critical  point  at  the  origin  is  often  called  a degenerate  node.  Cxy*1’  gives  the  heavy  straight  line,  with 
Ci  > 0 the  lower  part  and  < \ < 0 the  upper  part  of  it.  y<2>  gives  the  right  part  of  the  heavy  curve  from  0 through 
the  second,  first,  and — finally — fourth  quadrants.  — y<2)  gives  the  other  part  of  that  curve. 


Fig.  87.  Degenerate  node  in  Example  6 


We  mention  that  for  a system  (1)  with  three  or  more  equations  and  a triple  eigenvalue 
with  only  one  linearly  independent  eigenvector,  one  will  get  two  solutions,  as  just 
discussed,  and  a third  linearly  independent  one  from 

y(3)  = 2Xt2eAt  + u teH  + \eAt 


with  v from  u + Av  = Av. 
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P-R-QBl=E^M=S=E^T— 4=F3 


1-9 


GENERAL  SOLUTION 


14.  yi  = —yi  - >’2 


Find  a real  general  solution  of  the  following  systems.  Show 
the  details. 

i-  y'\  = yi  + y2 

>’2  = 3y,  - y2 
2.  yi  = 6yi  + 9y2 

>'2  = Vi  + 6y2 

3-  yi  = yi  + 2yz 
>'2  = §yi  + y2 

4-  yi  = -8yi  - 2y2 

y2  = 2y1  - 4y2 

5.  y[  = 2y,  + 5y2 

y2  = 5yi  + 12.5y2 

6-  yi  = 2y1  - 2yz 

y2  = 2yi  + 2y2 


V2  = yi  - y2 
yi(0)  = l,  y2(0)  = o 


15.  vj  = 3yi  + 2y2 
y2  = 2yi  + 3v2 


yi(0) 

16-17 


= 0.5,  y2(0)  = -0.5 

CONVERSION 


Find  a general  solution  by  conversion  to  a single  ODE. 


16.  The  system  in  Prob.  8. 


17.  The  system  in  Example  5 of  the  text. 

18.  Mixing  problem.  Fig.  88.  Each  of  the  two  tanks 
contains  200  gal  of  water,  in  which  initially  100  lb 
(Tank  7i)  and  200  lb  (Tank  Tz)  of  fertilizer  are  dissolved. 
The  inflow,  circulation,  and  outflow  are  shown  in 
Fig.  88.  The  mixture  is  kept  uniform  by  stirring.  Find 
the  fertilizer  contents  yi(t)  in  7j  and  y2(f)  in  Tz. 


7-  yi  = y2 
y2  = -yi  + y3 
y3  = -y2 

8.  vi  = 8yi  - y2 

y2  = yi  + I0y2 

9.  yi  = 10y!  - 10y2  - 4y3 
y2  = “ 10yi  + y2  - 14y3 
y3  = — 4yi  - 14y2  - 2y3 


10-15 


IVPs 


F'g-  88.  Tanks  in  Problem  18 

19.  Network.  Show  that  a model  for  the  currents  /j(f)  and 
I2(t)  in  Fig.  89  is 


| hdt  + R(h  - I2)  = 0,  U'2  + R(I2  - /i)  = 0. 


Solve  the  following  initial  value  problems. 

10.  yi  = 2y!  + 2y2 
v2  = 5yi  — y2 

yi(0)  = o,  y2(0)  = 7 

11.  yi  = 2v!  + 5y2 

r _ 1 3 

y2  — 2yi  2y2 

yi(0)  = — 12,  y2(0)  = 0 

12.  yi  = V!  + 3y2 
y2  = |yi  + y2 


Find  a general  solution,  assuming  that  R = 3 D, 
L = 4H,  C = 1/12  F. 

C 



R 

<^WV 

L V 


Fig.  89.  Network  in  Problem  19 


yi/O)  = 12,  y2(0)  = 2 
13.  yi  = v2 
y2  = yi 

yi(0)  = 0,  y2(0)  = 2 


20.  CAS  PROJECT.  Phase  Portraits.  Graph  some  of 
the  figures  in  this  section,  in  particular  Fig.  87  on  the 
degenerate  node,  in  which  the  vector  y(2)  depends  on  t. 
In  each  figure  highlight  a trajectory  that  satisfies  an 
initial  condition  of  your  choice. 
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4.4  Criteria  for  Critical  Points.  Stability 

We  continue  our  discussion  of  homogeneous  linear  systems  with  constant  coefficients  (1). 
Let  us  review  where  we  are.  From  Sec.  4.3  we  have 


(1)  y'  = Ay 


Oil 

012 

021 

fl22 

in  components, 


y'i  = flnyi  + «i2y2 
y2  = a2iyi  + fl22y2- 


From  the  examples  in  the  last  section,  we  have  seen  that  we  can  obtain  an  overview  of 
families  of  solution  curves  if  we  represent  them  parametrically  as  y(f)  = [yi(t)  V2W  |T 
and  graph  them  as  curves  in  the  yiy2-plane,  called  the  phase  plane.  Such  a curve  is  called 
a trajectory  of  (1),  and  their  totality  is  known  as  the  phase  portrait  of(l). 

Now  we  have  seen  that  solutions  are  of  the  form 


y (t)  = xeAt.  Substitution  into  (1)  gives  y ' (t)  = AxeAt  = Ay  = AxeAt. 

Dropping  the  common  factor  eAt,  we  have 


(2) 


Ax  = Ax. 


Hence  y(f)  is  a (nonzero)  solution  of  (1)  if  A is  an  eigenvalue  of  A and  x a corresponding 
eigenvector. 

Our  examples  in  the  last  section  show  that  the  general  form  of  the  phase  portrait  is 
determined  to  a large  extent  by  the  type  of  critical  point  of  the  system  ( 1 ) defined  as  a 
point  at  which  dyz/dyi  becomes  undetermined,  0/0;  here  [see  (9)  in  Sec.  4.3] 


(3) 


dy 2 _ y'2.  dt  _ «21.Vl  + fl2 2 42 

dy 1 y'l  dt  anyi  + oi2y2 


We  also  recall  from  Sec.  4.3  that  there  are  various  types  of  critical  points. 

What  is  now  new,  is  that  we  shall  see  how  these  types  of  critical  points  are  related 
to  the  eigenvalues.  The  latter  are  solutions  A = Ai  and  A2  of  the  characteristic  equation 


(4)  det  (A  - AI) 


an  A 
«21 


a12 

a22  ~ A 


— A2  — (an  + «22)A  + det  A — 0. 


This  is  a quadratic  equation  A2  — p\  + q = 0 with  coefficients  p,  q and  discriminant  A 
given  by 

(5)  p = an  + <722,  q = det  A = flnfl22  ~ 012021,  A = p2  - 4 q. 


From  algebra  we  know  that  the  solutions  of  this  equation  are 


(6) 


Ai  = Up  + VA),  a2  = Up  ^ VA). 
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Furthermore,  the  product  representation  of  the  equation  gives 

A2  — p\  4-  q — (A  — Ai)(A  — A2)  — A2  — (A^  + A2)A  + A1A2. 

Hence  p is  the  sum  and  q the  product  of  the  eigenvalues.  Also  Ai  — A2  = VA  from  (6). 
Together, 

(7)  p = Ai  + A2,  q = AjA2,  A = (Aj  — A2)2. 

This  gives  the  criteria  in  Table  4.1  for  classifying  critical  points.  A derivation  will  be 
indicated  later  in  this  section. 


Table  4.1  Eigenvalue  Criteria  for  Critical  Points 
(Derivation  after  Table  4.2) 


Name 

P — Ai  + A2 

q — A]A2 

A — (Ai  A2)2 

Comments  on  Aj,  A2 

(a)  Node 

q > 0 

AgO 

Real,  same  sign 

(b)  Saddle  point 

q < 0 

Real,  opposite  signs 

(c)  Center 

p = 0 

q > 0 

Pure  imaginary 

(d)  Spiral  point 

p # 0 

A < 0 

Complex,  not  pure 
imaginary 

Stability 

Critical  points  may  also  be  classified  in  terms  of  their  stability.  Stability  concepts  are  basic 
in  engineering  and  other  applications.  They  are  suggested  by  physics,  where  stability 
means,  roughly  speaking,  that  a small  change  (a  small  disturbance)  of  a physical  system 
at  some  instant  changes  the  behavior  of  the  system  only  slightly  at  all  future  times  t.  For 
critical  points,  the  following  concepts  are  appropriate. 


DEFINITIONS 


Stable,  Unstable,  Stable  and  Attractive 

A critical  point  /})  of  (1)  is  called  stable2  if,  roughly,  all  trajectories  of  (1)  that  at 
some  instant  are  close  to  Pq  remain  close  to  Pq  at  all  future  times;  precisely:  if  for 
every  disk  De  of  radius  e > 0 with  center  Pq  there  is  a disk  Ds  of  radius  8 > 0 with 
center  Pq  such  that  every  trajectory  of  (1)  that  has  a point  P\  (corresponding  to  t = t\, 
say)  in  Ds  has  all  its  points  corresponding  to  t g t\  in  De.  See  Fig.  90. 

Pq  is  called  unstable  if  Pq  is  not  stable. 

Pq  is  called  stable  and  attractive  (or  asymptotically  stable)  if  Iq  is  stable  and 
every  trajectory  that  has  a point  in  Ds  approaches  Iq  as  t — » 0°.  See  Fig.  91. 


Classification  criteria  for  critical  points  in  terms  of  stability  are  given  in  Table  4.2.  Both 
tables  are  summarized  in  the  stability  chart  in  Fig.  92.  In  this  chart  region  of  instability 
is  dark  blue. 


2In  the  sense  of  the  Russian  mathematician  ALEXANDER  MICHAILOVICH  LJAPUNOV  (1857-1918), 
whose  work  was  fundamental  in  stability  theory  for  ODEs.  This  is  perhaps  the  most  appropriate  definition  of 
stability  (and  the  only  we  shall  use),  but  there  are  others,  too. 
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Fig.  90.  Stable  critical  point  P0  of  (1) 
(The  trajectory  initiating  at  P,  stays 
in  the  disk  of  radius  e.) 


Fig.  91.  Stable  and  attractive  critical 
point  P0  of  (1) 


Table  4.2  Stability  Criteria  for  Critical  Points 


Type  of  Stability 

P — Ay  + A2 

(M 

■< 

II 

(a)  Stable  and  attractive 

(b)  Stable 

(c)  Unstable 

p < 0 
0 

p > 0 0 

q > 0 
q > 0 
R q < 0 

Fig.  92.  Stability  chart  of  the  system  (1)  with  p,  q,  A defined  in  (5). 

Stable  and  attractive:  The  second  quadrant  without  the  q-axis. 
Stability  also  on  the  positive  q-axis  (which  corresponds  to  centers). 
Unstable:  Dark  blue  region 


We  indicate  how  the  criteria  in  Tables  4.1  and  4.2  are  obtained.  If  q = A1A2  > 0,  both 
of  the  eigenvalues  are  positive  or  both  are  negative  or  complex  conjugates.  If  also 
p = Ai  + A2  < 0,  both  are  negative  or  have  a negative  real  part.  Hence  Pq  is  stable  and 
attractive.  The  reasoning  for  the  other  two  lines  in  Table  4.2  is  similar. 

If  A < 0,  the  eigenvalues  are  complex  conjugates,  say,  A^  = a + if5  and  A2  = a — iji. 
If  also  p = Ai  + A2  = 2a  < 0,  this  gives  a spiral  point  that  is  stable  and  attractive.  If 
p = 2a  > 0,  this  gives  an  unstable  spiral  point. 

If  p = 0,  then  A2  = — Ai  and  q = A1A2  = — A?.  If  also  q > 0,  then  A?  = —q  < 0,  so 
that  Ai,  and  thus  A2,  must  be  pure  imaginary.  This  gives  periodic  solutions,  their  trajectories 
being  closed  curves  around  Pq,  which  is  a center. 


Application  of  the  Criteria  in  Tables  4.1  and  4.2 


In  Example  1,  Sec  4.3,  we  have  y = 
stable  and  attractive  by  Table  4.2(a). 


-3 

1 


1 


y.p  = 


-3 


-6  ,q 


8,  A 


4,  a node  by  Table  4.1(a),  which  is 
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EXAMPLE  2 Free  Motions  of  a Mass  on  a Spring 

What  kind  of  critical  point  does  my  + cy  + ky  = 0 in  Sec.  2.4  have? 

Solution.  Division  by  m gives  y"  — —(k/m)y  — (c/m)y  . To  get  a system,  setyi  = y,  y^  = y (see  Sec.  4.1). 
Then  y^  = y — ~(k/m)y\  — (c/m)y2-  Hence 


y' 


o 1 

—k/ m ~c/m 


det  (A  — AI)  = 


-A 

—k/m 


m m 


We  see  that/?  = —c/m,q  = k/m , A = (c/m)2  — Ak/m.  From  this  and  Tables  4.1  and  4.2  we  obtain  the  following 
results.  Note  that  in  the  last  three  cases  the  discriminant  A plays  an  essential  role. 


No  damping,  c = 0,  p = 0,  q > 0,  a center. 

Underdamping.  cz  < Amk,  p < 0,  q > 0,  A < 0,  a stable  and  attractive  spiral  point. 

Critical  damping,  c = Amk,  p < 0,  q > 0,  A = 0,  a stable  and  attractive  node. 

Overdamping,  c2  > Amk,  p < 0,  q > 0,  A > 0,  a stable  and  attractive  node.  ■ 


PROBLEM~SFT-^-~4 


1-10 


TYPE  AND  STABILITY  OF 
CRITICAL  POINT 


Determine  the  type  and  stability  of  the  critical  point.  Then 
find  a real  general  solution  and  sketch  or  graph  some  of  the 
trajectories  in  the  phase  plane.  Show  the  details  of  your  work. 


1-  y'l  = Vi 
y'2  = 2y2 

3.  y[  = y2 
y'2  = — 9yi 

5.  y[  = — 2yi  + 2y2 
V2  = ~2vi  - 2y2 

7-  y'l  = Vi  + 2y2 
y'2  = 2yi  + y2 

9-  y'i  = 4yi  + y2 
y'2  = 4yi  + 4y2 


2.  y'i  = -4yx 

y'2  = -3y2 

4-  y'i  = 2yi  + y2 
y'2  = 5yi  - 2y2 

6.  y'i  = — 6yi  - y2 
y2  = -9yi  - 6y2 

8-  y'l  = -yi  + 4 y2 
y'2  ~ 3yi  — 2y2 

10.  y'i  = y2 

y'2  = -5vi  - 2y2 


11-18 


TRAJECTORIES  OF  SYSTEMS  AND 
SECOND-ORDER  ODEs.  CRITICAL 
POINTS 


11.  Damped  oscillations.  Solvey”  + 2y  + 2y  = 0.  What 
kind  of  curves  are  the  trajectories? 


12.  Harmonic  oscillations.  Solve  y"  + l,y  = 0.  Find  the 
trajectories.  Sketch  or  graph  some  of  them. 


13.  Types  of  critical  points.  Discuss  the  critical  points  in 
( 10)— (13)  of  Sec.  4.3  by  using  Tables  4.1  and  4.2. 


14.  Transformation  of  parameter.  What  happens  to  the 
critical  point  in  Example  1 if  you  introduce  t = — t as 
a new  independent  variable? 


15.  Perturbation  of  center.  What  happens  in  Example  4 
of  Sec.  4.3  if  you  change  A to  A + 0.  II,  where  I is  the 
unit  matrix? 

16.  Perturbation  of  center.  If  a system  has  a center  as 
its  critical  point,  what  happens  if  you  replace  the 
matrix  A by  A = A + kl  with  any  real  number  k # 0 
(representing  measurement  errors  in  the  diagonal 
entries)? 

17.  Perturbation.  The  system  in  Example  4 in  Sec.  4.3 
has  a center  as  its  critical  point.  Replace  each  a ^ in 
Example  4,  Sec.  4.3,  by  cijh  + b.  Find  values  of  b such 
that  you  get  (a)  a saddle  point,  (b)  a stable  and  attractive 
node,  (c)  a stable  and  attractive  spiral,  (d)  an  unstable 
spiral,  (e)  an  unstable  node. 

18.  CAS  EXPERIMENT.  Phase  Portraits.  Graph  phase 
portraits  for  the  systems  in  Prob.  17  with  the  values 
of  b suggested  in  the  answer.  Try  to  illustrate  how 
the  phase  portrait  changes  “continuously”  under  a 
continuous  change  of  b. 

19.  WRITING  PROBLEM.  Stability.  Stability  concepts 
are  basic  in  physics  and  engineering.  Write  a two-part 
report  of  3 pages  each  (A)  on  general  applications 
in  which  stability  plays  a role  (be  as  precise  as  you 
can),  and  (B)  on  material  related  to  stability  in  this 
section.  Use  your  own  formulations  and  examples;  do 
not  copy. 

20.  Stability  chart.  Locate  the  critical  points  of  the 
systems  (10) — (14)  in  Sec.  4.3  and  of  Probs.  1,  3,  5 in 
this  problem  set  on  the  stability  chart. 
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4.5  Qualitative  Methods  for  Nonlinear  Systems 

Qualitative  methods  are  methods  of  obtaining  qualitative  information  on  solutions 
without  actually  solving  a system.  These  methods  are  particularly  valuable  for  systems 
whose  solution  by  analytic  methods  is  difficult  or  impossible.  This  is  the  case  for  many 
practically  important  nonlinear  systems 


, >T  =/l(>'l,>’2) 

(1)  y = f(y),  thus  ( 

= /2(>t,  y2). 

In  this  section  we  extend  phase  plane  methods,  as  just  discussed,  from  linear  systems 
to  nonlinear  systems  (1).  We  assume  that  (1)  is  autonomous,  that  is,  the  independent 
variable  t does  not  occur  explicitly.  (All  examples  in  the  last  section  are  autonomous.) 
We  shall  again  exhibit  entire  families  of  solutions.  This  is  an  advantage  over  numeric 
methods,  which  give  only  one  (approximate)  solution  at  a time. 

Concepts  needed  from  the  last  section  are  the  phase  plane  (the  yiy2-plane),  trajectories 
(solution  curves  of  (1)  in  the  phase  plane),  the  phase  portrait  of  (1)  (the  totality  of  these 
trajectories),  and  critical  points  of  (1)  (points  (vi,  y2) at  which  both/j(  vi,  y2)  and/2(y1;  y2) 
are  zero). 

Now  (1)  may  have  several  critical  points.  Our  approach  shall  be  to  discuss  one  critical 
point  after  another.  If  a critical  point  P0  is  not  at  the  origin,  then,  for  technical 
convenience,  we  shall  move  this  point  to  the  origin  before  analyzing  the  point.  More 
formally,  if  Pq.  ( a , b ) is  a critical  point  with  (a,  b ) not  at  the  origin  (0,  0),  then  we  apply 
the  translation 


>T  =yi  ~a,  y2  = y2  ~ b 

which  moves  P0  to  (0,  0)  as  desired.  Thus  we  can  assume  /),  to  be  the  origin  (0,  0),  and 
for  simplicity  we  continue  to  write  yl5  y2  (instead  of  yi,  y2).  We  also  assume  that  P0  is 
isolated,  that  is,  it  is  the  only  critical  point  of  (1)  within  a (sufficiently  small)  disk  with 
center  at  the  origin.  If  (1)  has  only  finitely  many  critical  points,  that  is  automatically 
true.  (Explain!) 


Linearization  of  Nonlinear  Systems 

How  can  we  determine  the  kind  and  stability  property  of  a critical  point  P0 : (0,  0)  of 
(1)?  In  most  cases  this  can  be  done  by  linearization  of  (1)  near  P0,  writing  (1)  as 
y = f(y)  = Ay  + h(y)  and  dropping  h(y),  as  follows. 

Since  P0  is  critical, /j(0,  0)  = 0,/2(0,  0)  = 0,  so  that/j  and /2  have  no  constant  terms 
and  we  can  write 


(2) 


y'  = Ay  + h(y), 


thus 


y'i  = «n>T  + a12y2  + ^i(yi>  y2) 
y'2  = «2iyi  + a22y2  + h2(yi,  y2). 


A is  constant  (independent  of  t)  since  (1)  is  autonomous.  One  can  prove  the  following 
(proof  in  Ref.  [A7],  pp.  375-388,  listed  in  App.  1). 
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THEOREM  1 


EXAMPLE  1 


Linearization 

If  fa  and  fa  in  (1)  are  continuous  and  have  continuous  partial  derivatives  in  a 
neighborhood  of  the  critical  point  Pq:  (0,  0),  and  if  det  A 0 in  (2),  then  the  kind 
and  stability  of  the  critical  point  of  i I ) are  the  same  as  those  of  the  linearized 

system 

, y'l  = aim  + ai2^2 

(3)  y = Ay,  thus  ; 

y 2 = «2iyi  + a22>’2- 

Exceptions  occur  if  A has  equal  or  pure  imaginary  eigenvalues;  then  (1)  may  have 
the  same  kind  of  critical  point  as  (3)  or  a spiral  point. 


Free  Undamped  Pendulum.  Linearization 

Figure  93a  shows  a pendulum  consisting  of  a body  of  mass  m (the  bob)  and  a rod  of  length  L.  Determine  the 
locations  and  types  of  the  critical  points.  Assume  that  the  mass  of  the  rod  and  air  resistance  are  negligible. 

Solution.  Step  1.  Setting  up  the  mathematical  model.  Let  6 denote  the  angular  displacement,  measured 
counterclockwise  from  the  equilibrium  position.  The  weight  of  the  bob  is  mg  ( g the  acceleration  of  gravity).  It 
causes  a restoring  force  mg  sin  6 tangent  to  the  curve  of  motion  (circular  arc)  of  the  bob.  By  Newton’s  second 
law,  at  each  instant  this  force  is  balanced  by  the  force  of  acceleration  mLd” , where  L6"  is  the  acceleration; 
hence  the  resultant  of  these  two  forces  is  zero,  and  we  obtain  as  the  mathematical  model 

mLd " + mg  sin  6 = 0. 

Dividing  this  by  mL,  we  have 


(4) 


6 " + k sin  6 = 0 


When  6 is  very  small,  we  can  approximate  sin  6 rather  accurately  by  6 and  obtain  as  an  approximate  solution 
A cos  y/ict  + B sin  y/kt,  but  the  exact  solution  for  any  6 is  not  an  elementary  function. 

Step  2.  Critical  points  (0,  0),  (±2tt,  0),  (±4tt,  0),  • • , Linearization.  To  obtain  a system  of  ODEs,  we  set 
6 = y i,6 r = y 2-  Then  from  (4)  we  obtain  a nonlinear  system  (1)  of  the  form 


(4*) 


y'i  = Myi,  yz)  = yz 


y'z  = fz(yi,yz)  = -*siny!. 


The  right  sides  are  both  zero  when  y2  = 0 and  sin  yi  = 0.  This  gives  infinitely  many  critical  points  («77,  0), 
where  n = 0,  ±1,  ±2,  • ■ • . We  consider  (0,  0).  Since  the  Maclaurin  series  is 


sinyj  = yi  - jy?  + - • 


: yi. 


the  linearized  system  at  (0,  0)  is 

y'  = Ay  = 


0 1 

-k  0 


thus 


y i = y2 

= -tyi- 


To  apply  our  criteria  in  Sec.  4.4  we  calculate  p = an  + £*22  — 0,  q = det  A = k = g/L{> 0),  and 
A = p —4 q = —4k.  From  this  and  Table  4.1(c)  in  Sec.  4.4  we  conclude  that  (0,  0)  is  a center,  which  is  always 
stable.  Since  sin  6 = sin  yi  is  periodic  with  period  277,  the  critical  points  (mr,  0),  n = ±2,  ±4,  • • • , are  all  centers. 

Step  3.  Critical  points  (±tt,  0),  (±3tt,  0),  (±5tt,  0),  • • • , Linearization.  We  now  consider  the  critical  point 
(77,  0),  setting  6 — 77  = yi  and  (0  — 77/  = 6'  = y2 • Then  in  (4), 


sin  6 = sin  (yx  + 77)  = — sinyx  = — yx  + gy? 1 ~ — yx 
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and  the  linearized  system  at  (17,  0)  is  now 


y'  = Ay  = 


0 


k 


thus 


y'x  = y2 
y'2  = Ay  i . 


We  see  that  p = 0,  q = —k  (<0),  and  A = —4q  = 4k.  Hence,  by  Table  4.1(b),  this  gives  a saddle  point,  which 
is  always  unstable.  Because  of  periodicity,  the  critical  points  («ir,  0),  n = ±1,  ±3,  • ••,  are  all  saddle  points. 
These  results  agree  with  the  impression  we  get  from  Fig.  93b. 


mg  sin  0 


(a)  Pendulum 


Fig.  93. 


(b)  Solution  curves y2(y1)  of  (4)  in  the  phase  plane 

Example  1 (C  will  be  explained  in  Example  4.) 


Linearization  of  the  Damped  Pendulum  Equation 

To  gain  further  experience  in  investigating  critical  points,  as  another  practically  important  case,  let  us  see  how 
Example  1 changes  when  we  add  a damping  term  c9  (damping  proportional  to  the  angular*  velocity)  to  equation 

(4) ,  so  that  it  becomes 

(5)  d"  + cQ’  + k sin  d = 0 

where  k > 0 and  c ^ 0 (which  includes  our  previous  case  of  no  damping,  c = 0).  Setting  0 = yi,  0 = y2 , as 
before,  we  obtain  the  nonlinear  system  (use  d"  = y^) 

y'l  = y2 

yz=  ~k  sin  yx  - cy2. 


We  see  that  the  critical  points  have  the  same  locations  as  before,  namely,  (0,  0),  (±77,  0),  (±277,  0),  • • • . We 
consider  (0,  0).  Linearizing  sinyi  ~ yi  as  in  Example  1,  we  get  the  linearized  system  at  (0,  0) 


(6) 


y'i  = y2 

thus 

y2  = -kyi  ~ cy2. 


This  is  identical  with  the  system  in  Example  2 of  Sec.  4.4,  except  for  the  (positive!)  factor  m (and  except  for 
the  physical  meaning  of  yi).  Hence  for  c = 0 (no  damping)  we  have  a center  (see  Fig.  93b),  for  small  damping 
we  have  a spiral  point  (see  Fig.  94),  and  so  on. 

We  now  consider  the  critical  point  (77,  0).  We  set  6 — 77  = yi,  ( 6 — rr)'  — d'  = y2  and  linearize 


sin  6 — sin  (yi  + 77)  = —sinyi  ~ — yi- 


This  gives  the  new  linearized  system  at  (77,  0) 


(6*) 


y'i  = y2 

thus 

y2  = ky  i - cy2. 
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For  our  criteria  in  Sec.  4.4  we  calculate  p = an  + a 22  ~ ~c,  q = det  A = —k,  and  A = p2  — 4q  = c2  + 4 k. 
This  gives  the  following  results  for  the  critical  point  at  (77,  0). 

No  damping,  c = 0,  p = 0,  q < 0,  A > 0,  a saddle  point.  See  Fig.  93b. 

Damping,  c > 0,  p < 0,  q < 0,  A > 0,  a saddle  point.  See  Fig.  94. 

Since  sinyi  is  periodic  with  period  277,  the  critical  points  (±277,  0),  (±477,  0),  • • • are  of  the  same  type  as 
(0,  0),  and  the  critical  points  (—77,  0),  (±377,  0),  • • • are  of  the  same  type  as  (77,  0),  so  that  our  task  is  finished. 

Figure  94  shows  the  trajectories  in  the  case  of  damping.  What  we  see  agrees  with  our  physical  intuition. 
Indeed,  damping  means  loss  of  energy.  Hence  instead  of  the  closed  trajectories  of  periodic  solutions  in 
Fig.  93b  we  now  have  trajectories  spiraling  around  one  of  the  critical  points  (0,  0),  (±27 7,  0),  • • • . Even  the 
wavy  trajectories  corresponding  to  whirly  motions  eventually  spiral  around  one  of  these  points.  Furthermore, 
there  are  no  more  trajectories  that  connect  critical  points  (as  there  were  in  the  undamped  case  for  the  saddle 
points).  ■ 


Fig.  94.  Trajectories  in  the  phase  plane  for  the  damped  pendulum  in  Example  2 


Lotka-Volterra  Population  Model 

Predator-Prey  Population  Model3 

This  model  concerns  two  species,  say,  rabbits  and  foxes,  and  the  foxes  prey  on  the  rabbits. 

Step  1.  Setting  lip  the  model.  We  assume  the  following. 

1.  Rabbits  have  unlimited  food  supply.  Hence,  if  there  were  no  foxes,  their  number  yi(t)  would  grow 
exponentially,  = ay\. 

2.  Actually,  yi  is  decreased  because  of  the  kill  by  foxes,  say,  at  a rate  proportional  to  yiy2>  where  y^{t)  is 
the  number  of  foxes.  Hence  yi  = ayi  — by^y2,  where  a > 0 and  b > 0. 

3.  If  there  were  no  rabbits,  then  y^ify  would  exponentially  decrease  to  zero,  y2  — ~lyz-  However,  y2  is 
increased  by  a rate  proportional  to  the  number  of  encounters  between  predator  and  prey;  together  we 
have  y2  — — Zy2  + kyiyz,  where  k > 0 and  / > 0. 

This  gives  the  (nonlinear!)  Lotka-Volterra  system 


(7) 


y'i  =Myi,y£  = ay  1 - 6>’i  y2 


y'2.  =fz(yi,yz)  = kyiyz  ~ ly z- 


introduced  by  ALFRED  J.  LOTKA  (1880-1949),  American  biophysicist,  and  VITO  VOLTERRA 
(1860-1940),  Italian  mathematician,  the  initiator  of  functional  analysis  (see  [GR7]  in  App.  1). 
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Step  2.  Critical  point  (0,  0),  Linearization.  We  see  from  (7)  that  the  critical  points  are  the  solutions  of 

(7*)  A(yi>  yz)  = yi(a  ~ by2 ) = o,  /2(yi,  y2)  = y-z(kyi  -0  = 0. 


The  solutions  are  (y1;  y2)  = (0,  0)  and 
the  linearized  system 


l a 
k’  b 


We  consider  (0,  0).  Dropping  by \ y2  and  ky\y2  from  (7)  gives 


y = 


a 0 
0 -I 


Its  eigenvalues  are  Ay  = a > 0 and  A2  = — Z < 0.  They  have  opposite  signs,  so  that  we  get  a saddle  point. 

Step.  3.  Critical  point  (l/k,  a/b).  Linearization.  We  set  \'i  ■ > | • l/k , y2  = y2  + a/h-  Then  the  critical  point 
(// k,  a/ b)  corresponds  to  fyy , y2)  = (0,  0).  Since  y[  = y i,y2  = y2,  we  obtain  from  (7)  [factorized  as  in  (7*)] 


.Vi  = yi 


a — b I V2  + 


= yi 


{-by 2) 


ya  = l ya 


k ( yi  + ~ _ 1 


= ( y2  + -J  *yi. 


Dropping  the  two  nonlinear  terms  -^>1^2  an(l  kyiy2->  we  have  the  linearized  system 


lb„ 

(a)  yi  = - — y2 
k 


ak„ 

(b)  V2  = — Vl- 
b 


The  left  side  of  (a)  times  the  right  side  of  (b)  must  equal  the  right  side  of  (a)  times  the  left  side  of  (b). 


ak , lb , . . ak  „ 2 lb 

— yi)’i  = ^272-  By  integration,  — y f H ^2  ~ const. 

b k ' b k 


This  is  a family  of  ellipses,  so  that  the  critical  point  (l/k,  a/b)  of  the  linearized  system  (7**)  is  a center  (Fig.  95). 
It  can  be  shown,  by  a complicated  analysis,  that  the  nonlinear  system  (7)  also  has  a center  (rather  than  a spiral 
point)  at  (l/k,  a/b)  surrounded  by  closed  trajectories  (not  ellipses). 

We  see  that  the  predators  and  prey  have  a cyclic  variation  about  the  critical  point.  Let  us  move  counterclockwise 
around  the  ellipse,  beginning  at  the  right  vertex,  where  the  rabbits  have  a maximum  number.  Foxes  are  sharply 
increasing  in  number  until  they  reach  a maximum  at  the  upper  vertex,  and  the  number  of  rabbits  is  then  sharply 
decreasing  until  it  reaches  a minimum  at  the  left  vertex,  and  so  on.  Cyclic  variations  of  this  kind  have 
been  observed  in  nature,  for  example,  for  lynx  and  snowshoe  hare  near  the  Hudson  Bay,  with  a cycle  of  about 
10  years. 

For  models  of  more  complicated  situations  and  a systematic  discussion,  see  C.  W.  Clark,  Mathematical 
Bioeconomics:  The  Mathematics  of  Conservation,  3rd  ed.  Hoboken,  NJ,  Wiley,  2010. 


k 

Fig.  95.  Ecological  equilibrium  point  and  trajectory 
of  the  linearized  Lotka-Volterra  system  (7**) 
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Transformation  to  a First-Order  Equation 
in  the  Phase  Plane 

Another  phase  plane  method  is  based  on  the  idea  of  transforming  a second-order 
autonomous  ODE  (an  ODE  in  which  t does  not  occur  explicitly) 

F(y,y',y")  = 0 

to  first  order  by  taking  y = jq  as  the  independent  variable,  setting  y'  = y2  and  transforming 
y by  the  chain  rule, 


dy2  _ dyz  c/v, 
dt  dy i dt 


dyz 

J2- 
dy  i 


Then  the  ODE  becomes  of  first  order. 


(8) 


dyz 
yiA’2,  — >’2 

dy  i 


= 0 


and  can  sometimes  be  solved  or  treated  by  direction  fields.  We  illustrate  this  for  the 
equation  in  Example  1 and  shall  gain  much  more  insight  into  the  behavior  of  solutions. 


An  ODE  (8)  for  the  Free  Undamped  Pendulum 

If  in  (4)  d"  + k sin  6 = 0 we  set  6 = yi,  6'  = y2  (the  angular  velocity)  and  use 

dyz  dyz  dy1  dy 2 dy2 

6 = — = — = — yz,  we  get  — yz  = ~k  sinyi. 

at  ayi  at  cly\  ay  i 

Separation  of  variables  gives  j2  dy*2  = ~k  sin  yi  dy\.  By  integration, 

(9)  ^yf  ~ ^cosy!  + C (C  constant). 

Multiplying  this  by  mL2,  we  get 


2 m(Ly2)2  — mL2k  cos  y i = mL2C. 

We  see  that  these  three  terms  are  energies.  Indeed,  y2  is  the  angular  velocity,  so  that  Ly2  is  the  velocity  and  the 
first  term  is  the  kinetic  energy.  The  second  term  (including  the  minus  sign)  is  the  potential  energy  of  the  pendulum, 
and  mL2C  is  its  total  energy,  which  is  constant,  as  expected  from  the  law  of  conservation  of  energy,  because 
there  is  no  damping  (no  loss  of  energy).  The  type  of  motion  depends  on  the  total  energy,  hence  on  C,  as  follows. 

Figure  93b  shows  trajectories  for  various  values  of  C.  These  graphs  continue  periodically  with  period  277  to 
the  left  and  to  the  right.  We  see  that  some  of  them  are  ellipse-like  and  closed,  others  are  wavy,  and  there  are  two 
trajectories  (passing  through  the  saddle  points  («77,  0), « = ±1,  ±3, •••)  that  separate  those  two  types  of 
trajectories.  From  (9)  we  see  that  the  smallest  possible  C is  C = —k;  then  y2  = 0,  and  cosyi  = 1,  so  that  the 
pendulum  is  at  rest.  The  pendulum  will  change  its  direction  of  motion  if  there  are  points  at  which  y2  = S'  =0. 
Then  k cos  yi  + C = 0 by  (9).  If  yi  = 77,  then  cos  yi  = — 1 and  C = k.  Hence  if  — k < C < k,  then  the 
pendulum  reverses  its  direction  for  a |yi|  = \d\  <77,  and  for  these  values  of  C with  \C\  < k the  pendulum 
oscillates.  This  corresponds  to  the  closed  trajectories  in  the  figure.  However,  if  C>k,  then  V2  = 0 is  impossible 
and  the  pendulum  makes  a whirly  motion  that  appears  as  a wavy  trajectory  in  the  yiy2-plane.  Finally,  the  value 
C — k corresponds  to  the  two  “separating  trajectories”  in  Fig.  93b  connecting  the  saddle  points. 


The  phase  plane  method  of  deriving  a single  first-order  equation  (8)  may  be  of  practical 
interest  not  only  when  (8)  can  be  solved  (as  in  Example  4)  but  also  when  a solution 
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is  not  possible  and  we  have  to  utilize  fields  (Sec.  1.2).  We  illustrate  this  with  a very 
famous  example: 

Self-Sustained  Oscillations.  Van  der  Pol  Equation 

There  are  physical  systems  such  that  for  small  oscillations,  energy  is  fed  into  the  system,  whereas  for  large 
oscillations,  energy  is  taken  from  the  system.  In  other  words,  large  oscillations  will  be  damped,  whereas  for 
small  oscillations  there  is  “negative  damping”  (feeding  of  energy  into  the  system).  For  physical  reasons  we 
expect  such  a system  to  approach  a periodic  behavior,  which  will  thus  appear  as  a closed  trajectory  in  the  phase 
plane,  called  a limit  cycle.  A differential  equation  describing  such  vibrations  is  the  famous  van  der  Pol  equation4 

(10)  y"  — n(l  — y2)y'  + y = 0 (/x  > 0,  constant). 


It  first  occurred  in  the  study  of  electrical  circuits  containing  vacuum  tubes.  For  /x  = 0 this  equation  becomes 
y"  + y = 0 and  we  obtain  harmonic  oscillations.  Let  /x  > 0.  The  damping  term  has  the  factor  — /x(  1 — y ). 
This  is  negative  for  small  oscillations,  when  y < 1,  so  that  we  have  “negative  damping,”  is  zero  for  y = 1 
(no  damping),  and  is  positive  if  y2  > 1 (positive  damping,  loss  of  energy).  If  /x  is  small,  we  expect  a limit  cycle 
that  is  almost  a circle  because  then  our  equation  differs  but  little  from  y"  + y = 0.  If  /x  is  large,  the  limit  cycle 
will  probably  look  different. 

Setting  y = yi,y  = yz  and  using  y = (dy^/  dy{)y2  as  in  (8),  we  have  from  (10) 


(11) 


- ^ o 

— >2  - Ml  - >’l).V2  + >1  = 0. 
ay  i 


The  isoclines  in  the  Vi  Vy-phine  (the  phase  plane)  are  the  curves  dy2/dy\  = K = const,  that  is, 


dy2 

dyi 


= mi  - Vi)  ~ 


yi 

>2 


= K. 


Solving  algebraically  for  y2,  we  see  that  the  isoclines  are  given  by 


>2  = 


>1 

Ml  - yi)  ~ k 


(Figs.  96,  97). 


Fig.  96.  Direction  field  for  the  van  der  Pol  equation  with  /x  = 0.1  in  the  phase  plane, 
showing  also  the  limit  cycle  and  two  trajectories.  See  also  Fig.  8 in  Sec.  1.2 


4BALTHASAR  VAN  DER  POL  (1889-1959),  Dutch  physicist  and  engineer. 
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Figure  96  shows  some  isoclines  when  /jl  is  small,  /jl  = 0.1,  the  limit  cycle  (almost  a circle),  and  two  (blue)  trajectories 
approaching  it,  one  from  the  outside  and  the  other  from  the  inside,  of  which  only  the  initial  portion,  a small  spiral,  is 
shown.  Due  to  this  approach  by  trajectories,  a limit  cycle  differs  conceptually  from  a closed  curve  (a  trajectory) 
surrounding  a center,  which  is  not  approached  by  trajectories.  For  larger  /jl  the  limit  cycle  no  longer  resembles  a 
circle,  and  the  trajectories  approach  it  more  rapidly  than  for  smaller  /jl.  Figure  97  illustrates  this  for  /jl  — \.  ■ 


Fig.  97.  Direction  field  for  the  van  der  Pol  equation  with  /x  = 1 in  the  phase  plane, 
showing  also  the  limit  cycle  and  two  trajectories  approaching  it 


gRQBL~E^M==y£T=4=5 


1.  Pendulum.  To  what  state  (position,  speed,  direction 
of  motion)  do  the  four  points  of  intersection  of  a 
closed  trajectory  with  the  axes  in  Fig.  93b 
correspond?  The  point  of  intersection  of  a wavy  curve 
with  the  y2-axis? 

2.  Limit  cycle.  What  is  the  essential  difference  between 
a limit  cycle  and  a closed  trajectory  surrounding  a 
center? 

3.  CAS  EXPERIMENT.  Deformation  of  Limit  Cycle. 

Convert  the  van  der  Pol  equation  to  a system.  Graph 
the  limit  cycle  and  some  approaching  trajectories  for 
/jl  = 0.2,  0.4,  0.6,  0.8,  1.0,  1.5,  2.0.  Try  to  observe  how 
the  limit  cycle  changes  its  form  continuously  if  you 
vary  /jl  continuously.  Describe  in  words  how  the  limit 
cycle  is  deformed  with  growing  /jl. 


4-8  CRITICAL  POINTS.  LINEARIZATION 

Find  the  location  and  type  of  all  critical  points  by 
linearization.  Show  the  details  of  your  work. 

4-  yj  = 4vi  - y\  5. 

y'z  = y2 

6.  y\  = y2  7.  y[  = -y1  + y2  ~ y\ 

y'z  = -yi  ~ yi  yz  = -yi  - yz 

8t  2 

• >’i  — y2  ~ y2 

yz  = yi 


yi  = y2 

I _ i 1 2 

y2  - — yi  + 2yi 


y? 


9-13 


CRITICAL  POINTS  OF  ODEs 


Find  the  location  and  type  of  all  critical  points  by  first 
converting  the  ODE  to  a system  and  then  linearizing  it. 

9.  y"  - 9y  + y3  = 0 10.  y"  + y - y3  = 0 

11.  y"  + cos  y = 0 12.  y"  + 9y  + y2  = 0 
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13.  y"  + siny  = 0 

14.  TEAM  PROJECT.  Self-sustained  oscillations. 

(a)  Van  der  Pol  equation.  Determine  the  type  of  the 
critical  point  at  (0,  0)  when  /x  > 0,  /x  = 0,  yu,  < 0. 

(b)  Rayleigh  equation.  Show  that  the  Rayleigh 
equation5 

Y"  - /x(  1 - Jy'V'  + Y = 0 (/a  > 0) 
also  describes  self-sustained  oscillations  and  that  by 
differentiating  it  and  setting  y = Y'  one  obtains  the  van 
der  Pol  equation. 

(c)  Duffing  equation.  The  Duffing  equation  is 

y"  + wo y + j8y3  = 0 

where  usually  | y8 1 is  small,  thus  characterizing  a small 
deviation  of  the  restoring  force  from  linearity.  j3  > 0 
and  /3  < 0 are  called  the  cases  of  a hard  spring  and  a 
soft  spring,  respectively.  Find  the  equation  of  the 
trajectories  in  the  phase  plane.  (Note  that  for  f5  > 0 all 
these  curves  are  closed.) 


15.  Trajectories.  Write  the  ODE  y"  — 4y  + y3  = 0 as  a 
system,  solve  it  for  y2  as  a function  of  yq,  and  sketch 
or  graph  some  of  the  trajectories  in  the  phase  plane. 


Fig.  98.  Trajectories  in  Problem  15 


4.6  Nonhomogeneous  Linear  Systems  of  ODEs 

In  this  section,  the  last  one  of  Chap.  4,  we  discuss  methods  for  solving  nonhomogeneous 
linear  systems  of  ODEs 

(1)  y'  = Ay  + g (see  Sec.  4.2) 

where  the  vector  g (t)  is  not  identically  zero.  We  assume  g(r)  and  the  entries  of  the 
n X n matrix  A (t)  to  be  continuous  on  some  interval  J of  the  f-axis.  From  a general 
solution  y {h\t)  of  the  homogeneous  system  y*  = Ay  on  J and  a particular  solution 
y(p)(f)  of  (1)  on  J [i.e.,  a solution  of  (1)  containing  no  arbitrary  constants],  we  get  a 
solution  of  (1), 

(2)  y = yW  + y(P>. 

y is  called  a general  solution  of  (1)  on  J because  it  includes  every  solution  of  (1)  on  J. 
This  follows  from  Theorem  2 in  Sec.  4.2  (see  Prob.  1 of  this  section). 

Having  studied  homogeneous  linear  systems  in  Secs.  4. 1-4.4,  our  present  task  will  be 
to  explain  methods  for  obtaining  particular  solutions  of  (1).  We  discuss  the  method  of 


5L0RD  RAYLEIGH  (JOHN  WILLIAM  STRUTT)  (1842-1919),  English  physicist  and  mathematician, 
professor  at  Cambridge  and  London,  known  by  his  important  contributions  to  the  theory  of  waves,  elasticity 
theory,  hydrodynamics,  and  various  other  branches  of  applied  mathematics  and  theoretical  physics.  In  1904  he 
was  awarded  the  Nobel  Prize  in  physics. 
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undetermined  coefficients  and  the  method  of  the  variation  of  parameters;  these  have 
counterparts  for  a single  ODE,  as  we  know  from  Secs.  2.7  and  2.10. 


Method  of  Undetermined  Coefficients 

Just  as  for  a single  ODE,  this  method  is  suitable  if  the  entries  of  A are  constants  and 
the  components  of  g are  constants,  positive  integer  powers  of  t,  exponential  functions, 
or  cosines  and  sines.  In  such  a case  a particular  solution  y(p)  is  assumed  in  a form  similar 
to  g;  for  instance,  y(p)  = u + vf  + xvtz  if  g has  components  quadratic  in  t,  with  u,  v, 
w to  be  determined  by  substitution  into  (1).  This  is  similar  to  Sec.  2.7,  except  for  the 
Modification  Rule.  It  suffices  to  show  this  by  an  example. 


Method  of  Undetermined  Coefficients.  Modification  Rule 

Find  a general  solution  of 


-3  l' 

—6 

(3) 

y'  = Ay  + g = 

1 “3. 

y + 

2 

Solution.  A general  equation  of  the  homogeneous  system  is  (see  Example  1 in  Sec.  4.3) 

’ll  r f 


(4) 


y(h)  = ci 


c2 


e 


—4 1 


ij  L-1 

Since  A = —2  is  an  eigenvalue  of  A,  the  function  e~2t  on  the  right  side  also  appears  in  y(h\  and  we  must  apply 
the  Modification  Rule  by  setting 


y(P)  = u te~2t  ■ 


(rather  than  ue  2t ). 


Note  that  the  first  of  these  two  terms  is  the  analog  of  the  modification  in  Sec.  2.7,  but  it  would  not  be  sufficient 
here.  (Try  it.)  By  substitution, 

y(P)'  = u*-*  - 2u te-*  ~ 2\e~zt  = Aute~zt  + Ave~zt  + g. 


Equating  the  te  2t-terms  on  both  sides,  we  have  — 2u  = Au.  Hence  u is  an  eigenvector  of  A corresponding  to 
A = —2;  thus  [see  (5)]  u = a[  1 1]T  with  any  a =£  0.  Equating  the  other  terms  gives 


—6 

a 

2c  i 

— 3ci  + v2 

-6 

thus 

— 

= 

+ 

2 

a 

2v2 

Ci  - 3c2 

2 

Collecting  terms  and  reshuffling  gives 


Vi  ~ v2  = ~ a - 6 

—Vi  + u2  = ~a  + 2. 


By  addition,  0 = —2 a — 4,  a = —2,  and  then  i>2  = v±  + 4,  say,  V\  = k,  u2  = & + 4,  thus,  v = [k  k + 4]T. 
We  can  simply  choose  k — 0.  This  gives  the  answer 


l 

r 

Y 

0 

(5) 

V- 

II 

V! 

+ 

V! 

g 

II 

Ci 

i 

-2 1 , 

e + c2 

e_4‘  - 2 

1 

te  zt  + 

4 

For  other  k we  get  other  v;  for  instance,  k = —2  gives  v = [— 2 2]T,  so  that  the  answer  becomes 


l 

1 

1 

-2 

(5*) 

y = ci 

l 

1 

tc 

+ 

£ 

-1 

e~M  - 2 

1 

te  21  + 

2 
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Method  of  Variation  of  Parameters 

This  method  can  be  applied  to  nonhomogeneous  linear  systems 
(6)  y'  = A(t)y  + g (/) 


with  variable  A = A(?)  and  general  g(r).  It  yields  a particular  solution  y(p)  of  (6)  on  some 
open  interval  J on  the  f-axis  if  a general  solution  of  the  homogeneous  system  y = A(f)y 
on  J is  known.  We  explain  the  method  in  terms  of  the  previous  example. 


Solution  by  the  Method  of  Variation  of  Parameters 

Solve  (3)  in  Example  1. 

Solution.  A basis  of  solutions  of  the  homogeneous  system  is  [e~2t  e~2t]J  and  [e~4t  — e-4t]T.  Hence 

the  general  solution  (4)  of  the  homogeneous  system  may  be  written 


(7) 


Y(f)c. 


Here,  Y (?)  = [y(1)  y(2)]T  is  the  fundamental  matrix  (see  Sec.  4.2).  As  in  Sec.  2.10  we  replace  the  constant 
vector  c by  a variable  vector  u (t)  to  obtain  a particular  solution 


y(p)  = Y(f)u(f). 


Substitution  into  (3)  y'  = Ay  + g gives 


(8)  Y'u  + Yu'  = AYu  + g. 

Now  since  y(1)  and  y(2)  are  solutions  of  the  homogeneous  system,  we  have 

y(1)'  = AyU),  y(2)'  = Ay<2>,  thus  Y'  = AY. 
Hence  Y'u  = AYu,  so  that  (8)  reduces  to 

Yu'  = g.  The  solution  is  u'  = Y-1g; 


here  we  use  that  the  inverse  Y 1 of  Y (Sec.  4.0)  exists  because  the  determinant  of  Y is  the  Wronskian  W,  which 
is  not  zero  for  a basis.  Equation  (9)  in  Sec.  4.0  gives  the  form  of  Y-1, 


3 

1 

1 

i 

1 

i 

e2t 

-2 1 

— 2t 

2 

4 1 

4 1 

—e 

e 

e 

~e 

We  multiply  this  by  g,  obtaining 


~e2t 

e2t ' 

—6e~2t 

1 

-4 

-2 

e4t 

e 4t 

2e~2t 

2 

-8e2t 

— 4e2t 

Integration  is  done  componentwise  (just  as  differentiation)  and  gives 

u(f)  = 

(where  + 2 comes  from  the  lower  limit  of  integration).  From  this  and  Y in  (7)  we  obtain 

Yu  = 


f* 

-2 

—It 

dt  = 

Jo 

r4e2\ 

-le2t  + 2 

'e-2t 

e-u' 

-It 

—2te~2t  - 2e~2t  + 2e~u' 

—2t  - 2 

e 2t  + 

2 

e-2t 

—2e2t  + 2 

-2t e~2t  + 2e~2t  - 2e_4( 

—2t  + 2 

-2 
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The  last  term  on  the  right  is  a solution  of  the  homogeneous  system.  Hence  we  can  absorb  it  into  y(h\  We  thus 
obtain  as  a general  solution  of  the  system  (3),  in  agreement  with  (5*). 


1 

1 

1 

'-2' 

(9) 

y = ci 

1 

— 2t  , 

e + c2 

-1 

c_4t  - 2 

1 

te~21  + 

2 

FROB  EEM~S^~4=hS 


1.  Prove  that  (2)  includes  every  solution  of  (1). 

GENERAL  SOLUTION 


2-7 


Find  a general  solution.  Show  the  details  of  your  work. 

2.  y'i  — yi  + y2  + 10  cos  t 
y2  = 3yi  — y2  ~ 10  sin  t 

„ t ,3 1 

3-  yi  = y2  + e 

t i 3 1 

>'2  = y\  - ie 

4.  y'i  = 4y1  — 8y2  + 2 cosh  t 

y2  — 2yi  — 6y2  + cosh  t + 2 sinh  t 

5.  y[  = 4y1  + y2  + 0.6 1 
y'i  = 2yx  + 3y2  ~ 2.5 1 

6-  y[  = 4yz 

y2  = 4yi  — 16  t2  + 2 

7.  y\  = — 3yi  — 4v2  + Ilf  + 15 

y'z  — 5>’i  + 6y2  + 3e~(  — 15  f — 20 

8.  CAS  EXPERIMENT.  Undetermined  Coefficients. 
Find  out  experimentally  how  general  you  must  choose 
y(p),  in  particular  when  the  components  of  g have 
a different  form  (e.g.,  as  in  Prob.  7).  Write  a short 
report,  covering  also  the  situation  in  the  case  of  the 
modification  rule. 

9.  Undetermined  Coefficients.  Explain  why,  in  Example 
1 of  the  text,  we  have  some  freedom  in  choosing  the 
vector  v. 


10-15 


INITIAL  VALUE  PROBLEM 


Solve,  showing  details: 

10.  y[  = — 3yi  — 4y2  + 5e‘ 
y'i  = 5yi  + 6y2  - 6e‘ 
yi(0)  = 19,  y2(0)  = -23 

11.  yl  = y2  + 6c2t 

/ 2 1 

T2  = yi  - e 


15.  yi  = y!  + 2y2  + e2t  - 2t 

y'2  = -y2  + 1 + t 

yi(0)  = 1,  y2(0)  = -4 

16.  WRITING  PROJECT.  Undetermined  Coefficients. 

Write  a short  report  in  which  you  compare  the 
application  of  the  method  of  undetermined  coefficients 
to  a single  ODE  and  to  a system  of  ODEs,  using  ODEs 
and  systems  of  your  choice. 


17-20  NETWORK 

Find  the  currents  in  Fig.  99  (Probs.  17-19)  and  Fig.  100 
(Prob.  20)  for  the  following  data,  showing  the  details  of 
your  work. 

17.  ! = 2 O,  R2  = 8 O,  L = 1 H,  C = 0.5  F,  E = 200  V 

18.  Solve  Prob.  17  with  E = 440  sin  f V and  the  other  data 
as  before. 

19.  In  Prob.  17  find  the  particular  solution  when  currents 
and  charge  at  t = 0 are  zero. 


Fig.  99.  Problems  17-19 

20.  Rx  = 1 D,  R2  = 1.4  D,  Lx  = 0.8  H,  L2  = 1 H, 
E=  100  V,  /!(0)  = /2(0)  = 0 


yi(0)  = 1,  y2(0)  = 0 

12.  yl  = yi  + 4y2  — f2  + 6f 
y2  = yi  + ?2  ~ t2  + t - 1 
yi(0)  = 2,  y2(0)  = -1 

13.  yl  = y2  — 5 sin  t 

y'2  = — 4vi  + 17  cos  f 
yi(0)  = 5,  y2(0)  = 2 

14.  y'i  = 4v2  + 5e* 

y'2  = ~yi  ~ 20 e_t 

yi(0)  = 1,  y2(0)  = 0 


Fig.  100.  Problem  20 


164 


CHAP.  4 Systems  of  ODEs.  Phase  Plane.  Qualitative  Methods 


T I O N S AND  PROBLEMS 


1.  State  some  applications  that  can  be  modeled  by  systems 
of  ODEs. 

2.  What  is  population  dynamics?  Give  examples. 

3.  How  can  you  transform  an  ODE  into  a system  of  ODEs? 

4.  What  are  qualitative  methods  for  systems?  Why  are  they 
important? 

5.  What  is  the  phase  plane?  The  phase  plane  method?  A 
trajectory?  The  phase  portrait  of  a system  of  ODEs? 

6.  What  are  critical  points  of  a system  of  ODEs?  How  did 
we  classify  them?  Why  are  they  important? 

7.  What  are  eigenvalues?  What  role  did  they  play  in  this 
chapter? 

8.  What  does  stability  mean  in  general?  In  connection  with 
critical  points?  Why  is  stability  important  in  engineering? 

9.  What  does  linearization  of  a system  mean? 

10.  Review  the  pendulum  equations  and  their  linearizations. 


11-17 


GENERAL  SOLUTION.  CRITICAL  POINTS 


Find  a general  solution.  Determine  the  kind  and  stability  of 
the  critical  point. 


11-  y[  = 2y2 

>’2  = 8vi 


12.  y[  = 5yi 
y'z  = y2 


13.  y[  = -2>q  + 5y2 
y'z  = —yi  - 6 y2 


14.  y[  = 3yi  + 4v2 
y'z  = 3yi  + 2y2 


15.  y[  = — 3yi  — 2y2  16.  vI  = 4y2 

y'z  = -2yi  - 3y2  y2  = -4yi 

17.  y[  = -yj  + 2ya 
v2  = — 2yi  - v2 


24.  Mixing  problem.  Tank  7j  in  Fig.  101  initially  contains 
200  gal  of  water  in  which  160  lb  of  salt  are  dissolved. 
Tank  T2  initially  contains  100  gal  of  pure  water.  Liquid 
is  pumped  through  the  system  as  indicated,  and  the 
mixtures  are  kept  uniform  by  stirring.  Find  the  amounts 
of  salt  yi (t)  and  y2(f)  in  7j  and  T2,  respectively. 


Fig.  101.  Tanks  in  Problem  24 


25.  Network.  Find  the  currents  in  Fig.  102  when 
R = 2.5  O,  L = 1 H,  C = 0.04  F,  E{t)  = 169  sin  t V, 
h(0)  = 0,  /2(0)  = 0. 


26.  Network.  Find  the  currents  in  Fig.  103  when  R = 1 O, 
L = 1.25  H,  C = 0.2  F,  ?i(0)  = 1 A,  /2( 0)  = 1 A. 


18-19 


CRITICAL  POINT 


What  kind  of  critical  point  does 
eigenvalues 

18.  —4  and  2 


t 

y 


= Ay  have  if  A has  the 
19.  2 + 3 /,  2 - 3 i 


20-23 


NONHOMOGENEOUS  SYSTEMS 


Find  a general  solution.  Show  the  details  of  your  work. 

20.  y[  = 2y1  + 2y2  + e* 
y'z  = -2yx  - 3y2  + e* 

21.  y[  = 4y2 

y2  = 4yi  + 32 t2 


22.  y'i  = y!  + y2  + sin  t 
y'z  = 4yr  + y2 

23.  yi  = yi  + 4y2  — 2 cos  t 

y2  = yi  + y2  — cos  t + sin  r 


V V V 

\\ 

> 

> R c 

Fig.  103.  Network  in  Problem  26 


27-30 


LINEARIZATION 


Find  the  location  and  kind  of  all  critical  points  of  the  given 
nonlinear  system  by  linearization. 


27.  y[  = y2 

/ 3 

y2  = >t  - yi 

29.  y[  = — 4y2 
y2  = sin  V! 


28.  y'i  = cos  y2 
y2  = 3yi 

30.  yj  = 2y2  + 2y| 
y2  = — 8vi 


Summary  of  Chapter  4 
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SUMMARY  OT  CHAPTEr  A 

Systems  of  ODEs.  Phase  Plane.  Qualitative  Methods 


Whereas  single  electric  circuits  or  single  mass-spring  systems  are  modeled  by 
single  ODEs  (Chap.  2),  networks  of  several  circuits,  systems  of  several  masses 
and  springs,  and  other  engineering  problems  lead  to  systems  of  ODEs,  involving 
several  unknown  functions  y\(t),  • ■ • , yn(t).  Of  central  interest  are  first-order 
systems  (Sec.  4.2): 


y'i  = yi,  ■ ■ ■ , yn) 

y = f(r,  y),  in  components, 

yn  fnit,  y 1 ’ ‘ ‘ , Jn)’ 

to  which  higher  order  ODEs  and  systems  of  ODEs  can  be  reduced  (Sec.  4.1).  In 
this  summary  we  let  n = 2,  so  that 


, y'i  = flit,  yi,  y2) 

( 1 ) y = f(r,  y),  in  components, 

y'2  = hit,  yi,  y2). 

Then  we  can  represent  solution  curves  as  trajectories  in  the  phase  plane  (the 
yiy2-plane),  investigate  their  totality  [the  “phase  portrait ” of  (1)],  and  study  the  kind 
and  stability  of  the  critical  points  (points  at  which  both  f\  and  /2  are  zero),  and 
classify  them  as  nodes,  saddle  points,  centers,  or  spiral  points  (Secs.  4.3, 4.4).  These 
phase  plane  methods  are  qualitative;  with  their  use  we  can  discover  various  general 
properties  of  solutions  without  actually  solving  the  system.  They  are  primarily  used 
for  autonomous  systems,  that  is,  systems  in  which  t does  not  occur  explicitly. 

A linear  system  is  of  the  form 


an  ai2 

>’i 

81 

(2)  y = Ay  + g,  where  A = 

_a21  «22. 

, y = 

.A2. 

. g = 

.82. 

If  g = 0,  the  system  is  called  homogeneous  and  is  of  the  form 
(3)  y'  = Ay. 

If  an,  ■ ' ' , «22  are  constants,  it  has  solutions  y = xeAt,  where  A is  a solution  of  the 
quadratic  equation 


flu  — A ai2 
a21  a22  ~ A 


(an  — A)(a22  — A)  — ai2a2i  — 0 


166 


CHAP.  4 Systems  of  ODEs.  Phase  Plane.  Qualitative  Methods 


and  x + 0 has  components  x±,  x2  determined  up  to  a multiplicative  constant  by 

(an  - A)*!  + a12x2  = 0. 

(These  A’s  are  called  the  eigenvalues  and  these  vectors  x eigenvectors  of  the 
matrix  A.  Further  explanation  is  given  in  Sec.  4.0.) 

A system  (2)  with  g 0 is  called  nonhomogeneous.  Its  general  solution  is  of 
the  form  y = y^  + yp,  where  y^  is  a general  solution  of  (3)  and  yp  a particular 
solution  of  (2).  Methods  of  determining  the  latter  are  discussed  in  Sec.  4.6. 

The  discussion  of  critical  points  of  linear  systems  based  on  eigenvalues  is 
summarized  in  Tables  4.1  and  4.2  in  Sec.  4.4.  It  also  applies  to  nonlinear  systems 
if  the  latter  are  first  linearized.  The  key  theorem  for  this  is  Theorem  1 in  Sec.  4.5, 
which  also  includes  three  famous  applications,  namely  the  pendulum  and  van  der 
Pol  equations  and  the  Lotka-Volterra  predator-prey  population  model. 


CHAPTER  5 

Series  Solutions  of  ODEs. 
Special  Functions 

In  the  previous  chapters,  we  have  seen  that  linear  ODEs  with  constant  coefficients  can  be 
solved  by  algebraic  methods,  and  that  their  solutions  are  elementary  functions  known  from 
calculus.  For  ODEs  with  variable  coefficients  the  situation  is  more  complicated,  and  their 
solutions  may  be  nonelementary  functions.  Legendre’s , Bessel’s,  and  the  hypergeometric 
equations  are  important  ODEs  of  this  kind.  Since  these  ODEs  and  their  solutions,  the 
Legendre  polynomials,  Bessel  functions,  and  hypergeometric  functions,  play  an  important 
role  in  engineering  modeling,  we  shall  consider  the  two  standard  methods  for  solving 
such  ODEs. 

The  first  method  is  called  the  power  series  method  because  it  gives  solutions  in  the 
form  of  a power  series  «o  + fli-x  + a^x2  + a%x3  + 

The  second  method  is  called  the  Frobenius  method  and  generalizes  the  first;  it  gives 
solutions  in  power  series,  multiplied  by  a logarithmic  term  In  x or  a fractional  power  xr, 
in  cases  such  as  Bessel’s  equation,  in  which  the  first  method  is  not  general  enough. 

All  those  more  advanced  solutions  and  various  other  functions  not  appearing  in  calculus 
are  known  as  higher  functions  or  special  functions,  which  has  become  a technical  term. 
Each  of  these  functions  is  important  enough  to  give  it  a name  and  investigate  its  properties 
and  relations  to  other  functions  in  great  detail  (take  a look  into  Refs.  [GenRefl], 
[GenReflO],  or  [All]  in  App.  1).  Your  CAS  knows  practically  all  functions  you  will  ever 
need  in  industry  or  research  labs,  but  it  is  up  to  you  to  find  your  way  through  this  vast 
terrain  of  formulas.  The  present  chapter  may  give  you  some  help  in  this  task. 

COMMENT.  You  can  study  this  chapter  directly  after  Chap.  2 because  it  needs  no 
material  from  Chaps.  3 or  4. 

Prerequisite:  Chap.  2. 

Section  that  may  be  omitted  in  a shorter  course:  5.5. 

References  and  Answers  to  Problems:  App.  1 Part  A,  and  App.  2. 


5.1  Power  Series  Method 


The  power  series  method  is  the  standard  method  for  solving  linear  ODEs  with  variable 
coefficients.  It  gives  solutions  in  the  form  of  power  series.  These  series  can  be  used 
for  computing  values,  graphing  curves,  proving  formulas,  and  exploring  properties  of 
solutions,  as  we  shall  see.  In  this  section  we  begin  by  explaining  the  idea  of  the  power 
series  method. 
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EXAMPLE  1 


EXAMPLE  2 


From  calculus  we  remember  that  a power  series  (in  powers  of  x — x(l)  is  an  infinite 
series  of  the  form 


oo 

(1)  2 am(x  - x0)m  = a0  + ax(x  - x0)  + a2(x  - x0  f + • • • . 

m = 0 


Here,  x is  a variable.  a0, a±,  a2,  ■ ■ • are  constants,  called  the  coefficients  of  the  series. 
Xo  is  a constant,  called  the  center  of  the  series.  In  particular,  if  Xq  = 0,  we  obtain  a power 
series  in  powers  of  x 


(2)  ^ = flo  + aix  + a2x2  + a^x3  + • ■ ■ . 

m = 0 

We  shall  assume  that  all  variables  and  constants  are  real. 

We  note  that  the  term  “power  series”  usually  refers  to  a series  of  the  form  (1)  [or  (2)] 
but  does  not  include  series  of  negative  or  fractional  powers  of  x.  We  use  m as  the 
summation  letter,  reserving  n as  a standard  notation  in  the  Legendre  and  Bessel  equations 
for  integer  values  of  the  parameter. 


Familiar  Power  Series  are  the  Maclaurin  series 
1 


1 - x 


= 2 t”  = 1 + x 


m = 0 

* “ xm  x2  x3 

ex  = V =l+rH 1 H 

m=o  m!  2!  3! 

^ x2  , x4 

COS  X = >,  =1 1 

(2m)!  2!  4! 

m = 0 v ’ 


1mv2m+l  ^.3 

= X — 


= - (~l)mx 

mto  (2m  + 1)!  3! 


(|x|  < 1 , geometric  series) 


Idea  and  Technique  of  the  Power  Series  Method 

The  idea  of  the  power  series  method  for  solving  linear  ODEs  seems  natural,  once  we 
know  that  the  most  important  ODEs  in  applied  mathematics  have  solutions  of  this  form. 
We  explain  the  idea  by  an  ODE  that  can  readily  be  solved  otherwise. 


Power  Series  Solution.  Solve  y'  — y = 0. 
Solution.  In  the  first  step  we  insert 


(2) 


y = a0  + ayx  + a^x2  + a3x3  + ■ • • = 2 Out” 

m = 0 


SEC.  5.1  Power  Series  Method 
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EXAMPLE  3 


and  the  series  obtained  by  termwise  differentiation 


(3) 


y'  = a1  + 2 a2x  + 3a2x2  + ■ ■ ■ = 2 niamxm  1 

ra=  1 


into  the  ODE: 


(01  + 2«2*  + 3tf3*2  + • • • ) — (#0  + 0i*  + #2*2  + • * • ) = 0. 


Then  we  collect  like  powers  of  x,  finding 

(«i  - a0)  + (2a2  ~ aj)x  + (3a3  - a2)x2  + • • ■ =0. 

Equating  the  coefficient  of  each  power  of  x to  zero,  we  have 

ct\  — CIq  = 0,  2^2  — 0i  = 0,  3«3  — 02  = 0,  • * * . 

Solving  these  equations,  we  may  express  01,  02,  • • • in  terms  of  ciq,  which  remains  arbitrary: 


01  0O 

01  = 0O,  a2  — = — , 
2 2! 


02  0O 


With  these  values  of  the  coefficients,  the  series  solution  becomes  the  familiar  general  solution 


00  2 3 

y = 00  + 0o*  ”1 * ”1 * + ' 

2!  3! 


*2  *3\ 

= 0o  ( 1 + x H h — ) = 00^. 

1 2!  V.) 


Test  your  comprehension  by  solving  y + y = 0 by  power  series.  You  should  get  the  result 
y = a 0 cos  x + 0i  sin  x. 


We  now  describe  the  method  in  general  and  justify  it  after  the  next  example.  For  a given 
ODE 

(4)  y"  + p(x)y'  + q(x)y  = 0 

we  first  represent  p(x)  and  q(x)  by  power  series  in  powers  of  x (or  of  x — xo  if  solutions 
in  powers  of  x — xo  are  wanted).  Often  p(x)  and  q(x)  are  polynomials,  and  then  nothing 
needs  to  be  done  in  this  first  step.  Next  we  assume  a solution  in  the  form  of  a power  series 
(2)  with  unknown  coefficients  and  insert  it  as  well  as  (3)  and 


(5)  y"  = 2az  + 3 ■ 2azx  + 4 • 3fl4.r2  + • • • = ^ wt(m  — 1 )amxm  2 

m = 2 

into  the  ODE.  Then  we  collect  like  powers  of  x and  equate  the  sum  of  the  coefficients  of 
each  occurring  power  of  x to  zero,  starting  with  the  constant  terms,  then  taking  the  terms 
containing  x,  then  the  terms  in  x2,  and  so  on.  This  gives  equations  from  which  we  can 
determine  the  unknown  coefficients  of  (3)  successively. 


A Special  Legendre  Equation.  The  ODE 

(1  — x2)y"  — 2xy'  + 2y  = 0 


occurs  in  models  exhibiting  spherical  symmetry.  Solve  it. 
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Solution.  Substitute  (2),  (3),  and  (5)  into  the  ODE.  (1  — x2)y"  gives  two  series,  one  for  y"  and  one  for 
—x2y".  In  the  term  —2 xy'  use  (3)  and  in  2 y use  (2).  Write  like  powers  of  x vertically  aligned.  This  gives 

y"  = 2^2  + 6a^x  + lla^x2  + 20a^x3  + SOciqx4  + • • • 

—x2y"  = — 2ci2X2  — 6a^x3  — \2a±x^  — • • • 

—2xyr  = — 2a\x  — Aazx2  — 6a^x3  — 804.x4  — • • • 

2y  = 2cio  + 2ciix  + 2ci2X2  + 2a^x3  + 204X4  + • • • . 

Add  terms  of  like  powers  of  x.  For  each  power  jc°,  x,  x2,  • • • equate  the  sum  obtained  to  zero.  Denote  these  sums 
by  [0]  (constant  terms),  [1]  (first  power  of  x),  and  so  on: 


Sum 

Power 

Equations 

[0] 

U°] 

a2  = 

[1] 

M 

a3  = 

0 

[2] 

fv2] 

II 

<3 

<N 

4 a2. 

«4  = 

4 

12  a2  ~ 

3a0 

[3] 

[A3] 

«5  = 

0 

since 

a3  = 

0 

[4] 

[A'4] 

30q6  = 

oo 

a6  = 

W|M 

olco 

Si 

II 

30  (— 3)ao 

This  gives  the  solution 


y = axx  + a0(l  - xz  - 3x4  - g*6  - • • ■ ). 

ciq  and  a i remain  arbitrary.  Hence,  this  is  a general  solution  that  consists  of  two  solutions:  x and 
1 — x2  — Jjc4  These  two  solutions  are  members  of  families  of  functions  called  Legendre  polynomials 

Pn(x)  and  Legendre  functions  Qn(x);  here  we  have  x — P\(x)  and  1 — x2  — g*4  — gx  — = —Q\{x).  The 

minus  is  by  convention.  The  index  1 is  called  the  order  of  these  two  functions  and  here  the  order  is  1 . More  on 
Legendre  polynomials  in  the  next  section. 


Theory  of  the  Power  Series  Method 

The  nth  partial  sum  of  (1)  is 

(6)  sn{x)  = ciq  + cii(x  — x0)  + a2(x  — Xq)2  + • • • + an(x  — x0)n 

where  n = 0,  1,  • • • . If  we  omit  the  terms  of  sn  from  (1),  the  remaining  expression  is 

(7)  Rn(x)  = an+i(x  - x0)n+1  + an+2(x  - x0)n+2  + • • • . 

This  expression  is  called  the  remainder  of(l)  after  the  term  an(x  — x0)n. 

For  example,  in  the  case  of  the  geometric  series 

1 + X + x2  + •••  + xn  + ••• 

we  have 

.s'o  = 1 , Ro  = x + x2  + x3  + ■ ■ ■ , 

= 1 + X,  Ri  = x2  + x3  + x4  + ■ ■ ■ , 

s2  = 1 + x + x2,  R2  = x3  + x4  + x5  + • ■ • , 


etc. 
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In  this  way  we  have  now  associated  with  (1)  the  sequence  of  the  partial  sums 
.v0(x),  .s'i(jc),  ,v2(x),  • • • . If  for  some  x = X\  this  sequence  converges,  say, 

lim  sn(x  i)  = s(xi), 

n— » °° 

then  the  series  (1)  is  called  convergent  at  x = x±,  the  number  s(x  i)  is  called  the  value 
or  sum  of  (1)  at  xi,  and  we  write 


oo 

■SOl)  = 2 am(x  1 “ xo)m- 

m = 0 


Then  we  have  for  every  n, 

(8)  4*l)  = Sn(x  i)  + Rnixi). 

If  that  sequence  diverges  at  x = the  series  (1)  is  called  divergent  at  x = X\. 

In  the  case  of  convergence,  for  any  positive  e there  is  an  N (depending  on  e)  such  that, 
by  (8) 

(9)  |/?n(jt1)|  = U(xi)  — sm(xi)|  < e for  all /?  > N. 

Geometrically,  this  means  that  all  sn(x i)  with  n > N lie  between  s(x  ] ) — e and  s(x  j ) + e 
(Fig.  104).  Practically,  this  means  that  in  the  case  of  convergence  we  can  approximate  the 
sum  s(xi)  of  (1)  at  xi  by  sn(x i)  as  accurately  as  we  please,  by  taking  n large  enough. 


sOq)  - e s(*j)  stej)  - 

Fig.  104.  Inequality  (9) 


Where  does  a power  series  converge?  Now  if  we  choose  x = A'()  in  ( I ),  the  series  reduces 
to  the  single  term  a0  because  the  other  terms  are  zero.  Hence  the  series  converges  at  x0. 
In  some  cases  this  may  be  the  only  value  of  x for  which  (1)  converges.  If  there  are  other 
values  of  x for  which  the  series  converges,  these  values  form  an  interval,  the  convergence 
interval.  This  interval  may  be  finite,  as  in  Fig.  105,  with  midpoint  xo-  Then  the  series  (1) 
converges  for  all  x in  the  interior  of  the  interval,  that  is,  for  all  x for  which 


(10) 


|x  — xqI  < R 


and  diverges  for  |x  — XqI  > R-  The  interval  may  also  be  infinite,  that  is,  the  series  may 
converge  for  all  x. 


Divergence 

■< Conve 

rgence >- 

Divergence 

Fig.  105.  Convergence  interval  (10)  of  a power  series  with  center  x, 
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The  quantity  R in  Fig.  105  is  called  the  radius  of  convergence  (because  for  a complex 
power  series  it  is  the  radius  of  disk  of  convergence).  If  the  series  converges  for  all  x,  we 
set  R = oo  (and  l/R  = 0). 

The  radius  of  convergence  can  be  determined  from  the  coefficients  of  the  series  by 
means  of  each  of  the  formulas 


(ID 


(a) 


(b) 


R = 1 / lim 

/ ra— ><» 


^m+ 1 


provided  these  limits  exist  and  are  not  zero.  [If  these  limits  are  infinite,  then  (1)  converges 
only  at  the  center  x0.] 


EXAMPLE  4 Convergence  Radius  R.  = , 1,  0 


For  all  three  series  let  m — > °o 


ex  = y — = l + X + — + ■ 
m)  2' 

m = 0 


am+ 1 


1 /(m  + 1)!  _ 1 

1/m!  m + 1 


0,  R = 


1 

1 - x 


= ^xm=\+x  + x2 


am+ 1 


1 


R = 1 


2 m\xm  = 1 + x + 2x2  + ■ ■ ■ , 

m = 0 


am+ 1 


(m  + 1)! 


= m 4-  1 — > go, 


/?  = 0. 


Convergence  for  all  x (R  = °°)  is  the  best  possible  case,  convergence  in  some  finite  interval  the  usual,  and 
convergence  only  at  the  center  ( R = 0)  is  useless. 


When  do  power  series  solutions  exist?  Answer:  if  p,  q,  r in  the  ODEs 
(12)  y"  + p(x)y'  + q(x)y  = r(x) 

have  power  series  representations  (Taylor  series).  More  precisely,  a function /(.r)  is  called 
analytic  at  a point  x = x®  if  it  can  be  represented  by  a power  series  in  powers  of  x — xq 
with  positive  radius  of  convergence.  Using  this  concept,  we  can  state  the  following  basic 
theorem,  in  which  the  ODE  (12)  is  in  standard  form,  that  is,  it  begins  with  the  y . If 
your  ODE  begins  with,  say,  h(x)y  , divide  it  first  by  h(x)  and  then  apply  the  theorem  to 
the  resulting  new  ODE. 


THEOREM  1 


Existence  of  Power  Series  Solutions 

If  p,  q , and  r in  (12)  are  analytic  at  x = Xo,  then  every  solution  of  { 12)  is  analytic 
at  x = Xq  and  can  thus  be  represented  by  a power  series  in  powers  of  x — x0  with 
radius  of  convergence  R > 0. 


The  proof  of  this  theorem  requires  advanced  complex  analysis  and  can  be  found  in  Ref. 
[All]  listed  in  App.  1. 

We  mention  that  the  radius  of  convergence  R in  Theorem  1 is  at  least  equal  to  the  distance 
from  the  point  x = x0  to  the  point  (or  points)  closest  to  x0  at  which  one  of  the  functions 
p,  q,  r,  as  functions  of  a complex  variable,  is  not  analytic.  (Note  that  that  point  may  not 
lie  on  the  x-axis  but  somewhere  in  the  complex  plane.) 
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Further  Theory:  Operations  on  Power  Series 

In  the  power  series  method  we  differentiate,  add,  and  multiply  power  series,  and  we  obtain 
coefficient  recursions  (as,  for  instance,  in  Example  3)  by  equating  the  sum  of  the 
coefficients  of  each  occurring  power  of  x to  zero.  These  four  operations  are  permissible 
in  the  sense  explained  in  what  follows.  Proofs  can  be  found  in  Sec.  15.3. 

1.  Termwise  Differentiation.  A power  series  may  be  differentiated  term  by  term.  More 
precisely:  if 


y(x)  = ^ am(x  ~ x0)m 

m = 0 

converges  for  \x  — Xq\  < R,  where  R > 0,  then  the  series  obtained  by  differentiating  term 
by  term  also  converges  for  those  x and  represents  the  derivative  y'  of  y for  those  x: 


y'(x)  = ^ mamix  - *0)m  1 (U  - x0\  < R). 

m=  1 

Similarly  for  the  second  and  further  derivatives. 

2.  Termwise  Addition.  Two  power  series  may  be  added  term  by  term.  More  precisely: 
if  the  series 


(13) 


2 a-m(x  ~ x0)m  and  2 bm(x  ~ x0)m 

771  = 0 771  = 0 


have  positive  radii  of  convergence  and  their  sums  are  /(x)  and  g(x),  then  the  series 


(a7n  ^m)C^  *o) 

m = 0 

converges  and  represents /(x)  + g(x)  for  each  x that  lies  in  the  interior  of  the  convergence 
interval  common  to  each  of  the  two  given  series. 

3.  Termwise  Multiplication.  Two  power  series  may  be  multiplied  term  by  term.  More 
precisely:  Suppose  that  the  series  (13)  have  positive  radii  of  convergence  and  let  f(x)  and 
g(x)  be  their  sums.  Then  the  series  obtained  by  multiplying  each  term  of  the  first  series 
by  each  term  of  the  second  series  and  collecting  like  powers  of  x — xo,  that  is, 

a0b0  + (flo^i  + ai^oX*  “ *o)  + (a0b2  + afb±  + a2b0)(x  - x0f  + ■■■ 


3"  ^ 1 bt)l  — \ T * ‘ ' T amb0)(x  X0) 

777  = 0 

converges  and  represents  f(x)g(x)  for  each  x in  the  interior  of  the  convergence  interval  of 
each  of  the  two  given  series. 
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4.  Vanishing  of  All  Coefficients  (“ Identity  Theorem  for  Power  Series.”)  If  a power 
series  has  a positive  radius  of  convergent  convergence  and  a sum  that  is  identically  zero 
throughout  its  interval  of  convergence,  then  each  coefficient  of  the  series  must  be  zero. 


PROBLE^~SrE'F~5T1 


1.  WRITING  AND  LITERATURE  PROJECT.  Power 
Series  in  Calculus,  (a)  Write  a review  (2-3  pages)  on 
power  series  in  calculus.  Use  your  own  formulations  and 
examples — do  not  just  copy  from  textbooks.  No  proofs, 
(b)  Collect  and  arrange  Maclaurin  series  in  a systematic 
list  that  you  can  use  for  your  work. 


2-5 


REVIEW:  RADIUS  OF  CONVERGENCE 


Determine  the  radius  of  convergence.  Show  the  details  of 
your  work. 


2.  ^ im  + l)mxm 


3. 


2 

m = 0 


(-D” 

km 


4-  i 

m = 0 


j.2m+l 

(2m  + 1)! 


5-  i 

m = 0 


6-9 


SERIES  SOLUTIONS  BY  HAND 


Apply  the  power  series  method.  Do  this  by  hand,  not  by  a 
CAS,  to  get  a feel  for  the  method,  e.g.,  why  a series  may 
terminate,  or  has  even  powers  only,  etc.  Show  the  details. 


6.  (1  + x)y  = y 

7.  y = -2xy 


8.  xy'  —3 y = k(=  const) 


9.  y"  + y = 0 


10-14 


SERIES  SOLUTIONS 


Find  a power  series  solution  in  powers  of  x.  Show  the  details. 


10.  y"  — y + xy  = 0 

11.  y"  — y + x2y  = 0 

12.  (1  - x2)y"  - 2xy'  + 2y  = 0 

13.  y"  + (1  + x2)y  = 0 

14.  y"  — 4xy'  + (4jc2  — 2)y  = 0 


15.  Shifting  summation  indices  is  often  convenient  or 
necessary  in  the  power  series  method.  Shift  the  index 
so  that  the  power  under  the  summation  sign  is  xm. 
Check  by  writing  the  first  few  terms  explicity. 


“ sfs  + 1) 

2^ — \ 
s = 2 ^ + 1 


„P  + 4 


„-i  (P+  D! 


16-19 


CAS  PROBLEMS.  IVPs 


Solve  the  initial  value  problem  by  a power  series.  Graph 
the  partial  sums  of  the  powers  up  to  and  including  x5.  Find 
the  value  of  the  sum  s (5  digits)  at  x\. 


16.  y + 4y  = 1,  y(0)  = 1.25,  xx  = 0.2 


17.  y"  + 3xy'  + 2y  = 0,  y(0)  = 1,  y'(0)  = 1, 
x = 0.5 


18.  (1  - x2)y"  - 2xy  + 30y  = 0,  y(0)  = 0, 
y'(0)  = 1.875,  X!  = 0.5 

19.  (x  — 2)y ' = xy,  y(0)  = 4,  jci  = 2 


20.  CAS  Experiment.  Information  from  Graphs  of 
Partial  Sums.  In  numerics  we  use  partial  sums  of 
power  series.  To  get  a feel  for  the  accuracy  for  various 
x,  experiment  with  sin  x.  Graph  partial  sums  of  the 
Maclaurin  series  of  an  increasing  number  of  terms, 
describing  qualitatively  the  “breakaway  points”  of 
these  graphs  from  the  graph  of  sin  x.  Consider  other 
Maclaurin  series  of  your  choice. 


Fig.  106.  CAS  Experiment  20.  sin  x and  partial 
sums  s3,  s5,  s7 
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Legendres  Equation. 

Legendre  Polynomials  Pn(x) 

Legendre’s  differential  equation1 

(1)  (1  — xz)y"  — 2 xy'  + n(n  + l)y  = 0 (n  constant) 

is  one  of  the  most  important  ODEs  in  physics.  It  arises  in  numerous  problems,  particularly 
in  boundary  value  problems  for  spheres  (take  a quick  look  at  Example  1 in  Sec.  12.10). 

The  equation  involves  a parameter  n,  whose  value  depends  on  the  physical  or 
engineering  problem.  So  (1)  is  actually  a whole  family  of  ODEs.  For  n = 1 we  solved  it 
in  Example  3 of  Sec.  5.1  (look  back  at  it).  Any  solution  of  (1)  is  called  a Legendre  function. 
The  study  of  these  and  other  “higher”  functions  not  occurring  in  calculus  is  called  the 
theory  of  special  functions.  Further  special  functions  will  occur  in  the  next  sections. 

Dividing  (1)  by  1 — x2,  we  obtain  the  standard  form  needed  in  Theorem  1 of  Sec.  5.1 
and  we  see  that  the  coefficients  —2x/(l  — x2)  and  n(n  + 1 )/( 1 — x2)  of  the  new  equation 
are  analytic  at  x = 0,  so  that  we  may  apply  the  power  series  method.  Substituting 


(2)  y = 2 amxm 

m = 0 

and  its  derivatives  into  (1),  and  denoting  the  constant  n(n  +1)  simply  by  k,  we  obtain 

OO  00  00 

(1  — x2)  2 m(m  — 1 )amxm~2  ~ 2x  2 mamxm~1  + k 2 amXm  = 0. 

m= 2 m=  1 m= 0 

By  writing  the  first  expression  as  two  separate  series  we  have  the  equation 

OO  OO  OO  00 

m(m  — 1 )amxm~2  — ^ m(m  ~ 1 )amxm  — ^ 2 ma7nxm  + 2 kamxm  = 0. 

m= 2 m= 2 m= 1 m= 0 

It  may  help  you  to  write  out  the  first  few  terms  of  each  series  explicitly,  as  in  Example  3 
of  Sec.  5.1;  or  you  may  continue  as  follows.  To  obtain  the  same  general  power  xs  in  all 
four  series,  set  m — 2 = s (thus  m = s + 2)  in  the  first  series  and  simply  write  .v  instead 
of  m in  the  other  three  series.  This  gives 


2 (s  + 2)(j  + l)fls+2^s  — 2 s(s  - l)asxs  — 2 2 sasxs  + 2 kasxs  = 0. 

s=0  s=2  s=l  s= 0 


1ADRIEN-MARIE  LEGENDRE  (1752-1833),  French  mathematician,  who  became  a professor  in  Paris  in 
1775  and  made  important  contributions  to  special  functions,  elliptic  integrals,  number  theory,  and  the  calculus 
of  variations.  His  book  Elements  de  geometrie  (1794)  became  very  famous  and  had  12  editions  in  less  than 
30  years. 

Formulas  on  Legendre  functions  may  be  found  in  Refs.  [GenRefl]  and  [GenReflO]. 
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(Note  that  in  the  first  series  the  summation  begins  with  s = 0.)  Since  this  equation  with 
the  right  side  0 must  be  an  identity  in  x if  (2)  is  to  be  a solution  of  (1),  the  sum  of  the 
coefficients  of  each  power  of  x on  the  left  must  be  zero.  Now  x°  occurs  in  the  first  and 
fourth  series  only,  and  gives  [remember  that  k = n(n  + 1 )] 

(3a)  2 • 1 a2  + n(n  + l)ao  = 0. 

x1  occurs  in  the  first,  third,  and  fourth  series  and  gives 

(3b)  3 • 2a3  + [—2  + n(n  + l)]fli  = 0. 

The  higher  powers  x2,  x3,  ■ ■ ■ occur  in  all  four  series  and  give 

(3c)  (s  + 2 )(s  + l)as+2  + [— 5(5  — 1)  — 2s  + nin  + l)]as  = 0. 

The  expression  in  the  brackets  [ • • • ] can  be  written  in  — s)(n  + s + 1),  as  you  may 
readily  verify.  Solving  (3a)  for  a2  and  (3b)  for  a3  as  well  as  (3c)  for  as+ 2,  we  obtain  the 
general  formula 


(4) 


fls+2  “ 


in  — s)(n  + j+l) 
(s  + 2)(s  + 1) 


(J  = 0,  I,---)- 


This  is  called  a recurrence  relation  or  recursion  formula.  (Its  derivation  you  may  verify 
with  your  CAS.)  It  gives  each  coefficient  in  terms  of  the  second  one  preceding  it,  except 
for  a0  and  a\,  which  are  left  as  arbitrary  constants.  We  find  successively 


n(n  +1) 

a2=  ao 


(n  - 2 )(n  + 3) 
fl4  = 4^"^  °2 

(n  — 2)n(n  + 1 )(n  + 3) 


(n  - 1 ){n  + 2) 
a3  = al 

(n  - 3 )(n  + 4) 

°5  = 5~^4  °3 

(n  - 3 )(n  - 1 )(n  + 2 )(n  + 4) 


and  so  on.  By  inserting  these  expressions  for  the  coefficients  into  (2)  we  obtain 

(5)  y(x)  = a0>’i(x)  + a^ix) 

where 


n(n  + 1)  in  — 2 )n(n  + 1 )(n  + 3) 

(6)  yi W = 1 — xz  H — x4  — + 


2! 


4! 


in  - l)(n  + 2)  „ in  - 3 )(«  - l)(n  + 2)(n  + 4) 


-x3  + 


(7)  y2(x)  = x - 


3! 


5! 


- + • • • . 
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These  series  converge  for  |x|  < 1 (see  Prob.  4;  or  they  may  terminate,  see  below).  Since 
(6)  contains  even  powers  of  x only,  while  (7)  contains  odd  powers  of  x only,  the  ratio 
yi/y’2  is  not  a constant,  so  that  yi  and  >’2  are  not  proportional  and  are  thus  linearly 
independent  solutions.  Hence  (5)  is  a general  solution  of  (1)  on  the  interval  —1  < x < 1. 

Note  that  x = ± 1 are  the  points  at  which  1 — x2  = 0,  so  that  the  coefficients  of  the 
standardized  ODE  are  no  longer  analytic.  So  it  should  not  surprise  you  that  we  do  not  get 
a longer  convergence  interval  of  (6)  and  (7),  unless  these  series  terminate  after  finitely 
many  powers.  In  that  case,  the  series  become  polynomials. 


Polynomial  Solutions.  Legendre  Polynomials  Pn(x) 

The  reduction  of  power  series  to  polynomials  is  a great  advantage  because  then  we  have 
solutions  for  all  x,  without  convergence  restrictions.  For  special  functions  arising  as 
solutions  of  ODEs  this  happens  quite  frequently,  leading  to  various  important  families  of 
polynomials;  see  Refs.  [GenRefl],  [GenReflO]  in  App.  1.  For  Legendre’s  equation  this 
happens  when  the  parameter  n is  a nonnegative  integer  because  then  the  right  side  of  (4) 
is  zero  for  5 = n,  so  that  an  + 2 = 0,  an+ 4 = 0,  an+g  = 0,  • • • . Hence  if  n is  even,  yi(x) 
reduces  to  a polynomial  of  degree  11.  If  n is  odd,  the  same  is  true  for  y^ix).  These 
polynomials,  multiplied  by  some  constants,  are  called  Legendre  polynomials  and  are 
denoted  by  Pn(x).  The  standard  choice  of  such  constants  is  done  as  follows.  We  choose 
the  coefficient  an  of  the  highest  power  xn  as 

(2n) ! 1 • 3 • 5 •••(2n  - 1) 

(8)  an  = 2„  2 = («  a positive  integer) 

(and  an  = 1 if  n = 0).  Then  we  calculate  the  other  coefficients  from  (4),  solved  for  as  in 
terms  of  as  + 2,  that  is, 


(9) 


(s  + 2)(s  + 1) 

(n  — s)(n  + 5+1) 


(5  S=  n — 2). 


The  choice  (8)  makes  pn(  1)  = 1 for  every  n (see  Fig.  107);  this  motivates  (8).  From  (9) 
with  s = n — 2 and  (8)  we  obtain 


n(n  — 1)  n(n  — 1)  (2  n)\ 

2(2 n - 1)  Un  ~ ~ 2(2 n - 1)  ’ 2”(n!)2 


Using  (2  n)\  = 2n(2n  — 1)(2  n — 2)!  in  the  numerator  and  n\  = n(n  — 1)!  and 
n\  = n(n  — I )(n  — 2)1  in  the  denominator,  we  obtain 


n(n  - 1 )2n(2n  - 1)(2 n - 2)\ 

2(2 n ~ 1)2 nn(n  - 1)!  n(n  - 1 )(n  - 2)! ' 


n(n  — I )2n(2n  — 1)  cancels,  so  that  we  get 


(2 n - 2)\ 

2 n(n  - 1)!  (n  - 2)! ' 
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Similarly, 


®n— 4 


(n  - 2)(n  - 3) 
4(2 n - 3) 


®n— 2 


(2n  - 4)! 

2m2!  (n  - 2)!  (n  - 4)! 


and  so  on,  and  in  general,  when  n — 2m  0, 


(10) 


2m 


(2n  — 2m)! 

(-1)™ ■ 

2m/«!  (n  — m)\  (n  — 2m)! 


The  resulting  solution  of  Legendre’s  differential  equation  (1)  is  called  the  Legendre 
polynomial  of  degree  n and  is  denoted  by  Pn(x). 

From  (10)  we  obtain 


M 


(2  n — 2m)! 


(ID 


pn(x)  = y (-i)m- 

m=0  2nm\  (n  - m)\  ( n - 2m)! 


n—2m 


(2m)! 
2w(n ! )2 


(2n  - 2)! 

2nl!  (n  - 1)!  (n  - 2)\ 


-xn~2  + 


where  M = n/2  or  (n  — 1 )/ 2,  whichever  is  an  integer.  The  first  few  of  these  functions 
are  (Fig.  107) 


P0{x)  = 1, 

(11')  P2(x)  = |(3x2  - 1), 

F4(x)  = g(35x4  - 30x2  + 3), 


I\  (x)  = X 

P3W  = i(5x3  - 3x) 

Psi*)  = g(63x5  - 70x3  + 15x) 


and  so  on.  You  may  now  program  (11)  on  your  CAS  and  calculate  Pn(x)  as  needed. 


Fig.  107.  Legendre  polynomials 
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The  Legendre  polynomials  Pn(x ) are  orthogonal  on  the  interval  — 1 1 , a basic 

property  to  be  defined  and  used  in  making  up  “Fourier-Legendre  series”  in  the  chapter 
on  Fourier  series  (see  Secs.  1 1.5-11.6). 


PROBLEM  SET  52 


1-5 


LEGENDRE  POLYNOMIALS  AND 
FUNCTIONS 


1.  Legendre  functions  for  n = 0.  Show  that  (6)  with 
n = 0 gives  Pq(x)  = 1 and  (7)  gives  (use  In  (1  + x)  = 
■) 


X - \x2  + |x3  + 


1 

y2(x)  = x + -x 


3 


1 + X 
1 — X 


Verify  this  by  solving  (1)  with  n = 0,  setting  z — y 
and  separating  variables. 

2.  Legendre  functions  for  n = 1.  Show  that  (7)  with 

n = I gives  y2(x)  = P\{x)  = x and  (6)  gives 


yi  = i — *2 


1 

= 1 x In 

2 


1 + x 
1 — x 


3.  Special  n.  Derive  (lL)  from  (11). 

4.  Legendre’s  ODE.  Verify  that  the  polynomials  in  (lL) 
satisfy  (1). 

5.  Obtain  P6  and  P-j. 


6-9 


CAS  PROBLEMS 


6.  Graph  P2(x),  ■ ■ ■ , Tiolx)  on  common  axes.  For  what  x 
(approximately)  and  n = 2,  ■ ■ ■ , 10  is  Pn(x)  < g? 

7.  From  what  n on  will  your  CAS  no  longer  produce 
faithful  graphs  of  Pn(x)?  Why? 

8.  Graph  Qo(x),  Q i(x),  and  some  further  Legendre 
functions. 

9.  Substitute  asxs  + as+ixs+1  + as+2xs+2  into  Legen- 
dre’s equation  and  obtain  the  coefficient  recursion  (4). 


10.  TEAM  PROJECT.  Generating  Functions.  Generating 
functions  play  a significant  role  in  modem  applied 
mathematics  (see  [GenRef5]).  The  idea  is  simple.  If  we 
want  to  study  a certain  sequence  ( fn(x))  and  can  find  a 
function 


G(u,  x)  = 2 fn(x)un, 

n = 0 

we  may  obtain  properties  of  (fn(x))  from  those  of  G, 
which  “generates”  this  sequence  and  is  called  a 
generating  function  of  the  sequence. 


(a)  Legendre  polynomials.  Show  that 


(12)  G(u,  x)  = 1 = = 2 Pn{x)un 

VI  — 2xu  + uz  n=0 


is  a generating  function  of  the  Legendre  polynomials. 
Hint:  Start  from  the  binomial  expansion  of  1/ Vl  — v, 
then  set  v = 2 xu  — u2,  multiply  the  powers  of  2 xu  — u2 
out,  collect  all  the  terms  involving  un,  and  verify  that 
the  sum  of  these  terms  is  Pn{x)un. 

(b)  Potential  theory.  Let  A1  and  /t2  be  two  points  in 
space  (Fig.  108,  r2  > 0).  Using  (12),  show  that 

1 

’ V r i + r 2 ~ 2r\r2  cos  6 

oo  / \m 

m = 0 V 7 

This  formula  has  applications  in  potential  theory.  (Q/r 
is  the  electrostatic  potential  at  A2  due  to  a charge  Q 
located  at  And  the  series  expresses  1/r  in  terms  of 
the  distances  of  A±  and  A2  from  any  origin  O and  the 
angle  6 between  the  segments  OA\  and  OA2.) 


(c)  Further  applications  of  (12).  Show  that 
pn(  1)  = 1,PB(-1)  = (-l)K,F2n  + 1(0)  = 0,  and 
P2n( 0)  = (- 1) n ■ 1 • 3 ■ • ■ (2 n - 1 )/ [2  • 4 • • • (2n)|. 


11-15 


FURTHER  FORMULAS 


11.  ODE.  Find  a solution  of  (a2  — x2)y"  — 2xy'  + 
n(n  + l)y  = 0,  a A 0,  by  reduction  to  the  Legendre 
equation. 


12.  Rodrigues’s  formula  (13)2  Applying  the  binomial 
theorem  to  (x2  — l)n,  differentiating  it  n times  term 
by  term,  and  comparing  the  result  with  (11),  show  that 


(13) 


1 dU  2 

Pn(x)  = [(x2  - 

n 2 n\  dxn 


Dn]. 


2OLINDE  RODRIGUES  (1794—1851),  French  mathematician  and  economist. 
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13.  Rodrigues’s  formula.  Obtain  (ll')  from  (13). 

14.  Bonnet’s  recursion.3  Differentiating  (13)  with 
respect  to  u,  using  (13)  in  the  resulting  formula,  and 
comparing  coefficients  of  un,  obtain  the  Bonnet 
recursion. 

(14)  ( n + ])Pn+i(x)  = (In  + 1 )xPn(x)  - npn-x(x), 

where  n = 1,  2,  ■ ■ ■ . This  formula  is  useful  for  com- 
putations, the  loss  of  significant  digits  being  small 
(except  near  zeros).  Try  (14)  out  for  a few  computations 
of  your  own  choice. 


15.  Associated  Legendre  functions  Pkn  (x)  are  needed,  e.g., 
in  quantum  physics.  They  are  defined  by 

k „ k/9  dkPn(x) 

(15)  Pk(x)  = ( \-x2f'2 — 

dxk 

and  are  solutions  of  the  ODE 

(16)  (1  — x2)y"  — 2 xy'  + q(x)y  = 0 

where  q(x)  = n(n  + 1)  — k2/(  1 — x2).  Find  P\(x), 
P\(x),  P |(x),  and  P\(x)  and  verify  that  they  satisfy  (16). 


5.3  Extended  Power  Series  Method: 
Frobenius  Method 


Several  second-order  ODEs  of  considerable  practical  importance — the  famous  Bessel 
equation  among  them — have  coefficients  that  are  not  analytic  (definition  in  Sec.  5.1),  but 
are  “not  too  bad,”  so  that  these  ODEs  can  still  be  solved  by  series  (power  series  times  a 
logarithm  or  times  a fractional  power  of  x,  etc.).  Indeed,  the  following  theorem  permits 
an  extension  of  the  power  series  method.  The  new  method  is  called  the  Frobenius 
method.4  Both  methods,  that  is,  the  power  series  method  and  the  Frobenius  method,  have 
gained  in  significance  due  to  the  use  of  software  in  actual  calculations. 


THEOREM  1 


Frobenius  Method 

Let  b(x)  and  c(x)  be  any  functions  that  are  analytic  at  x = 0.  Then  the  ODE 


(1) 


b(x)  c(x) 

+ y + 2 y = 0 

X X 


has  at  least  one  solution  that  can  be  represented  in  the  form 

oo 

(2)  y(x)  = x'  2 OmXm  = xr(ao  + a±x  + a2X2  + • • • ) (a0  ^ 0) 

m = 0 

where  the  exponent  r may  be  any  ( real  or  complex)  number  {and  r is  chosen  so  that 

ao  =£  0). 

The  ODE  (1)  also  has  a second  solution  {such  that  these  two  solutions  are  linearly 
independent)  that  may  be  similar  to  (2)  {with  a different  r and  different  coefficients) 
or  may  contain  a logarithmic  term.  (Details  in  Theorem  2 below.) 


3OSSIAN  BONNET  (1819-1892),  French  mathematician,  whose  main  work  was  in  differential  geometry. 

4GEORG  FROBENIUS  (1849-1917),  German  mathematician,  professor  at  ETH  Zurich  and  University  of  Berlin, 
student  of  Karl  Weierstrass  (see  footnote,  Sect.  15.5).  He  is  also  known  for  his  work  on  matrices  and  in  group  theory. 

In  this  theorem  we  may  replace  x by  x — a0  with  any  number  x0.  The  condition  a0  ^ 0 is  no  restriction;  it 
simply  means  that  we  factor  out  the  highest  possible  power  of  x. 

The  singular  point  of  (1)  at  x = 0 is  often  called  a regular  singular  point,  a term  confusing  to  the  student, 
which  we  shall  not  use. 
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For  example,  Bessel’s  equation  (to  be  discussed  in  the  next  section) 


n 


y 


+ -y'  + 


(v  a parameter) 


is  of  the  form  (1)  with  b(x)  = 1 and  c(x)  = xz  — v2  analytic  at  x = 0,  so  that  the  theorem 
applies.  This  ODE  could  not  be  handled  in  full  generality  by  the  power  series  method. 

Similarly,  the  so-called  hypergeometric  differential  equation  (see  Problem  Set  5.3)  also 
requires  the  Frobenius  method. 

The  point  is  that  in  (2)  we  have  a power  series  times  a single  power  of  x whose  exponent 
r is  not  restricted  to  be  a nonnegative  integer.  (The  latter  restriction  would  make  the  whole 
expression  a power  series,  by  definition;  see  Sec.  5.1.) 

The  proof  of  the  theorem  requires  advanced  methods  of  complex  analysis  and  can  be 
found  in  Ref.  [All]  listed  in  App.  1. 


Regular  and  Singular  Points.  The  following  terms  are  practical  and  commonly  used. 
A regular  point  of  the  ODE 


y"  + p(x)y'  + q(x)y  = 0 


is  a point  x0  at  which  the  coefficients  p and  q are  analytic.  Similarly,  a regular  point  of 
the  ODE 


h{x)y"  + p(x)y'(x)  + q(x)y  = 0 

is  an  x0  at  which  h,  p,  q are  analytic  and  h(x o)  A 0 (so  what  we  can  divide  by  h and  get 
the  previous  standard  form).  Then  the  power  series  method  can  be  applied.  If  x0  is  not  a 
regular  point,  it  is  called  a singular  point. 

Indicial  Equation,  Indicating  the  Form  of  Solutions 

We  shall  now  explain  the  Frobenius  method  for  solving  (1).  Multiplication  of  (1)  by  x2 
gives  the  more  convenient  form 

(l/)  x2y"  + xb(x)y'  + c(x)y  = 0. 

We  first  expand  b(x ) and  c(x)  in  power  series, 

b(x)  = bo  + b\X  + /?2-r2  + • • • , c(x ) = co  + c\X  + C2X2  + • ■ • 

or  we  do  nothing  if  b(x)  and  c(x)  are  polynomials.  Then  we  differentiate  (2)  term  by  term, 
finding 

oo 

y'(x)  = 2 (,n  + f)amxm+r~1  = xr-1[rao  + (r  + l)flix  + • • • ] 

m = 0 

oo 

(2*)  y"(x)  = 2 (m  + r)(m  + r — 1 )amxm+r~2 

m = 0 

= x7~2[r(r  — l)ao  + (r  + 1 )ra\X  + • • • ]. 
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By  inserting  all  these  series  into  (I ')  we  obtain 

xr[r(r  — l)ao  + ■ ■ • ] + (bo  + b-px  + ■ • • )xr(rao  + • ■ ■ ) 

(3) 

+ (co  + c\X  + ■ ■ - )x  (ao  + ci\x  + •••)  = 0. 

We  now  equate  the  sum  of  the  coefficients  of  each  power  xr,  xr+1,  xr+z,  • • • to  zero.  This 
yields  a system  of  equations  involving  the  unknown  coefficients  am.  The  smallest  power 
is  xr  and  the  corresponding  equation  is 

I r(j  - 1)  + b0r  + c0]flo  = 0. 

Since  by  assumption  ao  ¥=  0,  the  expression  in  the  brackets  [ • • • ] must  be  zero.  This 
gives 

(4)  r(r  - 1)  + b0r  + c0  = 0. 

This  important  quadratic  equation  is  called  the  indicial  equation  of  the  ODE  (1).  Its  role 
is  as  follows. 

The  Frobenius  method  yields  a basis  of  solutions.  One  of  the  two  solutions  will  always 
be  of  the  form  (2),  where  r is  a root  of  (4).  The  other  solution  will  be  of  a form  indicated 
by  the  indicial  equation.  There  are  three  cases: 

Case  1.  Distinct  roots  not  differing  by  an  integer  1,  2,  3,  • • • . 

Case  2.  A double  root. 

Case  3.  Roots  differing  by  an  integer  1,  2,  3,  ■ ■ • . 

Cases  1 and  2 are  not  unexpected  because  of  the  Euler-Cauchy  equation  (Sec.  2.5),  the 
simplest  ODE  of  the  form  (1).  Case  1 includes  complex  conjugate  roots  ri  and  r2  = rq 
because  ri  — r2  = At  — ri  = 2 i Im  /q  is  imaginary,  so  it  cannot  be  a real  integer.  The 
form  of  a basis  will  be  given  in  Theorem  2 (which  is  proved  in  App.  4),  without  a general 
theory  of  convergence,  but  convergence  of  the  occurring  series  can  be  tested  in  each 
individual  case  as  usual.  Note  that  in  Case  2 we  must  have  a logarithm,  whereas  in  Case  3 
we  may  or  may  not. 


THEOREM  2 


Frobenius  Method.  Basis  of  Solutions.  Three  Cases 

Suppose  that  the  ODE  (1)  satisfies  the  assumptions  in  Theorem  1.  Let  /q  and  r2  be 
the  roots  of  the  indicial  equation  (4).  Then  we  have  the  following  three  cases. 

Case  1.  Distinct  Roots  Not  Differing  by  an  Integer.  A basis  is 

(5)  yi(x)  = xTl(a0  + aix  + a2x2  + • ■ • ) 
and 

(6)  y2(x)  = xr2(A0  + Apx  + A2x2  + ■ ■ ■ ) 

with  coefficients  obtained  successively  from  (3)  with  r = ;q  and  r = r2,  respectively. 
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EXAMPLE  1 


EXAMPLE  2 


Case  2.  Double  Root  r1  = r2  = r.  A basis  is 

(7)  yi(x)  = xr(a0  + a^x  + a2x2  + • ■ ■)  [r  = |(1  - b0)] 
(of  the  same  general  form  as  before ) and 

(8)  yz(x)  = yi(x)  In  x + xr(Aix  + A2x2  + • • •)  (x  > 0). 

Case  3.  Roots  Differing  by  an  Integer.  A basis  is 

(9)  yi(x)  = xTl(ao  + a\X  + a2x2  + • • •) 

(of  the  same  general  form  as  before ) and 

(10)  y2(X)  = kyi(x)  In  x + xr2(A0  + Ai x + A2x2  + • • • ), 

where  the  roots  are  so  denoted  that  r\  — r2  > 0 and  k may  turn  out  to  be  zero. 


Typical  Applications 

Technically,  the  Frobenius  method  is  similar  to  the  power  series  method,  once  the  roots 
of  the  indicial  equation  have  been  determined.  However,  (5)-(10)  merely  indicate  the 
general  form  of  a basis,  and  a second  solution  can  often  be  obtained  more  rapidly  by 
reduction  of  order  (Sec.  2.1). 

Euler-Cauchy  Equation,  Illustrating  Cases  1 and  2 and  Case  3 without  a Logarithm 

For  the  Euler-Cauchy  equation  (Sec.  2.5) 

x2y " + boxy ' + Coy  = 0 (bo,  Cq  constant) 

substitution  of  y = xr  gives  the  auxiliary  equation 

r(r  — 1)  + b0r  + c0  = 0, 

which  is  the  indicial  equation  [and  y = xr  is  a very  special  form  of  (2)!].  For  different  roots  #*i,  r 2 we  get  a basis 
yi  = xr\  y2  — xr 2,  and  for  a double  root  r we  get  a basis  xr,  xr  In  x.  Accordingly,  for  this  simple  ODE,  Case  3 
plays  no  extra  role. 

Illustration  of  Case  2 (Double  Root) 

Solve  the  ODE 

(11)  x(x  - \)y"  + (3*  - 1)/  + y = 0. 

(This  is  a special  hypergeometric  equation,  as  we  shall  see  in  the  problem  set.) 

Solution.  Writing  (1 1)  in  the  standard  form  (1),  we  see  that  it  satisfies  the  assumptions  in  Theorem  1.  [What 
are  b(x)  and  c(x)  in  (11)?]  By  inserting  (2)  and  its  derivatives  (2*)  into  (11)  we  obtain 

2 (m  + r)(m  + r ~ 1 )arnxrn+r  — 2 (m  + r)(m  + r — 1 )amxm+r~1 
m = 0 m = 0 

+ 3^  (m  + r)amxm+T  - 2 (m  + f )amxm+r~1  + 2 amxm+r  = 0. 

m = 0 m = 0 m=0 
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EXAMPLE  3 


The  smallest  power  is  xr  1,  occurring  in  the  second  and  the  fourth  series;  by  equating  the  sum  of  its  coefficients 
to  zero  we  have 


[— r(r  — 1)  — r\a0  = 0,  thus  r2  = 0. 

Hence  this  indicial  equation  has  the  double  root  r = 0. 

First  Solution.  We  insert  this  value  r — 0 into  (12)  and  equate  the  sum  of  the  coefficients  of  the  power 
xs  to  zero,  obtaining 


s(s  — \)as  — (s  + l)stfs+i  + 3 sas  — (s  + l)tfs+i  + as  = 0 
thus  as  + i = as.  Hence  = ai  = a2  ~ ‘ and  by  choosing  aq  = 1 we  obtain  the  solution 

.VlW  = 2 -vm  = 7^ — (Ul  < 1). 

m = 0 l~x 

Second  Solution.  We  get  a second  independent  solution  y2  by  the  method  of  reduction  of  order  (Sec.  2.1), 
substituting  y2  — uy\  and  its  derivatives  into  the  equation.  This  leads  to  (9),  Sec.  2.1,  which  we  shall  use  in 
this  example,  instead  of  starting  reduction  of  order  from  scratch  (as  we  shall  do  in  the  next  example).  In  (9)  of 
Sec.  2.1  we  have  p = (3x  — l)/(x  — x),  the  coefficient  of  y'  in  (11)  in  standard  form.  By  partial  fractions, 


3x  — 1 

pdx=~\W^T)dx  = 


2 1\ 

— 7 H — )dx  = —2  In  (x  — 1)  — In  x. 
x — 1 x J 


Hence  (9),  Sec.  2.1,  becomes 


u'  = U = yTze~fpdx 


C * ~ l)2  _ 1 

(jc  — 1 )2jc  x’ 


In  x 

u = In  x,  )2  = uyi  = . 

1 — x 


y±  and  y2  are  shown  in  Fig.  109.  These  functions  are  linearly  independent  and  thus  form  a basis  on  the  interval 
0 < x < 1 (as  well  as  on  1 < x < “). 


Fig.  109.  Solutions  in  Example  2 


Case  3,  Second  Solution  with  Logarithmic  Term 

Solve  the  ODE 

(13)  (x2  — x)y"  — xy'  + y = 0. 

Solution.  Substituting  (2)  and  (2*)  into  (13),  we  have 

(x2  - x)  2 (m  + r)(m  + r - l)amxm+r~2  - x 2 (m  + r)amxm+r~1  + 2 amxm+r  = 0. 
m= 0 ra= 0 m= 0 
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We  now  take  xz,  x,  and  a inside  the  summations  and  collect  all  terms  with  power  xm+r  and  simplify  algebraically, 


^ ('»  + r — 1 )2  amxm+r  — 2 (m  + rXm  + r — l)amxm+r  1 = 0. 

m= 0 m= 0 


In  the  first  series  we  set  m = s and  in  the  second  m = s + 1 , thus  s = m — 1 . Then 


(14)  2(-s  + r ~ D\xs+r  ~ 2 + r + l)^  + r)as+1^s+r  = 0. 

s=0  s=  — 1 

The  lowest  power  is  ^:r_1  (take  s = —1  in  the  second  series)  and  gives  the  indicial  equation 

r(r  - 1)  = 0. 

The  roots  are  r\  = 1 and  r2  — 0.  They  differ  by  an  integer.  This  is  Case  3. 

First  Solution.  From  (14)  with  r = = 1 we  have 


^[s2as  - (s  + 2)(j  + l)as+i]xs+1  = 0. 

s=0 


This  gives  the  recurrence  relation 


as+i  = 


(s  + 2)(j  + 1) 


(s  = 0,  !,•••)• 


Hence  a\  = 0,  = " * successively.  Taking  — 1,  we  get  as  a first  solution  yi  = xTiciq  — x. 

Second  Solution.  Applying  reduction  of  order  (Sec.  2.1),  we  substitute  y 2 = y\u  = xu,y2  — xu  + u and 
y2  — xu  + 2 u'  into  the  ODE,  obtaining 

(x2  — x){xu"  + 2 u)  — x{xu  + u)  + xu  = 0. 

xu  drops  out.  Division  by  x and  simplification  give 

(X2  — x)u"  + (x  — 2 )u  = 0. 

From  this,  using  partial  fractions  and  integrating  (taking  the  integration  constant  zero),  we  get 


u" 

1 

u 


In  u = In 


x - 1 


Taking  exponents  and  integrating  (again  taking  the  integration  constant  zero),  we  obtain 


, x - 1 1 1 1 

u = — 2 — = o-  u = In  x H — , y2  = xu  = x\n  x + \ . 

xxx  x 


yi  and  72  are  linearly  independent,  and  y2  has  a logarithmic  term.  Hence  yi  and  y2  constitute  a basis  of  solutions 
for  positive  x. 


The  Frobenius  method  solves  the  hypergeometric  equation,  whose  solutions  include 
many  known  functions  as  special  cases  (see  the  problem  set).  In  the  next  section  we  use 
the  method  for  solving  Bessel’s  equation. 
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1.  WRITING  PROJECT.  Power  Series  Method  and 
Frobenius  Method.  Write  a report  of  2-3  pages 
explaining  the  difference  between  the  two  methods.  No 
proofs.  Give  simple  examples  of  your  own. 


2-13 


FROBENIUS  METHOD 


Find  a basis  of  solutions  by  the  Frobenius  method.  Try  to 
identify  the  series  as  expansions  of  known  functions.  Show 
the  details  of  your  work. 

2.  (x  + 2 fy"  + (x  + 2)y'  - y = 0 

3.  xy"  + 2y  + xy  = 0 


4.  xy"  + y = 0 

5.  xy"  + (2x  + \)y'  + (x  + l)y  = 0 

6.  xy"  + 2xay'  + (x2  — 2)y  = 0 

7.  y"  + (x-  I )y  = 0 

8.  xy"  + y'  — xy  = 0 

9.  2x(x  - \)y"  - (x  + l)y'  + y = 0 

10.  xy"  + 2y  + 4.ry  = 0 

11.  xy"  + (2  — 2x)y'  + (x  — 2)y  = 0 

12.  x2y"  + 6xy'  + (4x2  + 6)y  = 0 

13.  xy"  + (1  — 2x)y'  + (x  — l)y  = 0 

14.  TEAM  PROJECT.  Hypergeometric  Equation,  Series, 
and  Function.  Gauss’s  hypergeometric  ODE5  is 


(15)  x{l  — x)y"  + [c  — (a  + b + l)x]y'  — aby  = 0. 


Here,  a,  b,  c are  constants.  This  ODE  is  of  the  form 
p2y"  + Piy'  + Poy  — 0,  where p2,  pj,  p0  are  polyno- 
mials of  degree  2,  1,0,  respectively.  These  polynomials 
are  written  so  that  the  series  solution  takes  a most  prac- 
tical form,  namely, 


(16) 


ab  a(a  + 1 )b(b  + 1)  2 

ViU)  = 1 H x x 

71  1!  c 2!  c(c  + 1) 


a(a  + 1 )(a  + 2)b{b  + 1 ){b  + 2) 
3!  c(c  + l)(c  + 2) 


x3  + 


of  (15)  [see  the  small  sample  of  elementary  functions 
in  part  (c)].  This  accounts  for  the  importance  of  (15). 

(a)  Hypergeometric  series  and  function.  Show  that 
the  indicial  equation  of  (15)  has  the  roots  = 0 and 
t'2  — 1 ~ c.  Show  that  for  ri  = 0 the  Frobenius 
method  gives  (16).  Motivate  the  name  for  (16)  by 
showing  that 

F(l,  1,  1;  x)  = F(l,b,  b\x)  = F(a.  1 ,a\x)  = — — . 

1 - x 

(b)  Convergence.  For  what  a or  h will  ( 1 6)  reduce  to 
a polynomial?  Show  that  for  any  other  a,  b,  c 
(c  =A  0,  —1,  — 2,  ■ ■ ■ ) the  series  (16)  converges  when 
\x\  < 1. 

(c)  Special  cases.  Show  that 

(1  + x)n  = F(—n,  b,  b\  —x), 

(1  — x)n  = 1 — nxF(  1 — n,  1,  2;  x), 
arctanx  = xF(g,  l,f;  — xz) 
arcsin  x = xr  (j , g , 2 1 x )• 

In  (1  + x)  = xF(\ , 1,  2;  -x), 

In  j ^ ^ = 2xF(\ , 1 , | ; x2). 


Find  more  such  relations  from  the  literature  on  special 
functions,  for  instance,  from  [GenRefl]  in  App.  1. 

(d)  Second  solution.  Show  that  for  r2  = 1 — c the 
Frobenius  method  yields  the  following  solution  (where 
c A 2,  3,4,  - -): 


y2(x)  = x1  c[  1 + 


(17) 


(a  — c + 1 ){b  — c + 1) 
1!  (-c  + 2) 


(a  — c + 1 )(a  — c + 2 ){b  — c + 1 ){b  — c + 2) 
2!  ( c + 2)(—c  + 3) 


Show  that 

y2{x)  = x1~cF(a  — c + 1 ,b  — c + 1,  2 — c;  x). 


This  series  is  called  the  hypergeometric  series.  Its  sum 
yi(x)  is  called  the  hypergeometric  function  and  is 
denoted  by  F(a,  b,  c;  x).  Here,  c # 0,-1,  —2,  ■ ■ ■ . By 
choosing  specific  values  of  a,  b , c we  can  obtain  an 
incredibly  large  number  of  special  functions  as  solutions 


(e)  On  the  generality  of  the  hypergeometric  equation. 

Show  that 

(18)  ( t 2 + At  + B)y  + (Ct  + D)y  + Ky  = 0 


5CARL  FRIEDRICH  GAUSS  (1777-1855),  great  German  mathematician.  He  already  made  the  first  of  his  great 
discoveries  as  a student  at  Helmstedt  and  Gottingen.  In  1 807  he  became  a professor  and  director  of  the  Observatory 
at  Gottingen.  His  work  was  of  basic  importance  in  algebra,  number  theory,  differential  equations,  differential 
geometry,  non-Euclidean  geometry,  complex  analysis,  numeric  analysis,  astronomy,  geodesy,  electromagnetism, 
and  theoretical  mechanics.  He  also  paved  the  way  for  a general  and  systematic  use  of  complex  numbers. 
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with  y = dy/dt,  etc.,  constant  A,  B,  C , D , K,  and  t2  + 
At  + B = (f  — ?i)(f  — t2),  1 1 ^ can  be  reduced  to 
the  hypergeometric  equation  with  independent  variable 


15-20 


HYPERGEOMETRIC  ODE 


Find  a general  solution  in  terms  of  hypergeometric 
functions. 


t ~ 1 1 
h ~ h 


and  parameters  related  by  Ct\  + D = — c(t2  ~ fi)> 
C — a + b + 1,  K = ah.  From  this  you  see  that  (15) 
is  a “normalized  form”  of  the  more  general  (18)  and 
that  various  cases  of  (18)  can  thus  be  solved  in  terms 
of  hypergeometric  functions. 


15.  2x{\  — x)y"  — (1  + 6.^)v,  — 2y  = 0 

16.  x(l  — x)y"  + (|  + 2x)y'  — 2y  — 0 

17.  4jc(1  - x)y"  + y + 8y  = 0 

18.  4 (f2  - 3/  + 2)y  - 2y  I y = 0 

19.  2(t2  -5 1 + 6)y  + (2t  - 3 )y  - 8y  = 0 

20.  3f(l  + t)y  + ty  — y = 0 


5^  Bessels  Equation.  Bessel  Functions 7v(x) 

One  of  the  most  important  ODEs  in  applied  mathematics  in  Bessel’s  equation,6 
(1)  x2y"  + xy'  + ( xz  — v2)y  = 0 

where  the  parameter  v (nu)  is  a given  real  number  which  is  positive  or  zero.  Bessel’s 
equation  often  appears  if  a problem  shows  cylindrical  symmetry,  for  example,  as  the 
membranes  in  Sec.  12.9.  The  equation  satisfies  the  assumptions  of  Theorem  1.  To  see  this, 
divide  (1)  by  x 2 to  get  the  standard  form y"  + y'/x  + (1  — v2/x2)y  = 0.  Hence,  according 
to  the  Frobenius  theory,  it  has  a solution  of  the  form 


(2) 


oo 

}’{X)  = 2 am,Xm+r  («0  + 0). 

m =0 


Substituting  (2)  and  its  first  and  second  derivatives  into  Bessel’s  equation,  we  obtain 


2 ( tn  + r)(m  + r — 1 )amxm+7  + 2 (m  + f)aTOxm+r 

m=0  m= 0 


+ 2 amxm+r+2  - v2  2 amxm+r  = 0. 

m= 0 m=0 

We  equate  the  sum  of  the  coefficients  of  xs+ 7 to  zero.  Note  that  this  power  xs+r 
corresponds  to  m = s in  the  first,  second,  and  fourth  series,  and  to  m = s — 2 in  the  third 
series.  Hence  for  s = 0 and  s = 1,  the  third  series  does  not  contribute  since  m ^ 0. 


6FRIEDRICH  WILHELM  BESSEL  (1784-1846),  German  astronomer  and  mathematician,  studied  astronomy 
on  his  own  in  his  spare  time  as  an  apprentice  of  a trade  company  and  finally  became  director  of  the  new  Konigsberg 
Observatory. 

Formulas  on  Bessel  functions  are  contained  in  Ref.  [GenReflO]  and  the  standard  treatise  [A  13]. 
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For  s = 2,  3,  ■ • • all  four  series  contribute,  so  that  we  get  a general  formula  for  all  these  s. 
We  find 

(a)  r(r  — I )«0  4-  ra0  — v2a0  = 0 (s  = 0) 

(3)  (b)  ( r + 1 )ra1  + (r  + 1)0!  — v2a1  = 0 (s  = 1) 

(c)  (i  + r)(s  + r — l)as  + (s  + r)as  + as_ 2 — v2as  = 0 (s  = 2,  3,  ■ ■ • 

From  (3  a)  we  obtain  the  indicial  equation  by  dropping  flo, 

(4)  (r  + v)(r  - v)  = 0. 

The  roots  are  74  = v (=5  0)  and  r2  = — v. 

Coefficient  Recursion  for  r = r1  = v.  For  r = v,  Eq.  (3b)  reduces  to  (2v  + 1 )a1  = 0. 
Hence  a±  = 0 since  v g 0.  Substituting  r = v in  (3c)  and  combining  the  three  terms 
containing  as  gives  simply 

(5)  ( s + 2v)sas  + as_2  = 0. 

Since  a\  = 0 and  v g 0,  it  follows  from  (5)  that  a->,  = 0,  t/5  = 0,  • ■ - . Hence  we  have  to 
deal  only  with  even-numbered  coefficients  as  with  s = 2m.  For  ,v  = 2m,  Eq.  (5)  becomes 


(2m  + 2v)2ma2rn  + a2m_2  = 0. 


Solving  for  «2m  gives  the  recursion  formula 


(6) 


1 

a2m  — 9 a2m—2i 

2zm(v  + m) 


m = 1 , 2,  • • • . 


From  (6)  we  can  now  determine  a2,  £4,  ■ ■ • successively.  This  gives 


a0 

a2  ~ ^ 

22(v  + 1) 

a2  _ a0 

2z2(v  + 2)  242!  (v  + l)(v  + 2) 


and  so  on,  and  in  general 


(7) 


a2m 


(-l)mfl0 

2Zmm\  (v  + l)(v  + 2)  ■ ■ ■ (v  + m)  ’ 


m = 1 , 2,  • • • . 


Bessel  Functions  Jn(x)  for  Integer  v = n 

Integer  values  of  v are  denoted  by  n.  This  is  standard.  For  v = n the  relation  (7)  becomes 

t 1 \7n 

(-1)  a0 

(8)  a2m  = — , m — 1 , 2,  • ■ ■ . 

2 2mra!  ( n + 1 ){n  + 2 )■••(«  + m) 
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EXAMPLE  1 


a()  is  still  arbitrary,  so  that  the  series  (2)  with  these  coefficients  would  contain  this  arbitrary 
factor  «().  This  would  be  a highly  impractical  situation  for  developing  formulas  or 
computing  values  of  this  new  function.  Accordingly,  we  have  to  make  a choice.  The  choice 
«o  = 1 would  be  possible.  A simpler  series  (2)  could  be  obtained  if  we  could  absorb  the 
growing  product  ( n + 1 )(n  + 2 )•••(«  + in)  into  a factorial  function  (n  + m)\  What 
should  be  our  choice?  Our  choice  should  be 


(9) 


«o  = 


1 


because  then  n\  (n  + 1)  •••(«  + m)  = (n  + m) ! in  (8),  so  that  (8)  simply  becomes 


(10) 


^2  m 


(-i  r 

22m+nm\  (n  + m) ! 


m = 1 , 2,  • • ■ . 


By  inserting  these  coefficients  into  (2)  and  remembering  that  C\  = 0,  C3  = 0,  • ■ • we  obtain 
a particular  solution  of  Bessel’s  equation  that  is  denoted  by  Jn(x): 


(ID 


Jn(x)  = xn  2 


(-l)mx2m 


-)2m+n. 


m = 0 


m\  ( n + m)\ 


(n  g 0). 


Jn{x)  is  called  the  Bessel  function  of  the  first  kind  of  order  n.  The  series  (11)  converges 
for  all  x,  as  the  ratio  test  shows.  Hence  Jn(x)  is  defined  for  all  x.  The  series  converges 
very  rapidly  because  of  the  factorials  in  the  denominator. 


Bessel  Functions  J0[x)  and  7,(x) 

For  n = 0 we  obtain  from  ( 1 1 ) the  Bessel  function  of  order  0 


(12) 


Mx)  = 


" (—  \)mx2m 

m=0  1 \m-) 


22(1!)2  24(2!)z 


26(3!)2 


which  looks  similar  to  a cosine  (Fig.  110).  For  n = 1 we  obtain  the  Bessel  function  of  order  1 


(13) 


AW  = 2 


-o  22m+1«i!  (m  + 1)! 


+ - ■ 


which  looks  similar  to  a sine  (Fig.  110).  But  the  zeros  of  these  functions  are  not  completely  regularly  spaced 
(see  also  Table  A1  in  App.  5)  and  the  height  of  the  “waves”  decreases  with  increasing  x.  Heuristically,  n/x 2 
in  (1)  in  standard  form  [(1)  divided  by  x2\  is  zero  (if  n = 0)  or  small  in  absolute  value  for  large  x , and  so  is 
y / x , so  that  then  Bessel’s  equation  comes  close  to  y ' + y = 0,  the  equation  of  cos  x and  sin  x;  also  y /x  acts 
as  a “damping  term,”  in  part  responsible  for  the  decrease  in  height.  One  can  show  that  for  large  x, 


(14) 


2 / UTT  7 T 

AW~v/  — cos(r  - 2 ~4:  4 


where  ~ is  read  “asymptotically  equal”  and  means  that  for  fixed  n the  quotient  of  the  two  sides  approaches  1 


as  x 
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Formula  (14)  is  surprisingly  accurate  even  for  smaller  x(>0).  For  instance,  it  will  give  you  good  starting 
values  in  a computer  program  for  the  basic  task  of  computing  zeros.  For  example,  for  the  first  three  zeros  of  J0 
you  obtain  the  values  2.356  (2.405  exact  to  3 decimals,  error  0.049),  5.498  (5.520,  error  0.022),  8.639  (8.654, 
error  0.015),  etc.  I 


Bessel  Functions  Jv(x)  for  any  v 0.  Gamma  Function 

We  now  proceed  from  integer  v = n to  any  v it  0.  We  had  aft  = 1/(2 nn\)  in  (9).  So  we 
have  to  extend  the  factorial  function  n\  to  any  v =?  0.  For  this  we  choose 


(15) 


C'°  2T(v  + 1) 


with  the  gamma  function  F(v  + 1 ) defined  by 


(16) 


T(v  + 1)  = 


e ttv  dt 


(v>  - 1). 


(CAUTION!  Note  the  convention  v + 1 on  the  left  but  v in  the  integral.)  Integration 
by  parts  gives 


+ v 


Jo 


F(v  + 1)  = -e~ftv 
This  is  the  basic  functional  relation  of  the  gamma  function 
(17)  T(v  + 1)  = vT(v). 

Now  from  (16)  with  v = 0 and  then  by  (17)  we  obtain 


e ftv  1 dt  = 0 + vl». 


HD  = 


e t dt  = —e  t 


= 0 - (-1)  = 1 


and  then  T(2)  = 1 • T(l)  =1!,  T(3)  = 2T(1)  = 2!  and  in  general 


(18) 


T(n  + 1)  = n\ 


(n  = 0,  !,■•■). 
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THEOREM  1 


Hence  the  gamma  function  generalizes  the  factorial  function  to  arbitrary  positive  v. 

Thus  (15)  with  v = n agrees  with  (9). 

Furthermore,  from  (7)  with  a0  given  by  (15)  we  first  have 


t/t 2,YYl  o 

2 2mm\  (v  + l)(v  + 2)  • ■ • (v  + m)2T(v  + 1) 

Now  (17)  gives  (v  + l)T(v  + 1)  = T(v  + 2),  (v  + 2)T(v  + 2)  = T(v  + 3)  and  so  on, 
so  that 


(v  + l)(v  + 2)  • ' • (v  + 777)r(v  + 1)  = T(v  + 777  + 1 ). 


Hence  because  of  our  (standard!)  choice  (15)  of  a0  the  coefficients  (7)  are  simply 


(19) 


a2  m 


->2  m+v 


(~ir 

in\  T(v  + 777  + 1) 


With  these  coefficients  and  r=i'1  = vwe  get  from  (2)  a particular  solution  of  (1),  denoted 
by  Jfx)  and  given  by 


(20) 


Jfx)  = xv  ^ 


(- l)mx2m 


■>2  m+v 


m=0 


m\  T(v  + 777  + 1) 


Jfx ) is  called  the  Bessel  function  of  the  first  kind  of  order  v.  The  series  (20)  converges 
for  all  x,  as  one  can  verify  by  the  ratio  test. 

Discovery  of  Properties  from  Series 

Bessel  functions  are  a model  case  for  showing  how  to  discover  properties  and  relations  of 
functions  from  series  by  which  they  are  defined.  Bessel  functions  satisfy  an  incredibly  large 
number  of  relationships — look  at  Ref.  [A13]  in  App.  1 ; also,  find  out  what  your  CAS  knows. 
In  Theorem  3 we  shall  discuss  four  formulas  that  are  backbones  in  applications  and  theory. 


Derivatives,  Recursions 

The  derivative  of  Jfx)  with  respect  to  x can  be  expressed  by  Jv-fx)  or  Jv+fx)  by 

the  formulas 

(a) 

(21) 

[xvJfx)}'  = xvJv_fx) 

(b) 

[x~vJfx)]'  = -x~vJv+fx). 

Furthermore,  Jfx)  and  its  derivative  satisfy  the  recurrence  relations 

(c) 

(21) 

2v 

J V — 1 C^)  J V + 1 C^)  ‘AX-'O 

(d) 

Jv~fx)  - Jv+fx)  = 2 j'fx). 
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PROOF 


EXAMPLE  2 


(a)  We  multiply  (20)  by  xv  and  take  x2v  under  the  summation  sign.  Then  we  have 


= 2 


^ iyra^2m+2v 


->2  m+v 


m= 0 


ml  T(v  + m + 1) 


We  now  differentiate  this,  cancel  a factor  2,  pull  x2v~ 1 out,  and  use  the  functional 
relationship  T(v  + m + 1)  = (v  + m)T{y  + m ) [see  (17)].  Then  (20)  with  v — 1 instead 
of  v shows  that  we  obtain  the  right  side  of  (21a).  Indeed, 


(xvjvy 


“ (— l)m2(/77  + V)x2m+2V~1 

n=0  22m+vm!  T(v  + 771+1) 


oc  (-—1 'Imv2  m 

= tv1  y , . 

o2ra+v- 1 i t-'/  . \ 

m=0  2 77i!l(v  + 77i) 


(b)  Similarly,  we  multiply  (20)  by  x v,  so  that  xv  in  (20)  cancels.  Then  we  differentiate, 
cancel  2m,  and  use  ml  = m(m  — 1)!.  This  gives,  with  m = s + 1, 


(x  vjv)'  = y 


(-l)mx2m-i 


-)2m+v— 1 


m=  1 


(m  — 1)!  T(v  + 777  + I) 


= 2 


(-l)s+1x2s+1 


) 2s + v + 1 


s=0 


s'!  F(v  + 5 + 2) 


Equation  (20)  with  v + 1 instead  of  v and  s instead  of  m shows  that  the  expression  on 
the  right  is  —x~vJv+ i(x).  This  proves  (21b). 

(c),  (d)  We  perform  the  differentiation  in  (21a).  Then  we  do  the  same  in  (21b)  and 
multiply  the  result  on  both  sides  by  x2v . This  gives 

(a*)  vxv_1Jv  + xvfv  = xvJv- 1 
(b*)  —vxv_1Jv  + xvfv  = —xvJv+ 1- 

Substracting  (b*)  from  (a*)  and  dividing  the  result  by  xv  gives  (21c).  Adding  (a*)  and 
(b*)  and  dividing  the  result  by  xv  gives  (2 Id). 


Application  of  Theorem  1 in  Evaluation  and  Integration 

Formula  (21c)  can  be  used  recursively  in  the  form 


2v 

4+lW  = —Jv(x)  - Jv-l(x) 


for  calculating  Bessel  functions  of  higher  order  from  those  of  lower  order.  For  instance,  J2& ) = 2Ji(x)/x  — Jo(x), 
so  that  J2  can  be  obtained  from  tables  of  Jq  and  J\  (in  App.  5 or,  more  accurately,  in  Ref.  [GenRefl]  in  App.  1). 

To  illustrate  how  Theorem  1 helps  in  integration,  we  use  (21b)  with  v = 3 integrated  on  both  sides.  This 
evaluates,  for  instance,  the  integral 


/ = 


x 3J^(x)  dx  = —x  %(x) 


— 7/3(2)  + /3(1)- 

o 


A table  of  J3  (on  p.  398  of  Ref.  [GenRefl])  or  your  CAS  will  give  you 

• 0.128943  + 0.019563  = 0.003445. 


Your  CAS  (or  a human  computer  in  precomputer  times)  obtains  J3  from  (21),  first  using  (21c)  with  v = 2, 
that  is,  J%  = 4x_1J2  ~ Ji,  then  (21c)  with  v = 1,  that  is,  J2  = 2*-1/i  — Jq.  Together, 
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/ = *_3(4-*_1(2a;_17i  - J0)  - 7,) 


= 4127,(2)  - 270(2)  - 7,(2)]  + [87,(1) 


47 0( 1 ) - 7,(1)] 


= 47,(2)  + j70(2)  + 77,(1)  - 47„(1). 


This  is  what  you  get,  for  instance,  with  Maple  if  you  type  int(  • • • )•  And  if  you  type  evalf(int(  • • • )),  you  obtain 
0.003445448,  in  agreement  with  the  result  near  the  beginning  of  the  example. 


Bessel  Functions  Jv  with  Half-Integer  v Are  Elementary 

We  discover  this  remarkable  fact  as  another  property  obtained  from  the  series  (20)  and 
confirm  it  in  the  problem  set  by  using  Bessel’s  ODE. 

Elementary  Bessel  Functions Jv  with  v = ±|,  ±§,  ±|,-  ■ • . The  Value  T(^) 

We  first  prove  (Fig.  Ill) 


(22) 


(a)  Jy2(x) 


(b)  /_  1/2W 


cos  x. 


The  series  (20)  with  v = | is 

oo  ^ j ^ 2m  1 2 00  ( j ^ 2m + 1 

Jl/z(X)  = ^ 2 22m+i/2m!  r + |}  = y-  1 22m+1lll!  rtm  + !) 

m = 0 v m = 0 v z/ 

The  denominator  can  be  written  as  a product  AB,  where  (use  (16)  in  B ) 

A = 2mrn!  = 2m(2m  - 2)(2 m - 4)  • ■ ■ 4 • 2, 

B = 2m+1r(m  + 1)  = 2 m + \m  + \)(m  - |)  § ■ §r(|) 

= (2m  + l)(2m  — 1)  • • • 3 ■ 1 - Vtt; 


here  we  used  (proof  below) 


(23) 


T(!)  = Vtt. 


The  product  of  the  right  sides  of  A and  B can  be  written 

AB  = (2m  + l)2m(2m  - 1)  • • • 3 ■ 2 ■ 1 Vir  = (2m  + l)lVif. 


Hence 


Jy2(x) 


j- j yn^.2m+l 

(2m  + 1)! 


sin  .v. 


Fig.  111.  Bessel  functions  Jy2  and  7_,/2 
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This  proves  (22a).  Differentiation  and  the  use  of  (21a)  with  v = g now  gives 


[Vx/i/2(x)]' 


x1,2J-  y2(x). 


This  proves  (22b).  From  (22)  follow  further  formulas  successively  by  (21c),  used  as  in  Example  2. 

We  finally  prove  Tfe)  = V7T  by  a standard  trick  worth  remembering.  In  (15)  we  set  t = u.  Then 
dt  = 2 u du  and 


e_ff_1/2rfr  = 2 


du. 


We  square  on  both  sides,  write  v instead  of  u in  the  second  integral,  and  then  write  the  product  of  the  integrals 
as  a double  integral: 


= 4 


du 


dv  = 4 


~(lLZ  + VZ) 


du  dv. 


We  now  use  polar  coordinates  r,  6 by  setting  u = r cos  6,  v = r sin  6.  Then  the  element  of  area  is  du  dv  — r dr  dd 
and  we  have  to  integrate  over  r from  0 to  °°  and  over  6 from  0 to  77/2  (that  is,  over  the  first  quadrant  of  the 
mu -plane): 


By  taking  the  square  root  on  both  sides  we  obtain  (23). 


General  Solution.  Linear  Dependence 

For  a general  solution  of  Bessel’s  equation  (1)  in  addition  to  Jv  we  need  a second  linearly 
independent  solution.  For  v not  an  integer  this  is  easy.  Replacing  v by  — v in  (20),  we 
have 


(24) 


oo 

J-V(x)  = x~v  2 

771  = 0 


(-l)mx2m 

22m~vml  T{m  - v + 1)’ 


Since  Bessel’s  equation  involves  v2,  the  functions  Jv  and  are  solutions  of  the  equation 
for  the  same  v.  If  v is  not  an  integer,  they  are  linearly  independent,  because  the  first  terms 
in  (20)  and  in  (24)  are  finite  nonzero  multiples  of  xv  and  x~v.  Thus,  if  v is  not  an  integer, 
a general  solution  of  Bessel’s  equation  for  all  x =£  0 is 


y(x)  = cxJv(x)  + c2J-v(x ) 


This  cannot  be  the  general  solution  for  an  integer  v = n because,  in  that  case,  we  have 
linear  dependence.  It  can  be  seen  that  the  first  terms  in  (20)  and  (24)  are  finite  nonzero 
multiples  of  xv  and  x~v,  respectively.  This  means  that,  for  any  integer  v = n,  we  have 
linear  dependence  because 


(25) 


J-n(x)  = (“I  f Jn(x) 


(n  = 1,  2,  • ■ ■ ). 
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PROOF  To  prove  (25),  we  use  (24)  and  let  v approach  a positive  integer  n.  Then  the  gamma 
function  in  the  coefficients  of  the  first  n terms  becomes  infinite  (see  Fig.  553  in  App. 
A3.1),  the  coefficients  become  zero,  and  the  summation  starts  with  m = n.  Since  in 
this  case  T(m  — n + 1)  = (m  — n) ! by  (18),  we  obtain 


(26) 


J—n(x) 


2 


(-l)mx2m_Tl 
2Zm~nm\  (m  - n)\ 


2 

s=0 


{-\)n+sx2s+n 
2 Zs+n(n  + y)!  i! 


(m  = n + s ). 


The  last  series  represents  (—  1 )nJn(x),  as  you  can  see  from  (11)  with  m replaced  by  s.  This 
completes  the  proof.  ■ 


The  difficulty  caused  by  (25)  will  be  overcome  in  the  next  section  by  introducing  further 
Bessel  functions,  called  of  the  second  kind  and  denoted  by  Yv. 


1.  Convergence.  Show  that  the  series  (1 1)  converges  for 
all  x.  Why  is  the  convergence  very  rapid? 


2-10 


ODEs  REDUCIBLE  TO  BESSEL’S  ODE 


This  is  just  a sample  of  such  ODEs;  some  more  follow  in 
the  next  problem  set.  Find  a general  solution  in  terms  of  Jv 
and  J_v  or  indicate  when  this  is  not  possible.  Use  the 
indicated  substitutions.  Show  the  details  of  your  work. 

2.  x2y"  +xy'  + (x2  - 49)y  = 0 

3.  xy"  + y + \y  = 0 (Vx  = z) 

4.  y"  + (e-2*  - l)y  =0  (e~x  = z) 

5.  Two-parameter  ODE 

x2y"  + xy'  + (A2x2  — v2)y  = 0 (Ax  — z) 

6.  xzy"  + I (x  + 1)  y = 0 (y  = us/x,  Vx  = z) 

7.  x2y”  + xy'  + \ (x2  — l)y  = 0 (x  = 2z) 

8.  (2x  + l)2y"  + 2(2x  + l)v'  + 16x(x  + l)y  = 0 
(2x  + 1 = z) 


9.  xy"  + (2v  + l)y’  + xy  = 0 (y  = x vu) 

10.  x2y”  + (1  — 2v)xy’  + v2(x2v  + 1 — v2)y  = 0 
( v = xvu,  xv  = z) 

11.  CAS  EXPERIMENT.  Change  of  Coefficient.  Find 
and  graph  (on  common  axes)  the  solutions  of 


y"  + fcc_1y'  + y = 0,  y(0)  = 1,  y'(0)  = 0, 


for  k = 0,  1,  2,  ■ ■ • , 10  (or  as  far  as  you  get  useful 
graphs).  For  what  k do  you  get  elementary  functions? 
Why?  Try  for  noninteger  k,  particularly  between  0 and  2, 
to  see  the  continuous  change  of  the  curve.  Describe  the 
change  of  the  location  of  the  zeros  and  of  the  extrema  as 
k increases  from  0.  Can  you  interpret  the  ODE  as  a model 
in  mechanics,  thereby  explaining  your  observations? 

12.  CAS  EXPERIMENT.  Bessel  Functions  for  Large  x. 

(a)  Graph  Jn(x)  for  n = 0,  ■ ■ • , 5 on  common  axes. 


(b)  Experiment  with  (14)  for  integer  n.  Using  graphs, 
find  out  from  which  x = xn  on  the  curves  of  (11) 
and  (14)  practically  coincide.  How  does  x„  change 
with  n? 

(c)  What  happens  in  (b)  if  n = ±|?  (Our  usual  notation 
in  this  case  would  be  v.) 

(d)  How  does  the  error  of  (14)  behave  as  a func- 
tion of  x for  fixed  nl  [Error  = exact  value  minus 
approximation  (14).] 

(e)  Show  from  the  graphs  that  Jq(x)  has  extrema  where 
J\(x)  = 0.  Which  formula  proves  this?  Find  further 
relations  between  zeros  and  extrema. 
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ZEROS  of  Bessel  functions  play  a key  role  in 


modeling  (e.g.  of  vibrations;  see  Sec.  12.9). 


13.  Interlacing  of  zeros.  Using  (21)  and  Rolle’s  theorem, 
show  that  between  any  two  consecutive  positive  zeros 
of  Jn(x)  there  is  precisely  one  zero  of  Jn  + j(x). 

14.  Zeros.  Compute  the  first  four  positive  zeros  of  Jq(x) 
and  7i(x)  from  (14).  Determine  the  error  and  comment. 

15.  Interlacing  of  zeros.  Using  (21)  and  Rolle’s  theorem, 
show  that  between  any  two  consecutive  zeros  of  Jq(x) 
there  is  precisely  one  zero  of  J i(x). 
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HALF-INTEGER  PARAMETER:  APPROACH 
BY  THE  ODE 


16.  Elimination  of  first  derivative.  Show  that  y = uv 
with  u(x)  = exp  (— g f p(x)  dx ) gives  from  the  ODE 
y"  + p(x) y + q(x)y  = 0 the  ODE 

u"  + [q(x)  - | p(x)2  - |p'(x)]  u = 0, 


not  containing  the  first  derivative  of  u. 
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17.  Bessel’s  equation.  Show  that  for  (1)  the  substitution 
in  Prob.  16  is  y = ux  ~ 1 / 2 and  gives 

(27)  x2u"  + (x2  + \ — v2)u  = 0. 

18.  Elementary  Bessel  functions.  Derive  (22)  in  Example  3 
from  (27). 


APPLICATION  OF  (21):  DERIVATIVES, 
INTEGRALS 

Use  the  powerful  formulas  (21)  to  do  Probs.  19-25.  Show 
the  details  of  your  work. 

19.  Derivatives.  Show  that  Jq(x)  = —J j(x),  J\(x)  = 
J0(x)  - JJx)/x,  J'2{x)  = \\JJx)  - Ux)\. 

20.  Bessel’s  equation.  Derive  (1)  from  (21). 


21.  Basic  integral  formula.  Show  that 

jxvJv_1(x)  dx  = xvJv(x)  + c. 

22.  Basic  integral  formulas.  Show  that 

J .r-17v  + i(x)  dx  = — x~vJv{x)  + c, 

J Jv+i(x)  dx  = |iv_!(x)  dx  — 2 Jv(x). 

23.  Integration.  Show  that  fx2J0(x)dx  = x2Ji(x)  + 
xj0(x)  — f Jo(x)  dx.  (The  last  integral  is  nonelemen- 
tary; tables  exist,  e.g.,  in  Ref.  [A13]  in  App.  1.) 

24.  Integration.  Evaluate  ^x~1J^{x)  dx. 

25.  Integration.  Evaluate  J J^{x)  dx. 


Bessel  Functions  Yv[x).  General  Solution 

To  obtain  a general  solution  of  Bessel’s  equation  (1),  Sec.  5.4,  for  any  v,  we  now  introduce 
Bessel  functions  of  the  second  kind  Yv(x),  beginning  with  the  case  v = n = 0. 

When  n = 0,  Bessel’s  equation  can  be  written  (divide  by  x ) 

(1)  xy"  + y'  + xy  = 0. 

Then  the  indicial  equation  (4)  in  Sec.  5.4  has  a double  root  r = 0.  This  is  Case  2 in  Sec. 
5.3.  In  this  case  we  first  have  only  one  solution,  Jo(x).  From  (8)  in  Sec.  5.3  we  see  that 
the  desired  second  solution  must  be  of  the  form 

oo 

(2)  y2(x)  = J0(x)  In  x + 2 AmXm- 

m= 1 

We  substitute  y2  and  its  derivatives 

y'2  = Tolnx  + ^ inAmxm~ 1 

m=  1 

y2  = J'qIwx  + — 1 + 2 m(m  - 1 )Amxm~2 

X X , 

m=  1 

into  (1).  Then  the  sum  of  the  three  logarithmic  terms  xJq  In  x,  J'o  In  x,  and  xJ0  In  x is  zero 
because  J0  is  a solution  of  (1).  The  terms  —Jq/x  and  J0/x  (from  xy  and  y ) cancel.  Hence 
we  are  left  with 


2 Jo  + 2 m(m  - 1 )Amxm~1  + 2 mAmxm~1  + 2 Aw.xm+1  = 0. 

m=  1 m=  1 m= 1 
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Addition  of  the  first  and  second  series  gives  2m2ATOxm  . The  power  series  of  j'0(x ) is 
obtained  from  (12)  in  Sec.  5.4  and  the  use  of  ml/m  = (m  — 1)!  in  the  form 


j'o(x) 


» (— l)m2mx2m_1 
m=l  2 


(-l)mX2m“l 


-*2m— 1 


m=  1 


ml  (m  — 1)1 


Together  with  'Lm2Amxrn  1 and  2Amxm+1  this  gives 


(3*)  2 

m=  1 


(-\)mxZrn~1 


+ 2 m2Ar 


22m~2rnl  (m  - 1)1  “ ! 


+ 2 A 

m=  1 


7xm+1  = 0. 


First,  we  show  that  the  Am  with  odd  subscripts  are  all  zero.  The  power  x°  occurs  only  in 
the  second  series,  with  coefficient  Aj.  Hence  Ai  = 0.  Next,  we  consider  the  even  powers 
x2s . The  first  series  contains  none.  In  the  second  series,  m — 1 = 2s  gives  the  term 
(2 s + 1)  A2S+1X  • In  the  third  series,  m + 1 = 2s.  Hence  by  equating  the  sum  of  the 
coefficients  of  x2s  to  zero  we  have 


(2s  + 1)  A2s+i  + A2s_i  - 0,  s - 1,  2,  • 

Since  Ai  = 0,  we  thus  obtain  A3  = 0,  A5  = 0,  • ■ • , successively. 

We  now  equate  the  sum  of  the  coefficients  of  x2s+1  to  zero.  For  s = 0 this  gives 

l 


- 1 + 4A2  = 0, 


thus 


^2  — 4- 


For  the  other  values  of  s we  have  in  the  first  series  in  (3*)  2m  — 1 = 2s  + 1,  hence 
m = s 4-  1 , in  the  second  m — 1 = 2s  + 1 . and  in  the  third  in  + 1=2 s 4-  1 . We  thus  obtain 

xS+l 


(-i  r 


22s(s  + 1)!  s! 


+ (2s  + 2)  A2s+2  + A2s  — 0. 


For  s = 1 this  yields 
and  in  general 

(3) 


g + I6A4  + A2  — 0, 


(-D 


m—  1 


A2m  22m(m,)2 


thus 


1 1 


A4  — — 


128 


1 + - + - + 
2 3 


+ 


Using  the  short  notations 
(4)  hx  = 1 


hm  ~ 1 + 2 + 


+ 


771  = 1 , 2,  ' ' 


m = 2,  3, 


and  inserting  (4)  and  Ai  = A3  = • • • =0  into  (2),  we  obtain  the  result 

00 

y2(.\)  = Jq(x)  In  v + ^ 


(-1 

2 2m(rn!)2 


2m 


(5) 


j ( <.1  | 1 2 3 4 11  6 

+T ~ + 
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Since  J0  and  v2  are  linearly  independent  functions,  they  form  a basis  of  (1)  for  x > 0. 
Of  course,  another  basis  is  obtained  if  we  replace  y2  by  an  independent  particular  solution 
of  the  form  a(y2  + bJ0 ),  where  a (¥=  0)  and  b are  constants.  It  is  customary  to  choose 
a = 2/7 r and  b = y — In  2,  where  the  number  y = 0.57721566490  ■ ■ • is  the  so-called 
Euler  constant,  which  is  defined  as  the  limit  of 

1 + + • ■ • + In  s 

2 s 

as  s approaches  infinity.  The  standard  particular  solution  thus  obtained  is  called  the  Bessel 
function  of  the  second  kind  of  order  zero  (Fig.  112)  or  Neumann’s  function  of  order 
zero  and  is  denoted  by  Yq(x).  Thus  [see  (4)] 


(6) 


Yfx)  = — 

77 


70(x)  ( In  + y 


+ 2 


(-D" 


2m 


m= 1 


2 2m(rn!)2 


For  small  x > 0 the  function  K0(x)  behaves  about  like  In  x (see  Fig.  112,  why?),  and 
lo(x)  — > — 00  as  x — > 0. 

Bessel  Functions  of  the  Second  Kind  Yn(x) 

For  v = n = 1 , 2,  • ■ ■ a second  solution  can  be  obtained  by  manipulations  similar  to  those 
for  n = 0,  starting  from  (10),  Sec.  5.4.  It  turns  out  that  in  these  cases  the  solution  also 
contains  a logarithmic  term. 

The  situation  is  not  yet  completely  satisfactory,  because  the  second  solution  is  defined 
differently,  depending  on  whether  the  order  v is  an  integer  or  not.  To  provide  uniformity 
of  formalism,  it  is  desirable  to  adopt  a form  of  the  second  solution  that  is  valid  for  all 
values  of  the  order.  For  this  reason  we  introduce  a standard  second  solution  Yfx)  defined 
for  all  v by  the  formula 


(7) 


(a) 

(b) 


Yv(x)  = [Jv(x)  COS  V7T  - J-V(x)] 

sin  vtt 

Yn(x)  = lim  YJx). 

v— »n 


This  function  is  called  the  Bessel  function  of  the  second  kind  of  order  v or  Neumann’s 
function7  of  order  v.  Figure  112  shows  Y0(x)  and  Yf  x). 

Let  us  show  that  Jv  and  Yv  are  indeed  linearly  independent  for  all  v (and  x > 0). 

For  noninteger  order  v,  the  function  Yv(x)  is  evidently  a solution  of  Bessel’s  equation 
because  Jv(x)  and  J_v  (jt)  are  solutions  of  that  equation.  Since  for  those  v the  solutions 
Jv  and  J_v  are  linearly  independent  and  Yv  involves  the  functions  Jv  and  Yv  are 


7 CARL  NEUMANN  (1832-1925),  German  mathematician  and  physicist.  His  work  on  potential  theory  using 
integer  equation  methods  inspired  VITO  VOLTERRA  (1800-1940)  of  Rome,  ERIK  IVAR  FREDHOLM  (1866-1927) 
of  Stockholm,  and  DAVID  HILBERT  (1962-1943)  of  Gottingen  (see  the  footnote  in  Sec.  7.9)  to  develop  the  field 
of  integral  equations.  For  details  see  Birkhoff,  G.  and  E.  Kreyszig,  The  Establishment  of  Functional  Analysis,  Historia 
Mathematica  11  (1984),  pp.  258-321. 

The  solutions  l^(x)  are  sometimes  denoted  by  Nv(x);  in  Ref.  [A13]  they  are  called  Weber’s  functions;  Euler’s 
constant  in  (6)  is  often  denoted  by  C or  In  y. 
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Fig.  112.  Bessel  functions  of  the  second  kind  Vo  and  Yi. 

(For  a small  table,  see  App.  5.) 

linearly  independent.  Furthermore,  it  can  be  shown  that  the  limit  in  (7b)  exists  and  Yn 
is  a solution  of  Bessel’s  equation  for  integer  order;  see  Ref.  [A13]  in  App.  1.  We  shall 
see  that  the  series  development  of  Yn(x ) contains  a logarithmic  term.  Hence  Jn(x)  and 
Yn(x)  are  linearly  independent  solutions  of  Bessel’s  equation.  The  series  development 
of  Yn(x ) can  be  obtained  if  we  insert  the  series  (20)  in  Sec.  5.4  and  (2)  in  this  section 
for  Jv(x ) and  J-V(x)  into  (7a)  and  then  let  v approach  n;  for  details  see  Ref.  [A13].  The 
result  is 


(8) 


2 ( x \ xn  ^ (-l)m_1(/im  + hm+n)  2m 

M - - JM  (in  - + y)  + - 2 + -«2m 

N 7 m=0 

— 71  ^ — X , i \ i 

_ X (/7  - m - 1)!  2m 

rrr  2^  ^27)1  — 71^  I X 

77  1 ml 

m=0 


where  x > 0,  n = 0,  1,  • ■ • , and  [as  in  (4)]  h0  = 0,  hi  = 1, 


hm  ~ 1 +»+•••■* , hm+n  ~ 1 +-  + •••  H — . 

2 m 2 m + n 


For  n = 0 the  last  sum  in  (8)  is  to  be  replaced  by  0 [giving  agreement  with  (6)]. 
Furthermore,  it  can  be  shown  that 

Y-n(x)  = (-1  )nYn(x). 

Our  main  result  may  now  be  formulated  as  follows. 


THEOREM  1 


General  Solution  of  Bessel’s  Equation 

A general  solution  of  Bessel’s  equation  for  all  values  of  v (and  x > 0)  is 
(9)  y(v)  = CxJv(x)  + C2Yv(x). 


We  finally  mention  that  there  is  a practical  need  for  solutions  of  Bessel’s  equation  that 
are  complex  for  real  values  of  x.  For  this  purpose  the  solutions 

H^\x)  = Jv(x ) + iYv(x) 

H(?(x)  = Jv(x)  - iYv(x ) 


(10) 
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are  frequently  used.  These  linearly  independent  functions  are  called  Bessel  functions  of 
the  third  kind  of  order  v or  first  and  second  Hankel  functions8  of  order  v. 

This  finishes  our  discussion  on  Bessel  functions,  except  for  their  “orthogonality,”  which 
we  explain  in  Sec.  11.6.  Applications  to  vibrations  follow  in  Sec.  12.10. 


PRQBL  £MS  ET  5r5 


1-9 


FURTHER  ODE’s  REDUCIBLE 
TO  BESSEL’S  ODE 


Find  a general  solution  in  terms  of  Jv  and  Yv.  Indicate 

whether  you  could  also  use  /_v  instead  of  Yv.  Use  the 

indicated  substitution.  Show  the  details  of  your  work. 

1.  x1 2y"  + xy'  + ( x 2 — 16)  y = 0 

2.  xy"  + 5y'  + xy  = 0 (y  = u/x2) 

3.  9x2y"  + 9 xy'  + (36.x4 5  — 16)y  = 0 (x2  = z) 

4.  y"  + xy  = 0 {y  = mVx,  |x3^2  = z) 

5.  4xy"  + 4y'  + y = 0 (Vx  = z) 

6.  xy"  + y'  + 36y  = 0 (12Vx  = z) 

7.  y"  + k2x2y  = 0 (y  = uVx,  gfcx2  = z) 

8.  y"  + k2x\  = 0 (y  = u~\Jx,  \kx2  = z) 

9.  xy"  — 5y’  + xy  = 0 (y  = x3w) 

10.  CAS  EXPERIMENT.  Bessel  Functions  for  Large  x. 
It  can  be  shown  that  for  large  x, 


(11)  F„(x)  ~ V 2/ (7rx)  sin  (x  — \mr  — r) 


with  ~ defined  as  in  (14)  of  Sec.  5.4. 

(a)  Graph  Yn(x)  for  n — 0,  ■ ■ ■ , 5 on  common  axes.  Are 
there  relations  between  zeros  of  one  function  and 
extrema  of  another?  For  what  functions? 

(b)  Find  out  from  graphs  from  which  x = xn  on  the 
curves  of  (8)  and  (11)  (both  obtained  from  your  CAS) 
practically  coincide.  How  does  x„  change  with  n? 


(c)  Calculate  the  first  ten  zeros  xm,  m = 1,  ■ ■ ■ , 10,  of 
Y0(x)  from  your  CAS  and  from  (11).  How  does  the  error 
behave  as  m increases? 


(d)  Do  (c)  for  Fi(x)  and  Y^x).  How  do  the  errors 
compare  to  those  in  (c)? 
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HANKEL  AND  MODIFIED 
BESSEL  FUNCTIONS 


11.  Hankel  functions.  Show  that  the  Hankel  functions  (10) 
form  a basis  of  solutions  of  Bessel’s  equation  for  any  v. 

12.  Modified  Bessel  functions  of  the  first  kind  of  order 

v are  defined  by  Iv(x)  = i~vJv(ix),  i = V— 1.  Show 
that  lv  satisfies  the  ODE 

(12)  x2y”  + xy’  — (x2  + v2)y  = 0. 


13.  Modified  Bessel  functions.  Show  that  /v(x)  has  the 
representation 


(13) 


U,x)  = 


oo  2 m + v 

2 2m  + vm\  T(m  + v + 1)' 


14.  Reality  of  Show  that  /„(x)  is  real  for  all  real  x (and 
real  v),  7v(x)  # 0 for  all  real  x =A  0,  and  I-n(x)  = In(x), 
where  n is  any  integer. 

15.  Modified  Bessel  functions  of  the  third  kind  (sometimes 
called  of  the  second  kind ) are  defined  by  the  formula  (14) 
below.  Show  that  they  satisfy  the  ODE  (12). 


(14)  Kfx)  = 77  [/_ v(x)  - Iv(x)]. 

2 sin  vi T 


GHAFTER-5  REVIEWQU  E S T I O N S AND  PROBLEMS 


1.  Why  are  we  looking  for  power  series  solutions  of  ODEs? 

2.  What  is  the  difference  between  the  two  methods  in  this 
chapter?  Why  do  we  need  two  methods? 

3.  What  is  the  indicial  equation?  Why  is  it  needed? 

4.  List  the  three  cases  of  the  Frobenius  method,  and  give 
examples  of  your  own. 

5.  Write  down  the  most  important  ODEs  in  this  chapter 
from  memory. 


6.  Can  a power  series  solution  reduce  to  a polynomial? 
When?  Why  is  this  important? 

7.  What  is  the  hypergeometric  equation?  Where  does  the 
name  come  from? 

8.  List  some  properties  of  the  Legendre  polynomials. 

9.  Why  did  we  introduce  two  kinds  of  Bessel  functions? 

10.  Can  a Bessel  function  reduce  to  an  elementary  func- 
tion? When? 


8HERMANN  HANKEL  (1839-1873),  German  mathematician. 


Summary  of  Chapter  5 
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POWER  SERIES  METHOD 
OR  FROBENIUS  METHOD 


Find  a basis  of  solutions.  Try  to  identify  the  series  as 
expansions  of  known  functions.  Show  the  details  of  your 
work. 

11.  y"  + 4v  = 0 

12.  xy"  + (1  — 2 x)y'  + (x  — l)y  = 0 

13.  (x  - 1 fy"  - (x  - \)y'  - 35y  = 0 


14.  16(x  + l)2y"  + 3y  = 0 

15.  x2y"  + xy'  + (x2  - 5)y  = 0 

16.  x2y"  + 2x3  y + (x2  — 2)y  = 0 

17.  xy"  — (x  + \)y'  + y = 0 

18.  xy"  + 3y'  + 4x3v  = 0 

19.  y"  +^~y  = 0 

4x 

20.  xy"  + y'  — xy  = 0 


Series  Solution  of  ODEs.  Special  Functions 


The  power  series  method  gives  solutions  of  linear  ODEs 
(1)  y"  + p(x)y'  + q{x)y  = 0 

with  variable  coefficients  p and  q in  the  form  of  a power  series  (with  any  center  xq, 
e.g.,  x0  = 0) 


(2)  y(x)  = ^ am(x  - x0)m  = a0  + a^x  - x0)  + a2(x  - x0f  + ■ ■ • . 

m = 0 

Such  a solution  is  obtained  by  substituting  (2)  and  its  derivatives  into  (1).  This  gives 
a recurrence  formula  for  the  coefficients.  You  may  program  this  formula  (or  even 
obtain  and  graph  the  whole  solution)  on  your  CAS. 

If  p and  q are  analytic  at  xq  (that  is,  representable  by  a power  series  in  powers 
of  x - xq  with  positive  radius  of  convergence;  Sec.  5.1),  then  (1)  has  solutions  of 
this  form  (2).  The  same  holds  if  h,  p,  q in 

h(x)y"  + p(x)y'  + q(x)y  = 0 


are  analytic  at  xo  and  h(x o)  A 0,  so  that  we  can  divide  by  h and  obtain  the  standard 
form  (1).  Legendre’s  equation  is  solved  by  the  power  series  method  in  Sec.  5.2. 
The  Frobenius  method  (Sec.  5.3)  extends  the  power  series  method  to  ODEs 


(3) 


a(x)  , b(x) 
x - x0  (x  - x0f 


y + c — ~y'+  t:. — r^y  = 0 


whose  coefficients  are  singular  (i.e.,  not  analytic)  at  x0,  but  are  “not  too  bad,” 
namely,  such  that  a and  b are  analytic  at  x0.  Then  (3)  has  at  least  one  solution  of 
the  form 

oc 

(4)  y(x)  = (x  - x0f  2 am(x  ~ xo)m  = ao(x  ~ x0)r  + a^x  - x0)r+1  + • ■ • 

m = 0 
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where  r can  be  any  real  (or  even  complex)  number  and  is  determined  by  substituting 
(4)  into  (3)  from  the  indicial  equation  (Sec.  5.3),  along  with  the  coefficients  of  (4). 
A second  linearly  independent  solution  of  (3)  may  be  of  a similar  form  (with  different 
r and  am’ s)  or  may  involve  a logarithmic  term.  Bessel’s  equation  is  solved  by  the 
Frobenius  method  in  Secs.  5.4  and  5.5. 

“Special  functions”  is  a common  name  for  higher  functions,  as  opposed  to  the 
usual  functions  of  calculus.  Most  of  them  arise  either  as  nonelementary  integrals  [see 
(24)-(44)  in  App.  3.1]  or  as  solutions  of  (1)  or  (3).  They  get  a name  and  notation 
and  are  included  in  the  usual  CASs  if  they  are  important  in  application  or  in  theory. 
Of  this  kind,  and  particularly  useful  to  the  engineer  and  physicist,  are  Legendre’s 
equation  and  polynomials  Po,  Pi,  ■ ■ • (Sec.  5.2),  Gauss’s  hyper  geometric  equation 
and  functions  F(a,  b,  c;  x)  (Sec.  5.3),  and  Bessel’s  equation  and  functions  Jv  and 
Yv  (Secs.  5.4,  5.5). 


CHAPTER  6 


Laplace  Transforms 


Laplace  transforms  are  invaluable  for  any  engineer’s  mathematical  toolbox  as  they  make 
solving  linear  ODEs  and  related  initial  value  problems,  as  well  as  systems  of  linear  ODEs, 
much  easier.  Applications  abound:  electrical  networks,  springs,  mixing  problems,  signal 
processing,  and  other  areas  of  engineering  and  physics. 

The  process  of  solving  an  ODE  using  the  Laplace  transform  method  consists  of  three 
steps,  shown  schematically  in  Fig.  1 13: 

Step  1.  The  given  ODE  is  transformed  into  an  algebraic  equation,  called  the  subsidiary 
equation. 

Step  2.  The  subsidiary  equation  is  solved  by  purely  algebraic  manipulations. 

Step  3.  The  solution  in  Step  2 is  transformed  back,  resulting  in  the  solution  of  the  given 
problem. 


Fig.  113.  Solving  an  IVP  by  Laplace  transforms 


The  key  motivation  for  learning  about  Laplace  transforms  is  that  the  process  of  solving 
an  ODE  is  simplified  to  an  algebraic  problem  (and  transformations).  This  type  of 
mathematics  that  converts  problems  of  calculus  to  algebraic  problems  is  known  as 
operational  calculus.  The  Laplace  transform  method  has  two  main  advantages  over  the 
methods  discussed  in  Chaps.  1^1: 

I.  Problems  are  solved  more  directly:  Initial  value  problems  are  solved  without  first 
determining  a general  solution.  Nonhomogenous  ODEs  are  solved  without  first  solving 
the  corresponding  homogeneous  ODE. 

II.  More  importantly,  the  use  of  the  unit  step  function  (Heaviside  function  in  Sec.  6.3) 
and  Dirac’s  delta  (in  Sec.  6.4)  make  the  method  particularly  powerful  for  problems  with 
inputs  (driving  forces)  that  have  discontinuities  or  represent  short  impulses  or  complicated 
periodic  functions. 
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The  following  chart  shows  where  to  find  information  on  the  Laplace  transform  in  this 
book. 


Topic 

Where  to  find  it 

ODEs,  engineering  applications  and  Laplace  transforms 
PDEs,  engineering  applications  and  Laplace  transforms 
List  of  general  formulas  of  Laplace  transforms 
List  of  Laplace  transforms  and  inverses 

Chapter  6 
Section  12.11 
Section  6.8 
Section  6.9 

Note:  Your  CAS  can  handle  most  Laplace  transforms. 

Prerequisite:  Chap.  2 

Sections  that  may  be  omitted  in  a shorter  course:  6.5,  6.7 
References  and  Answers  to  Problems:  App.  1 Part  A,  App.  2. 


6.1  Laplace  Transform.  Linearity. 

First  Shifting  Theorem  (s-Shifting) 

In  this  section,  we  learn  about  Laplace  transforms  and  some  of  their  properties.  Because 
Laplace  transforms  are  of  basic  importance  to  the  engineer,  the  student  should  pay  close 
attention  to  the  material.  Applications  to  ODEs  follow  in  the  next  section. 

Roughly  speaking,  the  Laplace  transform,  when  applied  to  a function,  changes  that 
function  into  a new  function  by  using  a process  that  involves  integration.  Details  are  as 
follows. 

If  f{t)  is  a function  defined  for  all  t g?  0,  its  Laplace  transform1  is  the  integral  of /(f) 
times  e~st  from  t = 0 to  °°.  It  is  a function  of  s,  say,  F(s),  and  is  denoted  by  ££(/);  thus 


(1) 


m = m 


e~stm  dt. 


Here  we  must  assume  that  /(f)  is  such  that  the  integral  exists  (that  is,  has  some  finite 
value).  This  assumption  is  usually  satisfied  in  applications — we  shall  discuss  this  near  the 
end  of  the  section. 


1 PIERRE  SIMON  MARQUIS  DE  LAPLACE  (1749-1827),  great  French  mathematician,  was  a professor  in 
Paris.  He  developed  the  foundation  of  potential  theory  and  made  important  contributions  to  celestial  mechanics, 
astronomy  in  general,  special  functions,  and  probability  theory.  Napoleon  Bonaparte  was  his  student  for  a year. 
For  Laplace’s  interesting  political  involvements,  see  Ref.  [GenRef2],  listed  in  App.  1. 

The  powerful  practical  Laplace  transform  techniques  were  developed  over  a century  later  by  the  English 
electrical  engineer  OLIVER  HEAVISIDE  (1850-1925)  and  were  often  called  “Heaviside  calculus.” 

We  shall  drop  variables  when  this  simplifies  formulas  without  causing  confusion.  For  instance,  in  (1)  we 
wrote  i£(/)  instead  of  !£(f)(s)  and  in  (1*)  ££-1(F)  instead  of  i£-1(F)(f). 
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EXAMPLE  1 


EXAMPLE  2 


Not  only  is  the  result  F(s)  called  the  Laplace  transform,  but  the  operation  just  described, 
which  yields  F(s ) from  a given  f(t),  is  also  called  the  Laplace  transform.  It  is  an  “integral 
transform” 


F(s) 


r 00 

k(s,  t)f(t ) dt 


with  “kernel”  k(s,  t)  = e~s  . 

Note  that  the  Laplace  transform  is  called  an  integral  transform  because  it  transforms 
(changes)  a function  in  one  space  to  a function  in  another  space  by  a process  of  integration 
that  involves  a kernel.  The  kernel  or  kernel  function  is  a function  of  the  variables  in  the 
two  spaces  and  defines  the  integral  transform. 

Furthermore,  the  given  function /(f)  in  (1)  is  called  the  inverse  transform  of  F(s)  and 
is  denoted  by  :£~X(F);  that  is,  we  shall  write 


(1*) 


m = 


Note  that  (1)  and  (1*)  together  imply  if  = / and  if{if  1(F’))  = F. 


Notation 

Original  functions  depend  on  1 and  their  transforms  on  s — keep  this  in  mind!  Original 
functions  are  denoted  by  lowercase  letters  and  their  transforms  by  the  same  letters  in  capital , 
so  that  F(s)  denotes  the  transform  of /(f),  and  Y(s)  denotes  the  transform  of  v(f),  and  so  on. 


Laplace  Transform 

Let/(t)  = 1 when  t S 0.  Find  F(s). 

Solution.  From  (1)  we  obtain  by  integration 


£(/)  = 2(1)  = e~sldt  = —e~ 
'o 


o 


{S  > 0). 


Such  an  integral  is  called  an  improper  integral  and,  by  definition,  is  evaluated  according  to  the  rule 

,T 


e stf(t)  dt  = lim  J e st/(f)  dt. 
0 


o 


— str 


Hence  our  convenient  notation  means 


r 00 

~-e~st 

T 

1 

1 

H 

+ 

O 

e~st  dt  = lim 

= lim 

8 

t 

Eh 

o 

s 

8 

l 

O 

s s 

We  shall  use  this  notation  throughout  this  chapter. 


(s  > 0). 


Laplace  Transform  i£(eat)  of  the  Exponential  Function  eat 

Let/(t)  = eat  when  (SO,  where  a is  a constant.  Find  ££(/). 
Solution.  Again  by  (1), 


i£(eat)  = 


1 dt  = 


1 

( 

a — s 


hence,  when  s — a > 0, 


%(eat)  = 


s — 
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THEOREM  1 


PROOF 


EXAMPLE  3 


EXAMPLE  4 


Must  we  go  on  in  this  fashion  and  obtain  the  transform  of  one  function  after  another 
directly  from  the  definition?  No!  We  can  obtain  new  transforms  from  known  ones  by  the 
use  of  the  many  general  properties  of  the  Laplace  transform.  Above  all,  the  Laplace 
transform  is  a “linear  operation,”  just  as  are  differentiation  and  integration.  By  this  we 
mean  the  following. 


Linearity  of  the  Laplace  Transform 

The  Laplace  transform  is  a linear  operation;  that  is,  for  any  functions  f(t)  and  g(t) 
whose  transforms  exist  and  any  constants  a and  b the  transform  of  af(t ) + bg(t) 
exists,  and 

${af(t)  + bg(t)  j = aX{f{t)}  + b£{g{t)}. 


This  is  true  because  integration  is  a linear  operation  so  that  (1)  gives 


2{a/(0  + bg(t)\ 


e St[af(t)  + bg(t )]  dt 


e~stf(t)  dt  + b 


gif)  dt 


ai£{f(t)\  + bf[g(t)\.  m 


Application  of  Theorem  1:  Hyperbolic  Functions 

Find  the  transforms  of  cosh  at  and  sinh  at. 

Solution.  Since  cosh  at  = \{eat  + e~at ) and  sinh  at  = \{eat  — e~at ),  we  obtain  from  Example  2 and 
Theorem  1 


££(cosh  at)  = ^ f£(eat)  + i£(e  “*))  = ^ ( - + — ) = 2 - 

2 2 \s  - a s + aj  sz  - a2 

f£(sinh at)  = -(2(eat)  - iE(e~at))  = -(— — ) = “ 

2 2 \s  — a s + aj  s — a 


Cosine  and  Sine 

Derive  the  formulas 


££( cos  cot)  = 


££(sin  cot)  = 


Solution.  We  write  Lc  — cos  cot)  and  Ls  = ££(sin  cot).  Integrating  by  parts  and  noting  that  the  integral- 
free  parts  give  no  contribution  from  the  upper  limit  °o,  we  obtain 


Lr  = | e cos  cot  dt  = _ cos  cot 
•'o 


-st  • 1 CO 

, e sin  cot  dt  = L, 

o Jo 


Ls  = I e st  sin  cot  dt  = _ - sin  cot 


H e cos  cot  dt  = — Lc 

s I s 

0 Jo 
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By  substituting  Ls  into  the  formula  for  Lc  on  the  right  and  then  by  substituting  Lc  into  the  formula  for  Ls  on 
the  right,  we  obtain 


Lc 


Ls 


L1  + 


T,  1 + 


Ln  = 


Ls  = 


Basic  transforms  are  listed  in  Table  6. 1 . We  shall  see  that  from  these  almost  all  the  others 
can  be  obtained  by  the  use  of  the  general  properties  of  the  Laplace  transform.  Formulas 
1-3  are  special  cases  of  formula  4,  which  is  proved  by  induction.  Indeed,  it  is  true  for 
n = 0 because  of  Example  1 and  0!  = 1.  We  make  the  induction  hypothesis  that  it  holds 
for  any  integer  n is  0 and  then  get  it  for  n + 1 directly  from  (1).  Indeed,  integration  by 
parts  first  gives 


f£(tn+1) 


e~sttn+1dt  = — — e~sttn+1 
s 

oo 

n + 1 

+ 

s 

jo 

0 

Now  the  integral-free  part  is  zero  and  the  last  part  is  ( n + 1 )/s  times  !£(t  ri).  From  this 
and  the  induction  hypothesis, 


2(tn+1)  = - 


Ta — 


(77  + 1)! 


This  proves  formula  4. 


Table  6.1  Some  Functions  f[t ) and  Their  Laplace  Transforms  !£[f) 


fit) 

2(f) 

1 

1 

1/s 

2 

t 

1/s2 

3 

t2 

2 !/s3 

tn 

n\ 

in  = 0,  1,  • • •) 

n+  1 
S 

5 

ta 

(a  positive) 

T(a  + 1) 

a+  1 
S 

6 

at 

e 

l 

s — a 
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THEOREM  2 


PROOF 


EXAMPLE  5 


r(a  + 1)  in  formula  5 is  the  so-called  gamma  function  [(15)  in  Sec.  5.5  or  (24)  in 
App.  A3.1].  We  get  formula  5 from  (1),  setting  st  = x: 


££(fa) 


-st.a 


tadt  = 


dx 

s 


1 


a + 1 


S 


e~xxadx 

■'o 


where  s > 0.  The  last  integral  is  precisely  that  defining  T(a  + 1),  so  we  have 
T (a  + l)/s°  + 1,  as  claimed.  (CAUTION  f(a  + 1)  has  xa  in  the  integral,  not  xa+  ’.) 
Note  the  formula  4 also  follows  from  5 because  T(n  + 1 ) = n\  for  integer  n 0. 
Formulas  6-10  were  proved  in  Examples  2-4.  Formulas  11  and  12  will  follow  from  7 
and  8 by  “shifting,”  to  which  we  turn  next. 


s-Shifting:  Replacing  s by  s — a in  the  Transform 

The  Laplace  transform  has  the  very  useful  property  that,  if  we  know  the  transform  of  fit), 
we  can  immediately  get  that  of  eatf(t),  as  follows. 


First  Shifting  Theorem,  s-Shifting 

If  fit)  has  the  transform  F(s)  ( where  s > kfor  some  k ),  then  eatf(f)  has  the  transform 
F(s  — a)  ( where  s — a > k).  In  formulas, 

if  [ eatf(t) } = F(s  - a) 

or,  if  we  take  the  inverse  on  both  sides, 

eatf{t)  = 2~l[F(s  -a)}. 


We  obtain  F(s  — a)  by  replacing  s with  s — a in  the  integral  in  (1),  so  that 


F(s  — a) 


e~(s~a}tf(t)  dt 


e-st[eatf(t)]  dt  = £{eatnt)}. 
J o 


If  F(s)  exists  (i.e.,  is  finite)  for  s greater  than  some  k,  then  our  first  integral  exists  for 
s — a > k.  Now  take  the  inverse  on  both  sides  of  this  formula  to  obtain  the  second  formula 
in  the  theorem.  (CAUTION:  -a  in  F(s  - a)  but  +a  in  eatf(t).) 


s-Shifting:  Damped  Vibrations.  Completing  the  Square 

From  Example  4 and  the  first  shifting  theorem  we  immediately  obtain  formulas  11  and  12  in  Table  6.1, 

££{eat  cos  ait)  = , £{eat  sin  ojt)  = . 

{s  ~ a)  + (x)  (s  — a)  + co 

For  instance,  use  these  formulas  to  find  the  inverse  of  the  transform 

3a  - 137 
i2  + 2i  + 401 


2(f)  = 


SEC.  6.1  Laplace  Transform.  Linearity.  First  Shifting  Theorem  (s-Shifting) 


209 


Solution.  Applying  the  inverse  transform,  using  its  linearity  (Prob.  24),  and  completing  the  square,  we  obtain 

f 3(i  + 1)  - 140]  A i+i  1 A 

f = 2-^ > = 3 if- a > - ISrH 

l (s  + l)2  + 400  J l(i  + l)2  + 202J  l 

We  now  see  that  the  inverse  of  the  right  side  is  the  damped  vibration  (Fig.  1 14) 

fit)  = e”*(3  cos  20 1 — 7 sin  200- 


Fig.  114.  Vibrations  in  Example  5 


Existence  and  Uniqueness  of  Laplace  Transforms 

This  is  not  a big  practical  problem  because  in  most  cases  we  can  check  the  solution  of 
an  ODE  without  too  much  trouble.  Nevertheless  we  should  be  aware  of  some  basic  facts. 

A function /(t)  has  a Laplace  transform  if  it  does  not  grow  too  fast,  say,  if  for  all  t §=  0 
and  some  constants  M and  k it  satisfies  the  “growth  restriction” 

(2)  1/(0 1 ^Mekt. 

(The  growth  restriction  (2)  is  sometimes  called  “growth  of  exponential  order,”  which  may 

o 

be  misleading  since  it  hides  that  the  exponent  must  be  kt,  not  kt  or  similar.) 

f(t)  need  not  be  continuous,  but  it  should  not  be  too  bad.  The  technical  term  (generally 
used  in  mathematics)  is  piecewise  continuity,  fit)  is  piecewise  continuous  on  a finite 
interval  a 2=  t Si  b where /is  defined,  if  this  interval  can  be  divided  into  finitely  many 
subintervals  in  each  of  which/is  continuous  and  has  finite  limits  as  t approaches  either 
endpoint  of  such  a subinterval  from  the  interior.  This  then  gives  finite  jumps  as  in 
Fig.  1 15  as  the  only  possible  discontinuities,  but  this  suffices  in  most  applications,  and 
so  does  the  following  theorem. 


t 


Fig.  115.  Example  of  a piecewise  continuous  function  f(t). 
(The  dots  mark  the  function  values  at  the  jumps.) 
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THEOREM  3 


Existence  Theorem  for  Laplace  Transforms 

If f(t ) is  defined  and  piecewise  continuous  on  every  finite  interval  on  the  semi-axis 
(§0  and  satisfies  (2)  for  all  t §5  0 and  some  constants  M and  Ic,  then  the  Laplace 
transform  k£(f)  exists  for  all  s > k. 


PROOF  Since  /(f)  is  piecewise  continuous,  e~stf(t ) is  integrable  over  any  finite  interval  on  the 
/-axis.  From  (2),  assuming  that  s > k ( to  be  needed  for  the  existence  of  the  last  of  the 
following  integrals),  we  obtain  the  proof  of  the  existence  of  -£(  f ) from 


|2(/)l 


e~stf(f)dt 

< 

CO 

\f(t)\e~st  dt  g 

. 

O 

~~ i 

o 

Mekte  st 


dt 


Note  that  (2)  can  be  readily  checked.  For  instance,  cosh  t < et,  tn  < nle1'  (because  tn/n\ 
is  a single  term  of  the  Maclaurin  series),  and  so  on.  A function  that  does  not  satisfy  (2) 
for  any  M and  k is  e (take  logarithms  to  see  it).  We  mention  that  the  conditions  in 
Theorem  3 are  sufficient  rather  than  necessary  (see  Prob.  22). 


Uniqueness.  If  the  Laplace  transform  of  a given  function  exists,  it  is  uniquely 
determined.  Conversely,  it  can  be  shown  that  if  two  functions  (both  defined  on  the  positive 
real  axis)  have  the  same  transform,  these  functions  cannot  differ  over  an  interval  of  positive 
length,  although  they  may  differ  at  isolated  points  (see  Ref.  [A14]  in  App.  1).  Hence  we 
may  say  that  the  inverse  of  a given  transform  is  essentially  unique.  In  particular,  if  two 
continuous  functions  have  the  same  transform,  they  are  completely  identical. 


PROBLEM  SET  6.1 


1-16 


LAPLACE  TRANSFORMS 


Find  the  transform.  Show  the  details  of  your  work.  Assume 
that  a,  b,  a>,  0 are  constants. 


1.  3 1 + 12 
3.  COS  7 Tt 
5.  ezt  sinh  t 
7.  sin  ( cot  + 6) 

9. 

1 - 

i 

1 

11.  b 


b 


13. 


1 


-1  - 


2 

I 

I 


2.  (a  - btf 
4.  cos2  wt 
6.  e_tsinh4t 
8.  1.5  sin  (3 1 - tt/2) 

10. 


_l L_ 

a b 


17-24 


SOME  THEORY 


17.  Table  6.1.  Convert  this  table  to  a table  for  finding 

inverse  transforms  (with  obvious  changes,  e.g., 
k£~\l/sn)  = - 1),  etc). 

18.  Using  k£{f)  in  Prob.  10,  find  ££(/)),  where  ffit)  = 0 
if  t S 2 and  ffit)  = 1 if  t > 2. 


19.  Table  6.1.  Derive  formula  6 from  formulas  9 and  10. 

f2 

20.  Nonexistence.  Show  that  e does  not  satisfy  a 
condition  of  the  form  (2). 


21.  Nonexistence.  Give  simple  examples  of  functions 
(defined  for  all  /SO)  that  have  no  Laplace 
transform. 


22.  Existence.  Show  that  k£(l /Vf)  = s/fr/s.  [Use  (30) 
T(g)  = \ffr  in  App.  3.1.]  Conclude  from  this  that  the 
conditions  in  Theorem  3 are  sufficient  but  not 
necessary  for  the  existence  of  a Laplace  transform. 
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23.  Change  of  scale.  If  i£(/(f))  = F(s)  and  c is  any 
positive  constant,  show  that  i£(/(cf))  = F(s/c)/c  (Hint: 
Use  (1).)  Use  this  to  obtain  i£(cos  cot)  from  ££(cos  f). 

24.  Inverse  transform.  Prove  that  is  linear.  Hint: 
Use  the  fact  that  ££  is  linear. 


25-32 


INVERSE  LAPLACE  TRANSFORMS 


Given  F(s)  = ££(/),  find  fit),  a , b , L,  n are  constants.  Show 
the  details  of  your  work. 


25. 


0.25  + 1.8 
52  + 3.24 


26. 


5v  + 1 
5 2 - 25 


27. 


29. 


L s + n 7T 
12  228 
7 5« 

31.^°- 

52  — 5 — 2 


28. 


30. 


32. 


1 

(5  + VZ)(5  - V3) 
45  + 32 

5 2 - 16 
1 

(5  + a)(5  + 6) 


33-45 


APPLICATION  OF  s-SHIFTING 


In  Probs.  33-36  find  the  transform.  In  Probs.  37-45  find 
the  inverse  transform.  Show  the  details  of  your  work. 

33.  t2e~3t 
35.  0.5e_4'5t  sin  27 Tt 

7 7 


37. 


34.  ke  at  cos  cot 
36.  sinh  t cos  t 
6 


39. 


41. 


(5  + TT) 

21 

(5  + V2)4 
77 


38. 


40. 


(5  + If 

4 


5 2 - 25  - 3 


42. 


5 + 10775  + 2477 
uq  a\ 


«2 


43. 


45. 


5+1  (5  + If 

25  ~ 1 
- 65  + 18 
k0(s  + a)  + k1 


C * + If 

a(s  + k)  + bTT 

44.  

(5  + kf  + 772 


(5  + af 


6.1  Transforms  of  Derivatives  and  Integrals. 
ODEs 


The  Laplace  transform  is  a method  of  solving  ODEs  and  initial  value  problems.  The  crucial 
idea  is  that  operations  of  calculus  on  functions  are  replaced  by  operations  of  algebra 
on  transforms . Roughly,  differentiation  of /(f)  will  correspond  to  multiplication  of  ,f(/) 
by  s (see  Theorems  1 and  2)  and  integration  of /(f)  to  division  of  i£(/)  by  .v.  To  solve 
ODEs,  we  must  first  consider  the  Laplace  transform  of  derivatives.  You  have  encountered 
such  an  idea  in  your  study  of  logarithms.  Under  the  application  of  the  natural  logarithm, 
a product  of  numbers  becomes  a sum  of  their  logarithms,  a division  of  numbers  becomes 
their  difference  of  logarithms  (see  Appendix  3,  formulas  (2),  (3)).  To  simplify  calculations 
was  one  of  the  main  reasons  that  logarithms  were  invented  in  pre-computer  times. 


THEOREM  1 


Laplace  Transform  of  Derivatives 

The  transforms  of  the  first  and  second  derivatives  off(t)  satisfy 

(1)  2(/')  = i2(/)-/(0) 

(2)  ££(/")  = s22(/)  - sf( 0)  - /'( 0). 

Formula  (1)  holds  if  f(t)  is  continuous  for  all  f § 0 and  satisfies  the  growth 
restriction  (2)  in  Sec.  6.1  and  f (f)  is  piecewise  continuous  on  every  finite  inten’al 
on  the  semi-axis  f § 0.  Similarly,  (2)  holds  iffandf  are  continuous  for  all  t § 0 
and  satisfy  the  growth  restriction  and  f is  piecewise  continuous  on  every  finite 
interval  on  the  semi-axis  f§0. 
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PROOF 


THEOREM  2 


EXAMPLE  1 


EXAMPLE  2 


We  prove  (1)  first  under  the  additional  assumption  that  f is  continuous.  Then,  by  the 
definition  and  integration  by  parts, 


2(/') 


e~stf\t)dt 


Jo 


+ s 


e~stf{i)  dt. 


Jo 


Since  / satisfies  (2)  in  Sec.  6.1,  the  integrated  part  on  the  right  is  zero  at  the  upper  limit 
when  s > k,  and  at  the  lower  limit  it  contributes  — /( 0).  The  last  integral  is  ££(/).  It  exists 
for  j > k because  of  Theorem  3 in  Sec.  6. 1 . Hence  i£(f')  exists  when  s > k and  (1)  holds. 

If/  is  merely  piecewise  continuous,  the  proof  is  similar.  In  this  case  the  interval  of 
integration  off'  must  be  broken  up  into  parts  such  that/'  is  continuous  in  each  such  part. 

The  proof  of  (2)  now  follows  by  applying  (1)  to  f"  and  then  substituting  (1),  that  is 


££(/")  = siE(f')  -/'( 0)  = s[s£(f)  -/(0)]  = sz£(f)  - sf( 0)  -/'( 0). 


Continuing  by  substitution  as  in  the  proof  of  (2)  and  using  induction,  we  obtain  the 
following  extension  of  Theorem  1 . 


Laplace  Transform  of  the  Derivative  of  Any  Order 

Let  f,f' , • • • be  continuous  for  all  t §?  0 and  satisfy  the  growth  restriction 

(2)  in  Sec.  6.1 . Furthermore,  letf(n>  be  piecewise  continuous  on  every  finite  interval 
on  the  semi-axis  t = 0.  Then  the  transform  off (n)  satisfies 

(3)  ^(/(n))  = snmj)  - sn~xm  - sn~2f\ 0)  - • • • /(n_1)(0). 


Transform  of  a Resonance  Term  (Sec.  2.8) 

Let /(f)  = / sin  cot.  Then/(0)  = 0.  f’ (!)  = sin  o)t  + o)t  cos  oil.  [’ (0)  = (),/"  = 2o>  cos  oil  — oTi  sin  tot.  Hence 

by  (2), 


£(/")  = 2w  - - - cfiE(f)  = sz£(f),  thus 

s2  + w2 


££(/)  = X(t  sin  at)  = 


2 as 

(s2  + a)2)2 


Formulas  7 and  8 in  Table  6.1,  Sec.  6.1 

This  is  a third  derivation  of  ££(cos  at)  and  i£(sin  cot)\  cf.  Example  4 in  Sec.  6.1.  Let  fit)  = cos  at.  Then 
/( 0)  = l,/'(0)  = 0,f”(t)  = —co2  cos  cot.  From  this  and  (2)  we  obtain 

= s2X{f)  — s = —co2!£(f).  By  algebra,  ££(cos  cot)  = . 

s2  + co2 

Similarly,  let  g = sin  cot.  Then  g(0)  = 0,  gr  = co  cos  cot.  From  this  and  (1)  we  obtain 

, CO  CO 

!£(g  ) = s^£(g)  = co!£(cos  cot).  Hence,  ££(sin  cot)  = — i£(cos  cot)  = «. 

s s + oo 

Laplace  Transform  of  the  Integral  of  a Function 

Differentiation  and  integration  are  inverse  operations,  and  so  are  multiplication  and  division. 
Since  differentiation  of  a function/!/)  (roughly)  corresponds  to  multiplication  of  its  transform 
££(/)  by  s,  we  expect  integration  of /(f)  to  correspond  to  division  of  ,T(/  ) by  .v: 
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THEOREM  3 


PROOF 


EXAMPLE  3 


Laplace  Transform  of  Integral 

Let  F(s)  denote  the  transform  of a function  fit)  which  is  piecewise  continuous  for  ( § 0 
and  satisfies  a growth  restriction  (2),  Sec.  6.1.  Then,  for  s > 0,  s > k,  and  t > 0, 


thus 


ft 

/(t)  dr  = ££_1 


Jo 


Denote  the  integral  in  (4)  by  g(t).  Since  fit)  is  piecewise  continuous,  g(t)  is  continuous, 
and  (2),  Sec.  6.1,  gives 


1 5(0 1 = 


/(r)  dr 


|/(t)|  dr  g M 


t 

kr  / M ,kt 

e dr  = — (e 
k 


1) 

k 


(, k > 0). 


This  shows  that  g(t)  also  satisfies  a growth  restriction.  Also,  g\t)  = /(f),  except  at  points 
at  which /(f)  is  discontinuous.  Hence  g (t)  is  piecewise  continuous  on  each  finite  interval 
and,  by  Theorem  1,  since  g(0)  = 0 (the  integral  from  0 to  0 is  zero) 


#{/«}  = W(f)}  = s£{g(t)}  - g(0)  = s£{g(t)}. 


Division  by  5 and  interchange  of  the  left  and  right  sides  gives  the  first  formula  in  (4), 
from  which  the  second  follows  by  taking  the  inverse  transform  on  both  sides. 


Application  of  Theorem  3:  Formulas  19  and  20  in  the  Table  of  Sec.  6.9 

1 1 

Using  Theorem  3,  find  the  inverse  of and . 

, 2\  2/2  , 2\ 

5(5  + co  ) s (s  + co  ) 

Solution.  From  Table  6.1  in  Sec.  6.1  and  the  integration  in  (4)  (second  formula  with  the  sides  interchanged) 
we  obtain 


sin  cot  1 

cIt  = -~2  (1  — cos  cot). 

co  co 


This  is  formula  19  in  Sec.  6.9.  Integrating  this  result  again  and  using  (4)  as  before,  we  obtain  formula  20 
in  Sec.  6.9: 


£ 


/ 1 = [ (1  — COS  COT ) ch  ■ 

1 „2/  2 1 2-,  I 2 

IS  (s  + CO  )J  CO  JQ 


T Sin  COT 
.2  .3 


0 W 


t sin  cot 
2 3 


It  is  typical  that  results  such  as  these  can  be  found  in  several  ways.  In  this  example,  try  partial  fraction 
reduction. 


Differential  Equations,  Initial  Value  Problems 

Let  us  now  discuss  how  the  Laplace  transform  method  solves  ODEs  and  initial  value 
problems.  We  consider  an  initial  value  problem 

(5)  y"  + ay'  + by  = ft),  y(0)  = K0,  y'(  0)  = 
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where  a and  b are  constant.  Here  r(t)  is  the  given  input  ( driving  force)  applied  to  the 
mechanical  or  electrical  system  and  y(t)  is  the  output  ( response  to  the  input)  to  be  obtained. 
In  Laplace’s  method  we  do  three  steps: 

Step  1.  Setting  up  the  subsidiary  equation.  This  is  an  algebraic  equation  for  the  transform 
Y = ,r£(y)  obtained  by  transforming  (5)  by  means  of  (1)  and  (2),  namely, 

| s2Y  - sy(0)  - /(O)]  + a[sY  - y(0)]  + bY  = R(s) 

where  R(s)  = /£(r).  Collecting  the  F-terms,  we  have  the  subsidiary  equation 

(s2  + as  + b)Y  = (s  + a)y(  0)  + VlO)  + R(s). 

Step  2.  Solution  of  the  subsidiary  equation  by  algebra.  We  divide  by  ,v2  F as  + b and 
use  the  so-called  transfer  function 


(6) 


GW  = — 1 

s + as  + b 


1 

( S + gfl)2  + b — \(Y 


(Q  is  often  denoted  by  H,  but  we  need  H much  more  frequently  for  other  purposes.)  This 
gives  the  solution 

(7)  Y(s)  = [(s  + n)y(O)  + /(0)]gW  + R(s)Q(s). 

If  y(0)  = y'( 0)  = 0,  this  is  simply  Y = RQ\  hence 

Y ifi(output) 

Q ~ R ~ ^ (input) 


and  this  explains  the  name  of  Q.  Note  that  Q depends  neither  on  r(t)  nor  on  the  initial 
conditions  (but  only  on  a and  b). 

Step  3.  Inversion  of  Y to  obtain  y = !£~1(Y).  We  reduce  (7)  (usually  by  partial  fractions 
as  in  calculus)  to  a sum  of  terms  whose  inverses  can  be  found  from  the  tables  (e.g.,  in 
Sec.  6.1  or  Sec.  6.9)  or  by  a CAS,  so  that  we  obtain  the  solution  y(t)  = iE~1(Y)  of  (5). 

Initial  Value  Problem:  The  Basic  Laplace  Steps 

Solve 

y"-y  = t,  y(0)  = l,  /(0)  = 1. 

Solution.  Step  1.  From  (2)  and  Table  6.1  we  get  the  subsidiary  equation  [with  Y = i£(y)] 
s2Y  - sy( 0)  - y'(0)  — Y = 1/s2,  thus  (s2  - 1)Y  = s + 1 + 1/s2. 

Step  2.  The  transfer  function  is  Q = 1 /tv2  — 1),  and  (7)  becomes 


1 s + 1 

Y = (s  + 1)2  + - Q = 

s2  s2  - 1 


Simplification  of  the  first  fraction  and  an  expansion  of  the  last  fraction  gives 


Y = 


1 


5—1 


+ 
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Step  3.  From  this  expression  for  Y and  Table  6.1  we  obtain  the  solution 

y(t)  = ST\Y)  = j = + sinhf  - t. 

The  diagram  in  Fig.  116  summarizes  our  approach. 


£-space  s-space 


Fig.  116.  Steps  of  the  Laplace  transform  method 


Comparison  with  the  Usual  Method 

Solve  the  initial  value  problem 

y"+y'+9y  = 0.  y(0)  = 0.16,  /(0)  = 0. 

Solution.  From  (1)  and  (2)  we  see  that  the  subsidiary  equation  is 

s2Y  - 0.16^  + sY  - 0.16  + 9Y  = 0,  thus  (s2  + s + 9 )Y  = 0.16(5  + 1). 
The  solution  is 

0.16(5  + 1)  0.16(5  + |)  + 0.08 

52  + 5 + 9 (5  + \)2  + f 

Hence  by  the  first  shifting  theorem  and  the  formulas  for  cos  and  sin  in  Table  6.1  we  obtain 

i _fl2(  „ [35  0.08  [35  \ 

y{t)  = ££  (7)  = e t/2(^0.16  cos  y — t + sin  y — rj 

= e_o-5t(0.16  cos  2.96 1 + 0.027  sin  2.96/). 

This  agrees  with  Example  2,  Case  (III)  in  Sec.  2.4.  The  work  was  less. 


Advantages  of  the  Laplace  Method 

1.  Solving  a nonhomo geneous  ODE  does  not  require  first  solving  the 
homogeneous  ODE.  See  Example  4. 

2.  Initial  values  are  automatically  taken  care  of.  See  Examples  4 and  5. 

3.  Complicated  inputs  r(f)  (right  sides  of  linear  ODEs)  can  be  handled  very 
efficiently,  as  we  show  in  the  next  sections. 
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This  means  initial  value  problems  with  initial  conditions  given  at  some  t = t0>  0 instead  of  t = 0.  For  such  a 
problem  set  t l I ?0,  so  that  t = t0  gives  t = 0 and  the  Laplace  transform  can  be  applied.  For  instance,  solve 

y"  + y = It,  yi^TT)  = t,  /(jO  = 2 - V2. 

Solution.  We  have  t0  = \tt  and  we  set  t = 7 + \tt.  Then  the  problem  is 

y"  + y = 2(7+  \tt),  y(0)  = \tt,  y'(0)  = 2 - V2 

where  y(t)  = y(t).  Using  (2)  and  Table  6.1  and  denoting  the  transform  of  y by  Y,  we  see  that  the  subsidiary 
equation  of  the  “shifted”  initial  value  problem  is 

r_.  ~ 2 2 2 2 1 

s2Y  - s ■ \it  - (2  - V2)  + Y = + — , thus  (s2  + l)Y  = + + - -its  + 2 - V2. 

s s s s 2 

Solving  this  algebraically  for  Y,  we  obtain 

2 giTi  2 - V2 

(s2  + l)s2  (i2  +1)5  s2  + 1 52  + 1 

The  inverse  of  the  first  two  terms  can  be  seen  from  Example  3 (with  co  = 1),  and  the  last  two  terms  give  cos 
and  sin, 


y = !£  ^T)  = 2{t  — sin  t)  + 277(1  — cos  t)  + \tt  cos  t + (2  — V2)  sin  7 
= 2?  + \tt  — V2  sin?. 

~ i . ~ 1 . 

Now  t = t — 4 77,  sin  t = (sin  t — cos  t),  so  that  the  answer  (the  solution)  is 

V2 


= 2t  — sin  t + cos  t. 


P R0BtEfcFS:EFO 


l-n 


INITIAL  VALUE  PROBLEMS  (IVPS) 


Solve  the  IVPs  by  the  Laplace  transform.  If  necessary,  use 
partial  fraction  expansion  as  in  Example  4 of  the  text.  Show 
all  details. 

1.  y’  + 5.2y  = 19.4  sin  It,  y(0)  = 0 

2.  y'  + 2y  = 0,  y(0)  = 1.5 

3.  y"  ~ y'  ~ 6y  = 0,  y(0)  = 11,  y'(0)  = 28 

4.  y"  + 9y  = 10e~\  y(0)  = 0,  y'(O)  = 0 

5.  y"  - \y  = 0,  y(U)  = 12,  y'(0)  = 0 

6.  y"  — 6y'  + 5v  = 29  cos  2 1,  y(0)  = 3.2, 

y'(0)  = 6.2 

7.  y"  + ly  + 12y  = 21e3*,  y(0)  = 3.5, 
y'(0)  = -10 

8.  y"  - Ay’  + 4y  = 0,  y(0)  = 8.1,  y'(0)  = 3.9 

9.  y"  - Ay’  + 3y  = 6t  - 8,  y(0)  = 0,  y'(0)  = 0 

10.  y"  + 0.04y  = 0.02t2,  y(0)  = -25,  y'(0)  = 0 

11.  y"  + 3y'  + 2.25y  = 9 13  + 64,  y(0)  = 1, 
v'(0)  = 31.5 
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SHIFTED  DATA  PROBLEMS 


Solve  the  shifted  data  IVPs  by  the  Laplace  transform.  Show 
the  details. 

12.  y"  - 2y  - 3y  = 0,  y(4)  = -3, 
y'(4)  = -17 


13.  y — 6y  = 0,  y(—  1)  = 4 

14.  y"  + 2y  + 5y  = 50?  - 100,  v(2)  = -4, 

y\ 2)  = 14 

15.  y"  + 3y’  - Ay  = 6e2t_3,  y(1.5)  = 4, 
/(1.5)  = 5 


16-21 


OBTAINING  TRANSFORMS 
BY  DIFFERENTIATION 


Using  (1)  or  (2),  find  f£(/)  if/(?)  equals: 

16.  t cos  4?  17.  t e~at 

18.  cos2  2 1 19.  sin2  cot 

20.  sin4?.  Use  Prob.  19.  21.  cosh2? 
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22.  PROJECT.  Further  Results  by  Differentiation. 

Proceeding  as  in  Example  1,  obtain 

2 2 
S'  co 

(a)  S£(t  cos  cot)  = — — 

(a2  + co2)2 


and  from  this  and  Example  1:  (b)  formula  21,  (c)  22, 
(d)  23  in  Sec.  6.9, 

s2  + a2 

(e)  ££(f  cosh  at)  = — — , 

(a2  - a2)2 

(f ) S£(t  sinh  at)  = ~°‘S  . 

(a2  - a2)2 
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INVERSE  TRANSFORMS 
BY  INTEGRATION 


Using  Theorem  3,  find/(t)  if  ££(F)  equals: 


23. 

3 

24. 

20 

s2  + s/ 4 

a3  - 2tts‘ 

25. 

1 

26. 

1 

s(s2  + co2) 

4 2 

s — s 

27. 

s + 1 

28. 

3a  + 4 

J 4 + 9 A'2 

a4  + k2sz 

29. 

1 

„3  , „„2 

30.  PROJECT.  Comments  on  Sec.  6.2.  (a)  Give  reasons 
why  Theorems  1 and  2 are  more  important  than 
Theorem  3. 

(b)  Extend  Theorem  1 by  showing  that  if  /(f)  is 
continuous,  except  for  an  ordinary  discontinuity  (finite 
jump)  at  some?  = a (>0),  the  other  conditions  remaining 
as  in  Theorem  1,  then  (see  Fig.  1 17) 

(1*)  <£(/')  = sSE(f)  -f(0)  - [f(a  + 0)  - /(a  - 0)]e-“s. 

(c)  Verify  (1*)  for  /(f)  = e~l  if  0 < t < 1 and  0 if 
t > 1. 

(d)  Compare  the  Laplace  transform  of  solving  ODEs 
with  the  method  in  Chap.  2.  Give  examples  of  your 
own  to  illustrate  the  advantages  of  the  present  method 
(to  the  extent  we  have  seen  them  so  far). 


Fig.  117.  Formula  (1*) 


6.  Unit  Step  Function  (Heaviside  Function). 
Second  Shifting  Theorem  (t-Shifting) 

This  section  and  the  next  one  are  extremely  important  because  we  shall  now  reach  the 
point  where  the  Laplace  transform  method  shows  its  real  power  in  applications  and  its 
superiority  over  the  classical  approach  of  Chap.  2.  The  reason  is  that  we  shall  introduce 
two  auxiliary  functions,  the  unit  step  function  or  Heaviside  function  u(t  — a)  (below)  and 
Dirac’s  delta  S(t  — a)  (in  Sec.  6.4).  These  functions  are  suitable  for  solving  ODEs  with 
complicated  right  sides  of  considerable  engineering  interest,  such  as  single  waves,  inputs 
(driving  forces)  that  are  discontinuous  or  act  for  some  time  only,  periodic  inputs  more 
general  than  just  cosine  and  sine,  or  impulsive  forces  acting  for  an  instant  (hammerblows, 
for  example). 

Unit  Step  Function  (Heaviside  Function)  u(t  — a) 

The  unit  step  function  or  Heaviside  function  u(t  — a)  is  0 for  t < a,  has  a jump  of  size 
1 at  t = a (where  we  can  leave  it  undefined),  and  is  1 for  t > a,  in  a formula: 

f 0 if  t < a 

u(t  — a)  = 


(1) 


1 


if  t > a 


(a  0). 
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utt) 

u{t  — a) 

1 

l 

l 

l 

l 

Of  Oat 


Fig.  118  Unit  step  function  u(t)  Fig.  119.  Unit  step  function  u(t  — a) 


Figure  118  shows  the  special  case  u(t),  which  has  its  jump  at  zero,  and  Fig.  119  the  general 
case  u(t  — a)  for  an  arbitrary  positive  a.  (For  Heaviside,  see  Sec.  6.1.) 

The  transform  of  u(t  — a)  follows  directly  from  the  defining  integral  in  Sec.  6.1, 


f£{u(t  — a)} 


cc 

e~stu(t  — a)  dt 


-st 


\ dt  = — - 


here  the  integration  begins  at  t = a ( § 0)  because  u(t  — a)  is  0 for  t < a.  Hence 


-as 

(2)  £{u{t-a)}=—  (s  > 0). 

The  unit  step  function  is  a typical  “engineering  function”  made  to  measure  for  engineering 
applications,  which  often  involve  functions  (mechanical  or  electrical  driving  forces)  that 
are  either  “off”  or  “on.”  Multiplying  functions /(t)  with  u(t  — a),  we  can  produce  all  sorts 
of  effects.  The  simple  basic  idea  is  illustrated  in  Figs.  120  and  121.  In  Fig.  120  the  given 
function  is  shown  in  (A).  In  (B)  it  is  switched  off  between  t = 0 and  t = 2 (because 
u{1  — 2)  = 0 when  t < 2)  and  is  switched  on  beginning  at  t = 2.  In  (C)  it  is  shifted  to  the 
right  by  2 units,  say,  for  instance,  by  2 sec,  so  that  it  begins  2 sec  later  in  the  same  fashion 
as  before.  More  generally  we  have  the  following. 

Let  f(t)  = 0 for  all  negative  t.  Then  f(t  — a)u{t  — a)  with  a > 0 is  f(t)  shifted 
(. translated)  to  the  right  by  the  amount  a. 

Figure  121  shows  the  effect  of  many  unit  step  functions,  three  of  them  in  (A)  and 
infinitely  many  in  (B)  when  continued  periodically  to  the  right;  this  is  the  effect  of  a 
rectifier  that  clips  off  the  negative  half-waves  of  a sinuosidal  voltage.  CAUTION!  Make 
sure  that  you  fully  understand  these  figures,  in  particular  the  difference  between  parts  (B) 
and  (C)  of  Fig.  120.  Figure  120(C)  will  be  applied  next. 


m 


5- 


-5  - 


n 2k  t 


\j 


-5  - 


w 


2 k 2k  t 


-5  - 


5r  r 

o 


2 n+ 2 2n+2 


\J 


(A)  f(t)  = 5 sin  t (B)  f(t)u(t  - 2) 


(C)  at-2)u(t-2) 


Fig.  120.  Effects  of  the  unit  step  function:  (A)  Given  function. 
(B)  Switching  off  and  on.  (C)  Shift. 
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THEOREM  1 


PROOF 


(A)  k[u(t  - 1)  — 2 u{t  — 4)  + u(t  — 6)]  (B)  4 sin  (^jzt)[u{t)  — u(t  — 2)  + uU  — 4)  — + ■■■] 

Fig.  121.  Use  of  many  unit  step  functions. 


Time  Shifting  (t-Shifting):  Replacing  t by  t — a in  f(t) 

The  first  shifting  theorem  (“y-shifting”)  in  Sec.  6.1  concerned  transforms  Fis)  = :T,  [fit) ) 
and  F(s  — a)  = !£{  eatf(t) } . The  second  shifting  theorem  will  concern  functions  f(t ) and 
fit  — a).  Unit  step  functions  are  just  tools,  and  the  theorem  will  be  needed  to  apply  them 
in  connection  with  any  other  functions. 


Second  Shifting  Theorem;  Time  Shifting 

If  f(t)  has  the  transform  Fis),  then  the  “shifted  function” 

f 0 if  t < a 

(3)  fit)  = fit  - a)uit  - a)  = < 

l fit  — a)  if  t > a 

has  the  transform  e_asFis).  That  is,  if  fit)}  = Fis),  then 

(4)  X{fit  - a)uit  - a)}  = e~asFis). 

Or,  if  we  take  the  inverse  on  both  sides,  we  can  write 

(4*)  fit  — a)uit  — a)  = £e_1{e_osF(i)}. 


Practically  speaking,  if  we  know  Fis),  we  can  obtain  the  transform  of  (3)  by  multiplying 
Fis)  by  e~as.  In  Fig.  120,  the  transform  of  5 sin  t is  Fis)  = 5 /is2  + 1),  hence  the  shifted 
function  5 sin  it  — 2)u(t  — 2)  shown  in  Fig.  120(C)  has  the  transform 

e~2sFis)  = 5e~2s/is2  + 1). 

We  prove  Theorem  1.  In  (4),  on  the  right,  we  use  the  definition  of  the  Laplace  transform, 
writing  t for  t (to  have  t available  later).  Then,  taking  e~as  inside  the  integral,  we  have 


e~asFis) 


e as 


e~sJir)  dr 


r 00 

e~s<T+a)fij)  dr. 


Substituting  t + a = t,  thus  t = t — a,  dr  = dt  in  the  integral  (CAUTION,  the  lower 
limit  changes!),  we  obtain 


e'^Fis) 


e stfit  — a)  dt. 

'a 
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To  make  the  right  side  into  a Laplace  transform,  we  must  have  an  integral  from  0 to 
not  from  a to  But  this  is  easy.  We  multiply  the  integrand  by  u(t  — a).  Then  for  1 from 
0 to  a the  integrand  is  0,  and  we  can  write,  with  / as  in  (3), 


e~asF(s)  = 


e stf{t  — a)u(t  — a)  dt  = 


e~stm  dt. 


(Do  you  now  see  why  u(t  — a)  appears?)  This  integral  is  the  left  side  of  (4),  the  Laplace 
transform  of /(f)  in  (3).  This  completes  the  proof. 

Application  of  Theorem  1.  Use  of  Unit  Step  Functions 

Write  the  following  function  using  unit  step  functions  and  find  its  transform. 

!2  if  0 < r < 1 

|f2  if  1 < f < g7r  (Fig.  122) 

COS  t if  t>2  IT. 

Solution.  Step  1.  In  terms  of  unit  step  functions, 

m = 2(1  - U(t  - 1))  + - 1)  - u{t  - \i r))  + (cos  t)u(t  - \n r). 

Indeed,  2(1  — u(t  — 1))  gives /(r)  for  0 < t < 1,  and  so  on. 

Step  2.  To  apply  Theorem  1,  we  must  write  each  term  in  f(t)  in  the  form  fit  — a)u{t  — a).  Thus,  2(1  — u(t  — 1)) 
remains  as  it  is  and  gives  the  transform  2(1  — e~s)/s.  Then 

2{i'v'  - ■>}  - 2(i<'  - 1)1  + " - " + 1)*'  - •>}  -(? + ? + £)'" 

-!”■)}  ■ 2(K'  ■ H + f ('  4,7)+ /)"('  4*)} 


1 77  T 7" 


-tts/2 


^{(cos  t)u[t  - i 77-^j  = ^{-(sin^f  - /7r)),'(?  “ T77-)} 


+ 1 


Together, 


2 2„/l  1 1 

2(/)  = e-*  + hi + -2 +T- 

s s \s  s 2s 


, I 77  77" 

e — ( + — - -ij  + 


-tts/2  _ 


1 


-tts/2 


-tts/2 


yS~  2 S 85  , 

If  the  conversion  of  f{t)  to  fit  — a)  is  inconvenient,  replace  it  by 
(4**)  %{f(t)u(t  - a)}  = + a)}. 

(4**)  follows  from  (4)  by  writing /(r  — a)  = g(t),  hence  fit)  = g{t  + a)  and  then  again  writing /for  g.  Thus, 


- »}  - ■Mi1' + 1>2}  - + ' + i } - '■•(? + 7 + i) 

as  before.  Similarly  for  i£{\t2u(t  — j77)).  Finally,  by  (4**), 

£f{cosf«(t  - It7^|  = e-’ro/acgjcos^  + = e-™/2cg,_sin  t]  = -e~™'2 
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EXAMPLE  2 


EXAMPLE  3 


m 


Fig.  122.  /(f)  in  Example  1 


Application  of  Both  Shifting  Theorems.  Inverse  Transform 

Find  the  inverse  transform  /(f)  of 


F(s)  = + + . 

2,2  2 i 2 , , ~,2 

S + TT  S +77  (i  + 2) 

Solution.  Without  the  exponential  functions  in  the  numerator  the  three  terms  of  F(s)  would  have  the  inverses 
(sin  TTt)/ir,  (sin  77f)/77,  and  te~  * because  1 /sz  has  the  inverse  f,  so  that  1 /(s  + 2)  has  the  inverse  te~  4 by  the 
first  shifting  theorem  in  Sec.  6.1.  Hence  by  the  second  shifting  theorem  (f-shifting). 


/(f)  = — sin  (77 (f  - 1))  u(t  - 1)  + — sin  (77 (f  - 2))  u(t  -2 .)  + (»-  3 )e~ztt~3)u(t  - 3). 

Now  sin  ( 777  — 7T)  = —sin  7 77  and  sin  (777  — 27 r)  = sin  777,  so  that  the  first  and  second  terms  cancel  each  other 
when  t > 2.  Hence  we  obtain  f(t)  = 0 if  0 < t < 1,  —(sin  777)/ 7T  if  1 < t < 2,  0 if  2 < t < 3,  and 
(/  - 3)e-2<t-3>  if  t > 3.  See  Fig.  123. 


0.3  - 
0.2  - 
0.1  - 
0 

0 


l l I i I I 

1 2 3 4 5 6 t 

Fig.  123.  /(f)  in  Example  2 


Response  of  an  RC-Circuit  to  a Single  Rectangular  Wave 

Find  the  current  i(t)  in  the  /?C-circuit  in  Fig.  124  if  a single  rectangular  wave  with  voltage  is  applied.  The 
circuit  is  assumed  to  be  quiescent  before  the  wave  is  applied. 

Solution.  The  input  is  Vo[u(t  — a)  — u(t  — b)].  Hence  the  circuit  is  modeled  by  the  integro-differential 
equation  (see  Sec.  2.9  and  Fig.  124) 

<l(>)  l f‘ 

Ri(t)  H = Ri(t ) H i(t)  dr  = v(t)  = Vo[u(t  — a)  — u(t  — b)]. 

C C 


1/ 

v(t) 

i(t ) 

I 

v(t ) 

v 

VJR 

I 

1 MA 

vo 

0 a 


0 a 


Fig.  124.  RC-circuit,  electromotive  force  v(f),  and  current  in  Example  3 
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EXAMPLE  4 


Using  Theorem  3 in  Sec.  6.2  and  formula  (1)  in  this  section,  we  obtain  the  subsidiary  equation 

70)  To 

RI(s)  + — = — [e-as-  e~bs]. 
sC  s 

Solving  this  equation  algebraically  for  7(s),  we  get 

7(s)  = F(i)(e_as  - e~bs)  where  F(s)  = — and  ^(F)  = — e~t/mc\ 

s + 1 /(RQ  R 

the  last  expression  being  obtained  from  Table  6.1  in  Sec.  6.1.  Hence  Theorem  1 yields  the  solution  (Fig.  124) 

/(f)  = = #-1{e-a*F(,s)}  - 2_1{e_i>sF(s)}  = — [e~tt~aV(Rau(t  - a)  - e~a~bV(RC)u(t  - b)]\ 

R 


that  is,  /(f)  = 0 if  t < a,  and 


m = ■ 


Kxe 


-t/(R  C) 


if  a < t < b 


I (Fi  - K2)e~t/ma  ifu  > h 
where  Kx  = V0ea/(Rn/R  and  K2  = V0eb/(Ra/R. 


Response  of  an  RLC-Circuit  to  a Sinusoidal  Input  Acting  Over  a Time  Interval 

Find  the  response  (the  current)  of  the  RLC- circuit  in  Fig.  125,  where  E(t ) is  sinusoidal,  acting  for  a short  time 
interval  only,  say. 


E(t)  = 100  sin  400/  if  0 < t < 2tt  and  E(t)  = 0 if  t > 2tt 
and  current  and  charge  are  initially  zero. 

Solution.  The  electromotive  force  E{t)  can  be  represented  by  (100  sin  400/)(l  — u(t  — 2it)).  Hence  the 
model  for  the  current  i(t)  in  the  circuit  is  the  integro-differential  equation  (see  Sec.  2.9) 


0.1/'  + 11/  + 100  [ !(t)  dr  = (100  sin  400f)(l  - u(t  - 2tt)).  7(0)  = 0,  /'( 0)  = 0. 

From  Theorems  2 and  3 in  Sec.  6.2  we  obtain  the  subsidiary  equation  for  I(s ) = 5£(i) 


0.1  s/  + 11/  + 100-  = 


I _ 100  • 4005  / 1 _ e~ 


4002  \ & 


Solving  it  algebraically  and  noting  that  sz  + 1105  + 1000  = (5  + 10)(5  + 100),  we  obtain 


/(s)  = 


1000  ■ 400 

(s  + 10)(i  + 100)  (i2  + 4002  i2  + 4002 


For  the  first  term  in  the  parentheses  ( • • • ) times  the  factor  in  front  of  them  we  use  the  partial  fraction 
expansion 


400,0005  A B Ds  + K 

(s  + 10)(s  + 100)(52  + 4002)  _ s + 10  + s + 100  + .s2  + 4002' 

Now  determine  A,  B , D,  K by  your  favorite  method  or  by  a CAS  or  as  follows.  Multiplication  by  the  common 
denominator  gives 


400,000s-  = A(s  + 100)(s2  + 4002)  + B(s  + 10)(s2  + 4002)  + (Ds  + K)(s  + 10)(s  + 100). 
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We  set  s = — 10  and  — 100  and  then  equate  the  sums  of  the  s3  and  .v2  terms  to  zero,  obtaining  (all  values  rounded) 


is  = -10) 
is  = -100) 
(s3-terms) 
(s2-terms) 


-4,000,000  = 90(1 02  + 4002)A, 
-40,000,000  = — 90(1002  + 4002)B, 
0 = A + B + D, 


0 = 100A  + 10B  + 1 10Z)  + K, 


A = -0.27760 
B = 2.6144 
D = -2.3368 
K = 258.66. 


Since  K = 258.66  = 0.6467  • 400,  we  thus  obtain  for  the  first  term  7i  in  / = 7i  — 7 2 


0.2776  2.6144  2.33685  0.6467  - 400 

s + 10  + ~s  + 100  s2  + 4002  s2  + 4002 


From  Table  6.1  in  Sec.  6.1  we  see  that  its  inverse  is 

hit)  = — 0.2776e_lot  + 2.6144e“100t  - 2.3368  cos  400f  + 0.6467  sin  400f. 

This  is  the  current  i(t)  when  0 < t < 277.  It  agrees  for  0 < t < 277  with  that  in  Example  1 of  Sec.  2.9  (except 
for  notation),  which  concerned  the  same  /?LC-circuit.  Its  graph  in  Fig.  63  in  Sec.  2.9  shows  that  the  exponential 
terms  decrease  very  rapidly.  Note  that  the  present  amount  of  work  was  substantially  less. 

The  second  term  7i  of  7 differs  from  the  first  term  by  the  factor  e~27rs.  Since  cos  400(?  — 277)  = cos  400r 
and  sin  400(r  — 277)  = sin  400r,  the  second  shifting  theorem  (Theorem  1)  gives  the  inverse  12(f)  = 0 if 
0 < t < 277,  and  for  > 27 7 it  gives 

hit)  = -0.2776e_1O(t_2,r)  + 2.6144e“100<t_2,r)  - 2.3368  cos  400f  + 0.6467  sin  400f. 

Hence  in  i(t)  the  cosine  and  sine  terms  cancel,  and  the  current  for  t > 2tt  is 

i(t)  = — 0.2776(e~lot  - c-10«-2rt)  + 2.6144(c_100t  - c-100«-2"-)). 

It  goes  to  zero  very  rapidly,  practically  within  0.5  sec. 


C=  IQ-2  F 


Eit) 


Fig.  125.  RLC-circuit  in  Example  4 


1.  Report  on  Shifting  Theorems.  Explain  and  compare 
the  different  roles  of  the  two  shifting  theorems,  using  your 
own  formulations  and  simple  examples.  Give  no  proofs. 


2-11 


SECOND  SHIFTING  THEOREM, 
UNIT  STEP  FUNCTION 


Sketch  or  graph  the  given  function,  which  is  assumed  to  be 
zero  outside  the  given  interval.  Represent  it,  using  unit  step 
functions.  Find  its  transform.  Show  the  details  of  your  work. 

2.  t (0  < t < 2)  3.  t ~ 2 it  > 2) 

4.  cos  At  (0  < t < 7r)  5.  et  (0  < t < tt/2) 


6.  sin  TTt  (2  < t < 4) 

8.  t2  (1  <t<  2) 

10.  sinh  t (0  < t < 2) 


7.  e-7rt  (2  < t < 4) 

9.  t2{t>l) 

11.  sin  t (7t/2  < t < 7r) 


12-17 


INVERSE  TRANSFORMS  BY  THE 
2ND  SHIFTING  THEOREM 


Find  and  sketch  or  graph /(f)  if  2?(/)  equals 

12.  e_37(.s  - l)3  13.  6(1  - e-^/is2  + 9) 

14.  4(e-2s  - 2 e~5s)/s  15.  e_3s/i4 

16.  2(e~s  - e~3s)/(s2  - 4) 

17.  (1  + e_2,T(s  + 1))(i  + l)/((s  + l)2  + 1) 
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18-27 


IVPs,  SOME  WITH  DISCONTINUOUS 
INPUT 


Using  the  Laplace  transform  and  showing  the  details,  solve 

18.  9y"  - 6y  + y = 0,  y(0)  = 3,  y'(0)  = 1 

19.  y"  + 6 y + 8v  = e~at  - e~5t,  y(0)  = 0,  v'(0)  = 0 

20.  y"  + 10v'  + 24y  = 144f2,  y(0)  = 19/12, 
y'(0)  = -5 

21.  y"  + 9y  — 8 sin  t if  0 < t < 77  and  0 if  t > 7r; 
y(0)  = 0,  y'(0)  = 4 

22.  y"  + 3y'  + 2y  = 4/  if  0 < f < 1 and  8 if  t > 1; 
y(0)  = 0,  y'(0)  = 0 

23.  y"  + y'  — 2y  = 3 sin  t — cos  f if  0 < t < 277  and 
3 sin  2 1 — cos  2t  if  t > 2i r;  y(0)  = 1,  y;(0)  = 0 

24.  y"  + 3y'  + 2y  = 1 if  0 < t < 1 and  0 if  t > 1; 
y(0)  = 0,  y'(0)  = 0 

25.  y"  + y = t if  0 < t < 1 and  0 if  t > 1;  y(0)  = 0, 
y'(0)  = 0 

26.  Shifted  data,  y"  + 2 y'  + 5y  = 1 0 sin  /iff)  < / < 277 
and  0 if  t > 2tt;  y(7r)  = 1,  y '(tt)  = 2e_7T  — 2 

27.  Shifted  data,  y"  + 4y  = 8r2  if  0 < t < 5 and  0 if 
t > 5;  y(l)  = 1 + cos  2,  yr(l)  = 4 — 2 sin  2 


31.  Discharge  in  UC-circuit.  Using  the  Laplace  transform, 
find  the  charge  q(t)  on  the  capacitor  of  capacitance  C 
in  Fig.  127  if  the  capacitor  is  charged  so  that  its  potential 
is  Vq  and  the  switch  is  closed  at  t = 0. 


32-34 


RC-CIRCUIT 


Using  the  Laplace  transform  and  showing  the  details,  find 
the  current  i(t)  in  the  circuit  in  Fig.  128  with  R = 10  U and 
C = 10-2F,  where  the  current  at  t = 0 is  assumed  to  be 


zero,  and: 


32.  v = 0 if  t < 4 and  14  • 106e-3t  V if  t > 4 


33.  v = 0 if  t < 2 and  100(f  - 2)  V if  t > 2 

34.  v(t ) = 100  V if  0.5  < t < 0.6  and  0 otherwise.  Why 
does  i(t)  have  jumps? 


R 


MODELS  OF  ELECTRIC  CIRCUITS 


RL-CIRCUIT 

Using  the  Laplace  transform  and  showing  the  details,  find 
the  current  /(f)  in  the  circuit  in  Fig.  126,  assuming  /(0)  = 0 
and: 

28.  R = 1 kU  (=1000  11),  L = 1 H,  v = 0 if  0 < t < tt, 
and  40  sin  t V if  t > tt 

29.  R = 25  U,  L = 0.1  H,  v = 490  e~5t  V if  0 < t < 1 
and  0 if  t > 1 

30.  R = 10  fl,  L = 0.5  H,  v = 200 1 V if  0 < t < 2 and  0 
if  t > 2 


28-40 


28-30 


Fig.  128.  Problems  32-34 


35-37 


LC-CIRCUIT 


Using  the  Laplace  transform  and  showing  the  details,  find 
the  current  /(f)  in  the  circuit  in  Fig.  129,  assuming  zero 
initial  current  and  charge  on  the  capacitor  and: 

35.  L = 1 H,  C = 10-2  F,  v = -9900  cos  t V if 
7 t < t < 37 t and  0 otherwise 


36.  L = 1 H,  C = 0.25  F,  v = 200  (f  - |f3)  V if 
0 < t < 1 and  0 if  t > 1 


37.  L = 0.5  H,  C = 0.05  F,  v = 78  sin  t V if  0 < t < tt 
and  0 if  f > 77 


v(t) 

Fig.  126.  Problems  28-30  Fig.  129.  Problems  35-37 


Fig.  127.  Problem  31 


38-40 


RLC-CIRCUIT 


Using  the  Laplace  transform  and  showing  the  details,  find 
the  current  /(f)  in  the  circuit  in  Fig.  130,  assuming  zero 
initial  current  and  charge  and: 


38.  R = 4 U,  L = 1 H,  C = 0.05  F,  v = 34e-t  V if 
0 < / < 4 and  0 if  f > 4 
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39.  R = 2 n,  L = 1 H,  C = 0.5  F,  v(t)  = 1 kV  if 
0 < t < 2 and  0 if  t > 2 


40.  R = 2 O,  L = 1 H,  C = 0.1  F.  v = 255  sin  t V 
if  0 < t < 277  and  0 if  f > 277 


C 


v(t ) 

Fig.  130.  Problems  38-40 


6.4  Short  Impulses.  Diracs  Delta  Function. 
Partial  Fractions 


An  airplane  making  a “hard”  landing,  a mechanical  system  being  hit  by  a hammerblow, 
a ship  being  hit  by  a single  high  wave,  a tennis  ball  being  hit  by  a racket,  and  many  other 
similar  examples  appear  in  everyday  life.  They  are  phenomena  of  an  impulsive  nature 
where  actions  of  forces — mechanical,  electrical,  etc. — are  applied  over  short  intervals 
of  time. 

We  can  model  such  phenomena  and  problems  by  “Dirac’s  delta  function,”  and  solve 
them  very  effecively  by  the  Laplace  transform. 

To  model  situations  of  that  type,  we  consider  the  function 

( 1 jk  if  a t ts  a + k 

(1)  fk(t  - a)  = \ (Fig.  132) 

l0  otherwise 

(and  later  its  limit  as  £— >0).  This  function  represents,  for  instance,  a force  of  magnitude 
1 /k  acting  from  t = a to  t = a + k,  where  k is  positive  and  small.  In  mechanics,  the 
integral  of  a force  acting  over  a time  interval  a ^ t ^ a + k is  called  the  impulse  of 
the  force;  similarly  for  electromotive  forces  E(t ) acting  on  circuits.  Since  the  blue  rectangle 
in  Fig.  132  has  area  1,  the  impulse  of  /j;:  in  (1)  is 


(2) 


Ik  - 


fk(t  ~ a)  dt 
^ o 


~a  + k 


t 

^-Area  = 1 

Ilk 

a a + k 


Fig.  132.  The  function  fk(t  — a)  in  (1) 
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To  find  out  what  will  happen  if  k becomes  smaller  and  smaller,  we  take  the  limit  of  /;, 
as  >0  (k  > 0).  This  limit  is  denoted  by  8(t  — a),  that  is, 

8(t  — a)  = lim  fjft  — a), 
k — >0 

8(t  — a)  is  called  the  Dirac  delta  function2  or  the  unit  impulse  function. 

8(t  — a)  is  not  a function  in  the  ordinary  sense  as  used  in  calculus,  but  a so-called 
generalized  function.2  To  see  this,  we  note  that  the  impulse  f:  of//,  is  1,  so  that  from  (1) 
and  (2)  by  taking  the  limit  as  i->0  we  obtain 


(3) 


8(t  — a) 


oo  if  t = a 
0 otherwise 


and 


r 00 

8{t  — a)  dt  = 1, 
ri) 


but  from  calculus  we  know  that  a function  which  is  everywhere  0 except  at  a single  point 
must  have  the  integral  equal  to  0.  Nevertheless,  in  impulse  problems,  it  is  convenient  to 
operate  on  8{t  — a)  as  though  it  were  an  ordinary  function.  In  particular,  for  a continuous 
function  g(t)  one  uses  the  property  [often  called  the  sifting  property  of  8(t  — a),  not  to 
be  confused  with  shifting ] 


(4) 


g{t)8(t  - a)  dt  = g(a) 

J o 


which  is  plausible  by  (2). 

To  obtain  the  Laplace  transform  of  S(t  — a),  we  write 


fk(t  ~ a) 


l 

k 


[ u(t  — a)  — u(t  — (a  + k ))] 


and  take  the  transform  [see  (2)] 

1 1 — p~ks 

<£{fk(t  - a)}  = ' [*"«  - e-(a+fc)s]  = c~as  ,g  ■ 

ks  ks 

We  now  take  the  limit  as  k — > 0.  By  THopital’s  rule  the  quotient  on  the  right  has  the  limit 
1 (differentiate  the  numerator  and  the  denominator  separately  with  respect  to  k,  obtaining 
se~ks  and  5,  respectively,  and  use  se~ks / s 1 as  A:— >0).  Hence  the  right  side  has  the 

limit  e~as.  This  suggests  defining  the  transform  of  8(t  — a)  by  this  limit,  that  is, 

(5)  ££{S(f  — a)}  = e-as. 

The  unit  step  and  unit  impulse  functions  can  now  be  used  on  the  right  side  of  ODEs 
modeling  mechanical  or  electrical  systems,  as  we  illustrate  next. 


2PAUL  DIRAC  (1902-1984),  English  physicist,  was  awarded  the  Nobel  Prize  [jointly  with  the  Austrian 
ERWIN  SCHRODINGER  (1887-1961)]  in  1933  for  his  work  in  quantum  mechanics. 

Generalized  functions  are  also  called  distributions.  Their  theory  was  created  in  1936  by  the  Russian 
mathematician  SERGEI  L’VOVICH  SOBOLEV  (1908-1989),  and  in  1945,  under  wider  aspects,  by  the  French 
mathematician  LAURENT  SCHWARTZ  (1915-2002). 
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EXAMPLE  1 


EXAMPLE  2 


Mass-Spring  System  Under  a Square  Wave 

Determine  the  response  of  the  damped  mass-spring  system  (see  Sec.  2.8)  under  a square  wave,  modeled  by 
(see  Fig.  133) 

y"  + 3y'  + 2 y = r(t)  = u(t  - 1)  - u(t  - 2),  y( 0)  = 0,  y'(0)  = 0. 

Solution.  From  (1)  and  (2)  in  Sec.  6.2  and  (2)  and  (4)  in  this  section  we  obtain  the  subsidiary  equation 


s2Y  + 3sY  + 2Y  = - (e~s  - e~2s).  Solution  Y(s)  = — 5 1 

s s(s  + 3s  + 2) 


(e~ 


- e 2s). 


Using  the  notation  F(s)  and  partial  fractions,  we  obtain 

1 1 


F(s)  = 


1 

1 2 

• + • 


s(sz  + 3s  + 2)  s(s  + l)(s  + 2)  s s+1  s + 2 


From  Table  6.1  in  Sec.  6.1,  we  see  that  the  inverse  is 


/(/)  = £E-\F)  =i~e~t  + \e-2t. 


Therefore,  by  Theorem  1 in  Sec.  6.3  (r-shifting)  we  obtain  the  square- wave  response  shown  in  Fig.  133, 


y = £~1(F(s)e~s  - F(s)e~2s) 

= f(t~  1 Mr  - 1)  - f(t  m.2)u(t  - 2) 

(° 

= U - £-«-’>  + ie~2tt-V 

I —(t— 2)  , 1 — 2(t— 1)  1 — 2(t— 2) 

( e T"  £ + 2*  2.e 


(0  < t < 1) 

(1  < t < 2) 

(f>2).  ■ 


Hammerblow  Response  of  a Mass-Spring  System 

Find  the  response  of  the  system  in  Example  1 with  the  square  wave  replaced  by  a unit  impulse  at  time  t — 1. 
Solution.  We  now  have  the  ODE  and  the  subsidiary  equation 

y”  + 3 y + 2 y = S(t  - 1),  and  (s2  + 3s  + 2 )Y  = e~s. 

Solving  algebraically  gives 


i%?)  = 


(i  + l)(v  + 2)  Vi  + 1 i + 2 


1 


1 


By  Theorem  1 the  inverse  is 


y(t)  = £-\Y)  = 


I " 

- e-za~v 


if  0 < t < 1 
if  t > 1. 
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EXAMPLE  3 


y(t)  is  shown  in  Fig.  134.  Can  you  imagine  how  Fig.  133  approaches  Fig.  134  as  the  wave  becomes  shorter  and 
shorter,  the  area  of  the  rectangle  remaining  1 ? 


Fig.  134.  Response  to  a hammerblow  in  Example  2 


Four-Terminal  RLC-Network 

Find  the  output  voltage  response  in  Fig.  135  if  R = 20  ft,  L = 1 H,  C = 1()-4  F,  the  input  is  S(t)  (a  unit  impulse 
at  time  t = 0),  and  current  and  charge  are  zero  at  time  t = 0. 

Solution.  To  understand  what  is  going  on,  note  that  the  network  is  an  RLC-circuit  to  which  two  wires  at  A 
and  B are  attached  for  recording  the  voltage  v(l)  on  the  capacitor.  Recalling  from  Sec.  2.9  that  current  i(t)  and 
charge  q(t)  are  related  by  i = q = dq/dt,  we  obtain  the  model 


Li'  + Ri  + 


1 

C 


= Lq"  + Rq 


q 

C 


= q"  + 2 Or/  + 10,0004  = SCO- 


From  (1)  and  (2)  in  Sec.  6.2  and  (5)  in  this  section  we  obtain  the  subsidiary  equation  for  Q(s)  = ££(4) 


02  + 20s  + 10,000)2  = 1.  Solution  Q = . 

(s  + 10)2  + 9900 

By  the  first  shifting  theorem  in  Sec.  6. 1 we  obtain  from  Q damped  oscillations  for  4 and  v;  rounding  9900  = 99. 502, 
we  get  (Fig.  135) 


4 = ST^Q)  = 


99.50 


- e 10(  sin  99.50 1 and 


v = - = 100.5e_lot  sin  99.50f. 
C 


-o  o- 

v(t)  = ? 


80 

40 

0 

-40 

-80 


\ A 

\ 1 / V / \ /Hs. >— ^.1 

0.05  0.1  0.15^/0.2  0.25  03 


Network 

Fig.  135. 


Voltage  on  the  capacitor 

Network  and  output  voltage  in  Example  3 


More  on  Partial  Fractions 

We  have  seen  that  the  solution  Y of  a subsidiary  equation  usually  appears  as  a quotient 
of  polynomials  Y(s ) = F(s)/G(s),  so  that  a partial  fraction  representation  leads  to  a sum 
of  expressions  whose  inverses  we  can  obtain  from  a table,  aided  by  the  first  shifting 
theorem  (Sec.  6.1).  These  representations  are  sometimes  called  Heaviside  expansions. 
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EXAMPLE  4 


An  unrepeated  factor  s — a in  G(s)  requires  a single  partial  fraction  A/(s  — a). 

2 3 

See  Examples  1 and  2.  Repeated  real  factors  (s  — a)  , (s  — a)  , etc.,  require  partial 
fractions 

A2  Ai  A3  A2  Ai 

o "1 , o H n + , etc., 

(s  — a)  s - a (s  — a)  (s  — a)  s — a 

The  inverses  are  ( A2t  + A\)eat,  (^A^t2  + A2t  + Ai)eat,  etc. 

Unrepeated  complex  factors  (s  — a)(s  — a),  a = a + if3,  a = a — iff  require  a partial 
fraction  (As  + B)/[(s  — a)2  + [32].  For  an  application,  see  Example  4 in  Sec.  6.3. 
A further  one  is  the  following. 


Unrepeated  Complex  Factors.  Damped  Forced  Vibrations 

Solve  the  initial  value  problem  for  a damped  mass-spring  system  acted  upon  by  a sinusoidal  force  for  some 
time  interval  (Fig.  136), 

y"  + 2 y +2 y = r(t ),  r(t ) = 10  sin  2tif  0 < t < it  and  0 if  t > tt\  y(0)  = 1,  y;(0)  = —5. 

Solution.  From  Table  6.1,  (1),  (2)  in  Sec.  6.2,  and  the  second  shifting  theorem  in  Sec.  6.3,  we  obtain  the 
subsidiary  equation 


02y  - s + 5)  + 2 Cry  - 1)  + 2F  = 10 (1  - e_7rs)- 

i2  + 4 

We  collect  the  K-terms,  (s2  -I-  2s  + 2)7,  take  — s + 5 — 2 = — s + 3 to  the  right,  and  solve, 

20  20e~™  s - 3 

(6)  7 = + . 

(s2  + 4)(s2  + 2s  + 2)  (s2  + 4)(s2  + 2s  + 2)  s2  + 2s  + 2 

For  the  last  fraction  we  get  from  Table  6.1  and  the  first  shifting  theorem 

i ( s + 1 — 4 1 

(7)  SE  ) l = e *(cos  / — 4 sin  t ). 

l(s  + l)2  + lj 

In  the  first  fraction  in  (6)  we  have  unrepeated  complex  roots,  hence  a partial  fraction  representation 

20  As  + B Ms  + N 

— + . 

(s2  + 4)(s2  + 2.S  + 2)  i2  + 4 s2  + 2s  + 2 

Multiplication  by  the  common  denominator  gives 

20  = (As  + B)(s2  + 2s  + 2)  + (Ms  + N)(s2  + 4). 

We  determine  A,  B,  M,  N.  Equating  the  coefficients  of  each  power  of  5 on  both  sides  gives  the  four  equations 

(a)  [s3] : 0 = A + M (b)  [s2] : 0 = 2A  + B + N 

(c)  [s] : 0 = 2A  + 2B  + 4 M (d)  [s°] : 20  = 2 B + 4N. 

We  can  solve  this,  for  instance,  obtaining  M = —A  from  (a),  then  A = B from  (c),  then  N = —3A  from  (b), 
and  finally  A = —2  from  (d).  Hence  A = —2 , B = —2,M  = 2,N=6,  and  the  first  fraction  in  (6)  has  the 
representation 

- 2s  - 2 2(5  + 1)  + 6 - 2 

(8)  1 . Inverse  transform:  —2  cos  2 1 — sin  2 1 + e (2  cos  t + 4 sin  t). 

s2  + 4 (s  + l)2  + 1 
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The  sum  of  this  inverse  and  (7)  is  the  solution  of  the  problem  for  0 < t < n,  namely  (the  sines  cancel), 

(9)  y(t)  = 3e_t  cos  t — 2 cos  2 1 — sin  2 1 if  0 < t < tt. 

In  the  second  fraction  in  (6),  taken  with  the  minus  sign,  we  have  the  factor  e~~"s , so  that  from  (8)  and  the  second 
shifting  theorem  (Sec.  6.3)  we  get  the  inverse  transform  of  this  fraction  for  t > 0 in  the  form 

+2  cos  (2 1 — 2tt)  + sin  (It  — 2tt)  — [2  cos  (t  — ir)  + 4 sin  ( t — tt)] 

= 2 cos  2 1 + sin  2 1 + (2  cos  t + 4 sin  t). 

The  sum  of  this  and  (9)  is  the  solution  for  t > tt, 

(10)  y(t ) = e_t[(3  + 2e")  cos  / + Ae"  sin  f]  if  t > tt. 

Figure  136  shows  (9)  (for  0 < t < tt)  and  (10)  (for  t > tt),  a beginning  vibration,  which  goes  to  zero  rapidly 
because  of  the  damping  and  the  absence  of  a driving  force  after  t = tt. 


Mechanical  system 


Output  (solution) 


Fig.  136.  Example  4 


The  case  of  repeated  complex  factors  [(s  — a)(s  — a)]2,  which  is  important  in  connection 
with  resonance,  will  be  handled  by  “convolution”  in  the  next  section. 


1.  CAS  PROJECT.  Effect  of  Damping.  Consider  a 
vibrating  system  of  your  choice  modeled  by 

y"  + cy  + ky  = 8(f). 

(a)  Using  graphs  of  the  solution,  describe  the  effect  of 
continuously  decreasing  the  damping  to  0,  keeping  k 
constant. 

(b)  What  happens  if  c is  kept  constant  and  k is 
continuously  increased,  starting  from  0? 

(c)  Extend  your  results  to  a system  with  two 
S-functions  on  the  right,  acting  at  different  times. 

2.  CAS  EXPERIMENT.  Limit  of  a Rectangular  Wave. 
Effects  of  Impulse. 

(a)  In  Example  1 in  the  text,  take  a rectangular  wave 
of  area  1 from  I to  I + k.  Graph  the  responses  for  a 
sequence  of  values  of  k approaching  zero,  illustrating 
that  for  smaller  and  smaller  k those  curves  approach 


the  curve  shown  in  Fig.  134.  Hint:  If  your  CAS  gives 
no  solution  for  the  differential  equation,  involving  k, 
take  specific  k’s  from  the  beginning. 

(b)  Experiment  on  the  response  of  the  ODE  in  Example 
1 (or  of  another  ODE  of  your  choice)  to  an  impulse 
S(f  — a)  for  various  systematically  chosen  a (>  0); 
choose  initial  conditions  y(0)  A 0,  y'(0)  = 0.  Also  con- 
sider the  solution  if  no  impulse  is  applied.  Is  there  a 
dependence  of  the  response  on  a?  On  b if  you  choose 
bS(t  — a)?  Would  — 8(t  — a)  with  a > a annihilate  the 
effect  of  8(t  — a)?  Can  you  think  of  other  questions  that 
one  could  consider  experimentally  by  inspecting  graphs? 

EFFECT  OF  DELTA  (IMPULSE) 

ON  VIBRATING  SYSTEMS 

Find  and  graph  or  sketch  the  solution  of  the  IVP.  Show  the 
details. 

3.  y"  + Ay  = S(t  - tt),  y(0)  = 8,  y'(0)  = 0 


3-12 
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4.  y"  + 16 v = 4S(f  - 377),  y(0)  = 2,  y'(0)  = 0 

5.  y"  + y = S(t  — 7 t)  — S(t  — 277), 
y(0)  = 0,y'(0)  = 1 

6.  y"  + Ay'  + 5y  = S(f  - 1),  y(0)  = 0,y'(0)  = 3 

7.  4v"  + 24y'  + 37y  = 17e_t  + S(f  - g), 
y(0)  = 1,  y'(0)  = 1 

8.  y"  + 3y'  + 2y  = 10(sin  t + 8(t  - 1)),  y(0)  = 1, 
y'(0)  = -i 

9.  y"  + 4y'  + 5y  = [1  - u(t  - 10)]e*  - e108(t  - 10), 
y(0)  = 0,  y'(0)  = 1 

10.  y"  + 5y'  + 6y  = 8(t  — g77)  + u{t  — 77)  cos  t, 
y(0)  = 0,  y'(0)  = 0 

11.  y"  + 5y'  + 6y  = u(t  — 1)  + 8(t  — 2), 
y(0)  = 0,  y'(0)  = 1 

12.  y"  + 2y'  + 5y  = 25 1 - 1006(r  - 7 r),  y(0)  = -2, 
y'(0)  = 5 

13.  PROJECT.  Heaviside  Formulas,  (a)  Show  that  for 
a simple  root  a and  fraction  A/(s  — a)  in  F(s)/G(s)  we 
have  the  Heaviside  formula 


Set  t — {n  — 1 )p  in  the  nth  integral.  Take  out  e~^n~lyp 
from  under  the  integral  sign.  Use  the  sum  formula  for 
the  geometric  series. 

(b)  Half-wave  rectifier.  Using  (11),  show  that  the 
half-wave  rectification  of  sin  cot  in  Fig.  137  has  the 
Laplace  transform 

cu(l  + e-™^) 

££(/)  = 

(s2  + &>2)(1  - e-2™/“) 

CO 

( s 2 + cu2)(l  - e_lrs/") 

(A  half-wave  rectifier  clips  the  negative  portions  of  the 
curve.  A full-wave  rectifier  converts  them  to  positive; 
see  Fig.  138.) 

(c)  Full- wave  rectifier.  Show  that  the  Laplace  trans- 
form of  the  full-wave  rectification  of  sin  cot  is 


2 , 2 
s + u> 


coth 


775 
2 w 


A 


lim 

s—*a 


(5  - a)F(s) 
G(5) 


(b)  Similarly,  show  that  for  a root  a of  order  m and 
fractions  in 


fit) 


0 k!u>  2 k!<o  3jiIco  t 


Fig.  137.  Half-wave  rectification 


F(s)  Am  Anl_  i 

W)  = (5  - a)m  + (5  - ar-1  + 

H h further  fractions 

s — a 

we  have  the  Heaviside  formulas  for  the  first  coefficient 


fit) 


0 jiko  2 nlco  3 n lm  t 

Fig.  138.  Full-wave  rectification 


A 


m 


lim 

s—*a 


(5  - a)mF{s) 
G(s) 


(d)  Saw-tooth  wave.  Find  the  Laplace  transform  of  the 
saw-tooth  wave  in  Fig.  139. 

fit) 


and  for  the  other  coefficients 
1 


Ak  ~ 


lim 


^ m—k 

(s 

- a)mF(s)i 

k 

/I 

/ 1 
/ 1 

X | . 

A 

/ 1 
1 / 

dsm~k 

G(5) 

’ 

1 

Z iZ 



1 / 

z 

k = 1. 


, m — 1 . 


14.  TEAM  PROJECT.  Laplace  Transform  of  Periodic 
Functions 

(a)  Theorem.  The  Laplace  transform  of  a piecewise 
continuous  function  f(t)  with  period  p is 


0 p 2p  3p  t 

Fig.  139.  Saw-tooth  wave 

15.  Staircase  function.  Find  the  Laplace  transform  of  the 
staircase  function  in  Fig.  140  by  noting  that  it  is  the 
difference  of  kt/p  and  the  function  in  14(d). 


fit) 

1 fP  1 ' 

(11)  ££(/)=  _ I e~stf(t)  dt  (5>  0).  k_  r I 

e ° ; 

0 p 2p  3p  t 

Prove  this  theorem.  Hint:  Write  /”  = J0P  + Jp2p  + ■ ■ ■ . Fig.  140.  Staircase  function 
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6.5  Convolution.  Integral  Equations 

Convolution  has  to  do  with  the  multiplication  of  transforms.  The  situation  is  as  follows. 
Addition  of  transforms  provides  no  problem;  we  know  that  f£(f  + g ) = f£(f)  + f£(g)- 
Now  multiplication  of  transforms  occurs  frequently  in  connection  with  ODEs,  integral 
equations,  and  elsewhere.  Then  we  usually  know  ££(/)  and  !£(g ) and  would  like  to  know 
the  function  whose  transform  is  the  product  '£(£):£(  g)-  We  might  perhaps  guess  that  it 
is  fg,  but  this  is  false.  The  transform  of  a product  is  generally  different  from  the  product 
of  the  transforms  of  the  factors. 


fE(fg)  £ £(f)£(g)  in  general. 

To  see  this  take/ = etandg  = l.Then/g  = et,f£(fg)  = l/(s  — l),but££(/)  = l/(s  — 1) 
and  ££(1)  = 1/s  give  f£(f)f£(g)  = l/(s2  - s). 

According  to  the  next  theorem,  the  correct  answer  is  that  f£(ff£(g)  is  the  transform  of 
the  convolution  of/and  g,  denoted  by  the  standard  notation/*  g and  defined  by  the  integral 


(1) 


hit)  = ( f*g)(t ) 


rt 


f(T)g(t  ~ t)  dr. 


THEOREM  1 


Convolution  Theorem 

If  two  functions  f and  g satisfy  the  assumption  in  the  existence  theorem  in  Sec.  6.1, 
so  that  their  transforms  F and  G exist,  the  product  H = FG  is  the  transform  of  h 
given  by  (1).  (Proof  after  Example  2.) 


EXAMPLE  1 Convolution 

Let  H{s ) = — fl)$].  Find  h(t). 

Solution.  1 /{s  — a ) has  the  inverse  f(t ) = eat,  and  1/s  has  the  inverse  g(t)  = 1.  With  f(j)  = earand 
g{t  — r)  = 1 we  thus  obtain  from  (1)  the  answer 


h(t)  = eat  * 1 = 


1 dr 


1 

a 


0 eat  - 1). 


To  check,  calculate 


H(s)  = X(h)(s)  = ~ 
a 


1 a 
a s — as 


— 1 — • - = S£(eat)i£(l). 
s — a s 


EXAMPLE  2 Convolution 

Let  H(s)  = 1 /(s2  + a)2)2.  Find  h(t). 

Solution.  The  inverse  of  1 /{s2  + a?)  is  (sin  u)t)/co.  Hence  from  (1)  and  the  first  formula  in  (1 1)  in  App.  3.1 
we  obtain 


sin  u)t  sin  cot  1 

* = — 2 sin  cut  sin  co(t  — r)  dr 


1 

2^ 


rf 

[—cos  ot  + cos  (2cut  — o)t)\  dr 
'o 


h(t)  = 


CO 


(x) 


'o 
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1 ' 

“ 2^  . 
1 ' 

" 2^  . 

in  agreement  with  formula  21  in  the  table  in  Sec.  6.9. 


— r cos  cot  + 


—t  cos  cot  + 


sin  cot 
co 

sin  cot " 
co 


' t 

. T = 0 


PROOF  We  prove  the  Convolution  Theorem  1.  CAUTION!  Note  which  ones  are  the  variables 
of  integration!  We  can  denote  them  as  we  want,  for  instance,  by  t and  p,  and  write 


F(s) 


e~sJ(r)  dr 


and 


G(s) 


e spg(p)dp. 


We  now  set  t = p + r,  where  t is  at  first  constant.  Then  p = t — t,  and  f varies  from 
t to  oo.  Thus 


G(s) 


e sCt  T)g(t  - t)  dt 


e stg(t  — t)  dt. 


t in  F and  t in  G vary  independently.  Hence  we  can  insert  the  G-integral  into  the 
/•’-integral.  Cancellation  of  e~ST  and  eST  then  gives 


F(s)G(s) 


!Tf(r)es 


e stg(t  — t ) dt  dr 


fij)  e Stg(t  ~ T)dtdr. 


Here  we  integrate  for  fixed  r over  t from  t to  00  and  then  over  r from  0 to  oo.  This  is  the 
blue  region  in  Fig.  141.  Under  the  assumption  on /and  g the  order  of  integration  can  be 
reversed  (see  Ref.  [A5]  for  a proof  using  uniform  convergence).  We  then  integrate  first 
over  r from  0 to  t and  then  over  t from  0 to  °°,  that  is. 


F(s)G(s) 


-st 


f(T)g(t  ~ T)drdt 


r 00 

e~sth(t)dt  = t£{h)  = H(s). 


This  completes  the 


proof. 


Fig.  141.  Region  of  integration  in  the 
tr-plane  in  the  proof  of  Theorem  1 
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EXAMPLE  3 


EXAMPLE  4 


From  the  definition  it  follows  almost  immediately  that  convolution  has  the  properties 


f *8  = 8*  f (commutative  law) 

f*  (gi  + 82)  =f*gi+ f *82  (distributive  law) 

(/  *g)  * v = f*(g  * v)  (associative  law) 

/*  0 = 0 */=  0 

similar  to  those  of  the  multiplication  of  numbers.  However,  there  are  differences  of  which 
you  should  be  aware. 


Unusual  Properties  of  Convolution 

/ * 1 9^  / in  general.  For  instance, 


r * 1 = t • l dr  = — t i=  t. 


(/  */)(0  = 0 niay  not  hold.  For  instance,  Example  2 with  oj  = 1 gives 


sin  t * sin  t = —\t  cos  t + \ sin  t 


(Fig.  142).  ■ 


4 - 


2 - 


-2  - 


-I — 1 — 


2 4 6 8 10:  t 


Fig.  142.  Example  3 


We  shall  now  take  up  the  case  of  a complex  double  root  (left  aside  in  the  last  section  in 
connection  with  partial  fractions)  and  find  the  solution  (the  inverse  transform)  directly  by 
convolution. 


Repeated  Complex  Factors.  Resonance 

In  an  undamped  mass-spring  system,  resonance  occurs  if  the  frequency  of  the  driving  force  equals  the  natural 
frequency  of  the  system.  Then  the  model  is  (see  Sec.  2.8) 

y " + d)Qy  = K sin  a>Qt 

where  coq  = k/m,  k is  the  spring  constant,  and  m is  the  mass  of  the  body  attached  to  the  spring.  We  assume 
y(0)  = 0 and  y (0)  = 0,  for  simplicity.  Then  the  subsidiary  equation  is 


s2Y  + co$Y  = 


KcOq 

2 , 2 
.S’  + (i)  o 


Ka0 

Y = . 

, 2 , 2\  2 
(5  + O>0) 


Its  solution  is 
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EXAMPLE  5 


This  is  a transform  as  in  Example  2 with  co  = coq  and  multiplied  by  Kq)q.  Hence  from  Example  2 we  can  see 
directly  that  the  solution  of  our  problem  is 

K(x)  o ( sin  cdq  t \ K 

y(t ) = g I — t cos  d ] ~ — 2 (— cos  + sin  woO- 

2 a;  o \ o»0  / 2 a;  0 

We  see  that  the  first  term  grows  without  bound.  Clearly,  in  the  case  of  resonance  such  a term  must  occur.  (See 
also  a similar  kind  of  solution  in  Fig.  55  in  Sec.  2.8.) 

Application  to  Nonhomogeneous  Linear  ODEs 

Nonhomogeneous  linear  ODEs  can  now  be  solved  by  a general  method  based  on 
convolution  by  which  the  solution  is  obtained  in  the  form  of  an  integral.  To  see  this,  recall 
from  Sec.  6.2  that  the  subsidiary  equation  of  the  ODE 

(2)  y"  + ay’  + by  = r(t)  {a,  b constant) 

has  the  solution  [(7)  in  Sec.  6.2] 

Y(s)  = [Cs  + a)y(O)  + y'(0)]Q(s)  + R(s)Q(s) 

with  R(s ) = -c£(r)  and  Q(s)  = 1 /(.v2  + as  + b)  the  transfer  function.  Inversion  of  the  first 
term  [ ■ • • ] provides  no  difficulty;  depending  on  whether  \a2  — b is  positive,  zero,  or 
negative,  its  inverse  will  be  a linear  combination  of  two  exponential  functions,  or  of  the 
form  (ci  + C2t)e~at^z,  or  a damped  oscillation,  respectively.  The  interesting  term  is 
R(s)Q(s)  because  r{f)  can  have  various  forms  of  practical  importance,  as  we  shall  see.  If 
y(0)  = 0 and  y (0)  = 0,  then  Y = RQ,  and  the  convolution  theorem  gives  the  solution 


(3) 


y(t) 


ct 

q(t  — r)r(T)  dr. 


Response  of  a Damped  Vibrating  System  to  a Single  Square  Wave 

Using  convolution,  determine  the  response  of  the  damped  mass-spring  system  modeled  by 

y"  + 3 yr  + 2y  = r(t),  r{t)  = 1 if  1 < t < 2 and  0 otherwise,  y(0)  = yr(0)  = 0. 

This  system  with  an  input  (a  driving  force)  that  acts  for  some  time  only  (Fig.  143)  has  been  solved  by  partial 
fraction  reduction  in  Sec.  6.4  (Example  1). 

Solution  by  Convolution.  The  transfer  function  and  its  inverse  are 
1 111 


5^  + 3s  + 2 (^  + l)(s  + 2)  5+1  5 + 2 


hence 


q(t)  = e~f  - e~2t. 


Hence  the  convolution  integral  (3)  is  (except  for  the  limits  of  integration) 


y(t)  = \q(t  - t)  ■ 1 dT  =*■  | [e-(t~T)  - e~2U-r>]  dr  = e~tt-T)  - i e~2a-r\ 

Now  comes  an  important  point  in  handling  convolution.  r(r)  =lifl<r<2  only.  Hence  if  t < 1,  the  integral 
is  zero.  If  1 < t < 2,  we  have  to  integrate  from  r = 1 (not  0)  to  t.  This  gives  (with  the  first  two  terms  from  the 
upper  limit) 

yit)  = e-°  - \e~°  - - \e-2a~»)  = | 
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EXAMPLE  6 


EXAMPLE  7 


If  t > 2,  we  have  to  integrate  from  t = 1 to  2 (not  to  f).  This  gives 

y(t)  = e-(t~2>  - |e-2«-2)  - (e-«-1)  - ic-2«-«). 

Figure  143  shows  the  input  (the  square  wave)  and  the  interesting  output,  which  is  zero  from  0 to  1,  then  increases, 
reaches  a maximum  (near  2.6)  after  the  input  has  become  zero  (why?),  and  finally  decreases  to  zero  in  a monotone 
fashion. 


Fig.  143.  Square  wave  and  response  in  Example  5 


Integral  Equations 

Convolution  also  helps  in  solving  certain  integral  equations,  that  is,  equations  in  which  the 
unknown  function  y(t ) appears  in  an  integral  (and  perhaps  also  outside  of  it).  This  concerns 
equations  with  an  integral  of  the  form  of  a convolution.  Hence  these  are  special  and  it  suffices 
to  explain  the  idea  in  terms  of  two  examples  and  add  a few  problems  in  the  problem  set. 


A Volterra  Integral  Equation  of  the  Second  Kind 

Solve  the  Volterra  integral  equation  of  the  second  kind3 

y (?)  — >>(t)  sin  (t  — r)  dr  = /. 

Solution.  From  (1)  we  see  that  the  given  equation  can  be  written  as  a convolution,  y — y * sin  t = t.  Writing 
Y = i£(y)  and  applying  the  convolution  theorem,  we  obtain 


YU)  ~ Y(s)- 


= TM- 


The  solution  is 

s2  + 1 1 1 t3 

Y(s)  = = 1 and  gives  the  answer  v(t)  = t H . 

will  need  patience).  I 

Another  Volterra  Integral  Equation  of  the  Second  Kind 

Solve  the  Volterra  integral  equation 

y(t)  — (1  + r)y(t  — r)  dr  = 1 — sinh  t. 


Check  the  result  by  a CAS  or  by  substitution  and  repeated  integration  by  parts  (which 


3If  the  upper  limit  of  integration  is  variable , the  equation  is  named  after  the  Italian  mathematician  VITO 
VOLTERRA  (1860-1940),  and  if  that  limit  is  constant,  the  equation  is  named  after  the  Swedish  mathematician 
ERIK  IVAR  FREDHOLM  (1866-1927).  “Of  the  second  kind  (first  kind)”  indicates  that  y occurs  (does  not 
occur)  outside  of  the  integral. 
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Solution.  By  (1)  we  can  write  y — (1  + t)  *y  = 1 — sinh  t.  Writing  Y = ££(y),  we  obtain  by  using  the 
convolution  theorem  and  then  taking  common  denominators 


Y(s) 


1 - 


1 1 


1 1 


s s‘  - 1 


hence  Y(s)  ■ 


sis*  ~ 1) 


(s2  — s — 1 )/s  cancels  on  both  sides,  so  that  solving  for  Y simply  gives 


Y(s)  = 


- 1 


and  the  solution  is  y(t)  = cosh  t. 


PROBLEM  SET  6.5 


1-7 


CONVOLUTIONS  BY  INTEGRATION 


Find: 

1.  1 * 1 2.  1 * sin  cot 

3.  et  * e_t  4.  (cos  cot)  * (cos  cot) 

5.  (sin  cot)  * (cos  cot)  6.  eat  * ebt  (a  b) 

7.  t * ef 


8-14 


INTEGRAL  EQUATIONS 


Solve  by  the  Laplace  transform,  showing  the  details: 


8.  y(t)  + 4 y (r)(t  — r)  dr  = 2 1 


9.  y(0  ~ y(r)  dr  = 1 


10.  y(t)  — y(j)  sin  2 (t  — t)  dr  = sin  2 1 


11.  y{t)  + (t  — r)y(r)  dr  = 1 


12.  y(t)  + v(t)  cosh  (t  — t)  dr  = t + e 


13.  y{t)  + 2e  v(r)e  T dr  = te 


14.  y(t)  - J y(r)(t  - t)  dr  = 2 - \t2 

'o 

15.  CAS  EXPERIMENT.  Variation  of  a Parameter. 

(a)  Replace  2 in  Prob.  13  by  a parameter  k and 
investigate  graphically  how  the  solution  curve  changes 
if  you  vary  k,  in  particular  near  k = — 2. 


(b)  Make  similar  experiments  with  an  integral  equation 
of  your  choice  whose  solution  is  oscillating. 


16.  TEAM  PROJECT.  Properties  of  Convolution.  Prove: 

(a)  Commutativity,/*  g = g*f 

(b)  Associativity,  (/*  g)  * v = /*  (g  * v) 

(c)  Distributivity,  /*  (gi  + g2)  = /*  gi  + /*  g2 

(d)  Dirac’s  delta.  Derive  the  sifting  formula  (4)  in  Sec. 
6.4  by  using /fc  with  a = 0 [(1),  Sec.  6.4]  and  applying 
the  mean  value  theorem  for  integrals. 

(e)  Unspecified  driving  force.  Show  that  forced 
vibrations  governed  by 

y"  + 0?y  = tit),  y(0)  = K\,  y'(0)  = K2 

with  co  0 and  an  unspecified  driving  force  r{t) 
can  be  written  in  convolution  form, 

1 k2 

y = — sin  cot  * r(t)  + K i cos  cot  H sin  cot. 

CO  CO 


17-26 


INVERSE  TRANSFORMS 
BY  CONVOLUTION 


Showing  details,  find /(f)  if  5£(f)  equals: 


17. 

5.5 

18. 

(s  + 1.5)(j  - 4) 

19. 

27 TS 

20. 

{S  + 77  ) 

21. 

CO 

22. 

2/  2 , 2\ 
s (s  + CO  ) 

23. 

40.5 

24. 

s(.s2  - 9) 

25. 

18s 

z„2  , 

1 

0 “ af 
9 

s(s  + 3) 


s(s  — 2) 

240 

Cs2  + l)Cs2  + 25) 


26.  Partial  Fractions.  Solve  Probs.  17,  21,  and  23  by 
partial  fraction  reduction. 
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6.6  Differentiation  and  Integration  of  Transforms. 
ODEs  with  Variable  Coefficients 


The  variety  of  methods  for  obtaining  transforms  and  inverse  transforms  and  their 
application  in  solving  ODEs  is  surprisingly  large.  We  have  seen  that  they  include  direct 
integration,  the  use  of  linearity  (Sec.  6.1),  shifting  (Secs.  6.1,  6.3),  convolution  (Sec.  6.5), 
and  differentiation  and  integration  of  functions  /(f)  (Sec.  6.2).  In  this  section,  we  shall 
consider  operations  of  somewhat  lesser  importance.  They  are  the  differentiation  and 
integration  of  transforms  F(s)  and  corresponding  operations  for  functions  /(f).  We  show 
how  they  are  applied  to  ODEs  with  variable  coefficients. 


Differentiation  of  Transforms 

It  can  be  shown  that,  if  a function /(t)  satisfies  the  conditions  of  the  existence  theorem  in 
Sec.  6.1,  then  the  derivative  F (,v)  = dF/ds  of  the  transform  F(s)  = i£(/)  can  be  obtained 
by  differentiating  F(s)  under  the  integral  sign  with  respect  to  s (proof  in  Ref.  [GenRef4] 
listed  in  App.  1).  Thus,  if 


F(s)  = 


e~stm  dt. 


then 


F\s)  = ~ 


e~sttf(t)  dt. 


Consequently,  if  ,/(/)  = F(s),  then 

(1)  £{tf(t)}  = ~ F\s ),  hence 


iTV'(s)}  = -f/(f) 


where  the  second  formula  is  obtained  by  applying  -F~ 1 on  both  sides  of  the  first  formula. 
In  this  way,  differentiation  of  the  transform  of  a function  corresponds  to  the  multiplication 
of  the  function  by  —t. 

Differentiation  of  Transforms.  Formulas  21-23  in  Sec.  6.9 

We  shall  derive  the  following  three  formulas. 


2(/) 

m 

(2) 

1 

(j2  + /32)2 

— (sin  pt  — pt  cos  pt ) 
2 P3 

s 

1 

(3) 

{s2  + p2)2 

— sin  Bt 

s2 

i 

(4) 

(j2  + p2)2 

— (sin  pt  + pt  cos  pt) 

Solution.  From  (1)  and  formula  8 (with  w = fi)  in  Table  6.1  of  Sec.  6.1  we  obtain  by  differentiation 
(CAUTION!  Chain  rule!) 


££(f  sin  fit) 


2)3  s 

C s 2 + |82)2 
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EXAMPLE  2 


Dividing  by  2/8  and  using  the  linearity  of  f£,  we  obtain  (3). 

Formulas  (2)  and  (4)  are  obtained  as  follows.  From  (1)  and  formula  7 (with  w = p)  in  Table  6.1  we  find 


(5) 


f £(t  cos  pt)  = 


(s'  + p2)  - 2 s2 

C s 2 + /82)2 


s2  ~ /32 
(i2  + /32)2 


From  this  and  formula  8 (with  o>  = /S)  in  Table  6.1  we  have 


££(  t cos  fit  ± — sin  /3 1 J = 


s2  - P2 
(, s 2 + P2)2 


1 

s2  + p2 


On  the  right  we  now  take  the  common  denominator.  Then  we  see  that  for  the  plus  sign  the  numerator  becomes 
22222 

s — (5  + s + jS  — 2s  , so  that  (4)  follows  by  division  by  2.  Similarly,  for  the  minus  sign  the  numerator 
takes  the  form  s2  — (32  — s2  — (32  = —2(32,  and  we  obtain  (2).  This  agrees  with  Example  2 in  Sec.  6.5. 


Integration  of  Transforms 

Similarly,  if /(f)  satisfies  the  conditions  of  the  existence  theorem  in  Sec.  6.1  and  the  limit 
of  /(f)/f,  as  t approaches  0 from  the  right,  exists,  then  for  s > k. 


(6) 


F(s ) ds 


hence 


m 


In  this  way,  integration  of  the  transform  of  a function  f(t)  corresponds  to  the  division  of 
f(t)  by  t. 

We  indicate  how  (6)  is  obtained.  From  the  definition  it  follows  that 


F(s)  ds 


-st. 


lit)  dt 


L "0 


ds. 


and  it  can  be  shown  (see  Ref.  [GenRef4]  in  App.  1)  that  under  the  above  assumptions  we 
may  reverse  the  order  of  integration,  that  is, 


F(s)  ds  = 

r 00 

1 

8 

dt  = 

m 

1 

1*0 

1 

8 

s 

0 

. 

S J 

. 

0 

. 

S J 

dt. 


Integration  of  e st  with  respect  to  .v  gives  e st/(—t).  Here  the  integral  over  s on  the  right 
equals  e~st/t.  Therefore, 


F(s)ds  = 


{ t J 


is  > k). 


Differentiation  and  Integration  of  Transforms 


Find  the  inverse  transform  of  In  1 -I = In 


Solution.  Denote  the  given  transform  by  F(s).  Its  derivative  is 


, d 0 « «,  2s  2s 

F (s)  = -(In  (s2  + u2)  - In  s2)  = - — . 

ds  -l 


S + 0)  S 
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Taking  the  inverse  transform  and  using  (1),  we  obtain 
2_{F'(i)j  = 2 


1 2 2 \ = 2 cos  wt  - 2 = 

( V + CO  s ) 


Hence  the  inverse /(f)  of  F(s)  is /(f)  = 2(1  — cos  a)t)/t.  This  agrees  with  formula  42  in  Sec.  6.9. 
Alternatively,  if  we  let 


2s  2 
G(S)  = -2— -J  - 


then 


g(f)  = i£-1(G)  - 2(cos  ait  - 1). 


From  this  and  (6)  we  get,  in  agreement  with  the  answer  just  obtained. 


g~1|ln'  +2m  | = g-1|  J G(s)  ds  | 


g(0  2 

= —(1  — cos  cot), 

t t 


the  minus  occurring  since  s is  the  lower  limit  of  integration. 
In  a similar  way  we  obtain  formula  43  in  Sec.  6.9, 


££  1 ^ In  ^ 1 2^1  = (1  — cosh  at). 


Special  Linear  ODEs  with  Variable  Coefficients 

Formula  (1)  can  be  used  to  solve  certain  ODEs  with  variable  coefficients.  The  idea  is  this. 
Let  ££(y)  = Y.  Then  !£(y')  = sY  — y(0)  (see  Sec.  6.2).  Hence  by  (1), 

(7)  X(ty')=  -^[sY-ym  = -Y- s^. 

as  ds 

Similarly,  £(y")  = s2Y  - sy(0)  - y'(0)  and  by  (1) 

(8)  £(ty")  = - ~ sy( 0)  - y'm  = -2 sY  - s2  ^ + y(0). 

as  as 

Hence  if  an  ODE  has  coefficients  such  as  at  + b,  the  subsidiary  equation  is  a first-order 
ODE  for  T,  which  is  sometimes  simpler  than  the  given  second-order  ODE.  But  if  the  latter 
has  coefficients  at2  + bt  + c,  then  two  applications  of  (1)  would  give  a second-order 
ODE  for  Y,  and  this  shows  that  the  present  method  works  well  only  for  rather  special 
ODEs  with  variable  coefficients.  An  important  ODE  for  which  the  method  is  advantageous 
is  the  following. 

Laguerre’s  Equation.  Laguerre  Polynomials 

Laguerre’s  ODE  is 


(9) 


ty"  + (1  — t)y'  + ny  = 0. 


We  determine  a solution  of  (9)  with  n = 0,  1,  2,  • ■ ■ . From  (7)-(9)  we  get  the  subsidiary  equation 


7dY 

-2 sY  - s2 — + y( 0) 
ds 


, dY  , 

+ sY  - y(0)  - ( -Y  - s—  \ + nY  = 0. 
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Simplification  gives 


9 dY 

(5  - s2) — + (n  + 1 - s)Y  = 0. 
ds 

Separating  variables,  using  partial  fractions,  integrating  (with  the  constant  of  integration  taken  to  be  zero),  and 
taking  exponentials,  we  get 


(10*) 


dY 

Y 


n + 1 — s 


- ds  = 


n n + 1 
5—1  5 


ds 


and 


Y = 


(*  - if 


We  write  ln  = ££  (7)  and  prove  Rodrigues’s  formula 


(10) 


ez  dn 

/(.=  I,  lJf)  = -—(tne-\ 
n\  dt 


n 


1,  2, • • • . 


These  are  polynomials  because  the  exponential  terms  cancel  if  we  perform  the  indicated  differentiations.  They 
are  called  Laguerre  polynomials  and  are  usually  denoted  by  Ln  (see  Problem  Set  5.7,  but  we  continue  to  reserve 
capital  letters  for  transforms).  We  prove  (10).  By  Table  6.1  and  the  first  shifting  theorem  (5-shifting), 


(s  + 1 )n 


hence  by  (3)  in  Sec.  6.2 


(S  + If 


because  the  derivatives  up  to  the  order  n — 1 are  zero  at  0.  Now  make  another  shift  and  divide  by  «!  to  get 
[see  (10)  and  then  (10*)] 


(s  - l)n 

Zdn)  = — = Y. 

n+i 


P R O B L~E^=S^E-T  ~6^6 


1.  REVIEW  REPORT.  Differentiation  and  Integration 
of  Functions  and  Transforms.  Make  a draft  of  these 
four  operations  from  memory.  Then  compare  your  draft 
with  the  text  and  write  a 2-  to  3-page  report  on  these 
operations  and  their  significance  in  applications. 


2-11 


TRANSFORMS  BY  DIFFERENTIATION 


Showing  the  details  of  your  work,  find  ££(/)  if /(f)  equals: 


2.  3 f sinh  4f 

i , — 3t 

3.  zte 

4.  fe-t  cos  t 


5.  f cos  cot 

6.  f2  sin  3f 

7.  t 2 cosh  2t 

8.  te~kt  sin  t 

9.  gf2  sin  7 Tt 

10.  tnekt 

11.  At  COS  g7Tf 

12.  CAS  PROJECT.  Laguerre  Polynomials,  (a)  Write  a 
CAS  program  for  finding  ln(t)  in  explicit  form  from  (10). 
Apply  it  to  calculate  10,  ■ ■ ■ , /10.  Verify  that  l0,  ■ ■ ■ , l10 
satisfy  Laguerre’ s differential  equation  (9). 


(b)  Show  that 


ln(t)  = 2 


(-If 


and  calculate  Iq,-",  ho  from  this  formula. 

(c)  Calculate  l0,---,ho  recursively  from  /0  = 1,  h = 
1 - t by 


( tl  l)(n+l  (2fl  ~\~  1 f)/n  tlln— 1- 


(d)  A generating  function  (definition  in  Problem  Set 
5.2)  for  the  Laguerre  polynomials  is 


n= 0 

Obtain  l0,  ■ ■ ■ , ho  from  the  corresponding  partial  sum 
of  this  power  series  in  .v  and  compare  the  / n with  those 
in  (a),  (b),  or  (c). 

13.  CAS  EXPERIMENT.  Laguerre  Polynomials.  Ex- 
periment with  the  graphs  of  l0,  ■ ■ ■ , /10,  finding  out 
empirically  how  the  first  maximum,  first  minimum,  ■ • ■ 
is  moving  with  respect  to  its  location  as  a function  of 
n.  Write  a short  report  on  this. 


242 


CHAP.  6 Laplace  Transforms 


14-20 


INVERSE  TRANSFORMS 


Using  differentiation,  integration,  s-shifting,  or  convolution, 
and  showing  the  details,  find /(f)  if  i£(/)  equals: 

5 


14. 


(s2  + 16)2 


15. 


(.t2  - 9)2 


16. 


2s  + 6 


(, s 2 + 6s  + 10)2 


17.  In  - 


19.  In 


s — 1 

■t2  + 1 

(s  - l)2 


18.  arccot  — 
7 T 


20. 


In 


s + a 
s + b 


6 ./  Systems  of  ODEs 

The  Laplace  transform  method  may  also  be  used  for  solving  systems  of  ODEs,  as  we  shall 
explain  in  terms  of  typical  applications.  We  consider  a first-order  linear  system  with 
constant  coefficients  (as  discussed  in  Sec.  4.1) 

>’l  = flllTl  + A12T2  + gi(t) 

(!) 

T2  = «21.Vl  + G22.V2  + ^CO- 

Writing  Y1  = Y2  = ££(y2),  Gi  = ££(gi),  G2  = ££( g2),  we  obtain  from  (1)  in  Sec.  6.2 

the  subsidiary  system 


■sJ!  - yi(0)  = On? [ + a12Y2  + G^s) 
sY2  - y2(0)  = a2i?i  + fl22?2  + G2(s). 

By  collecting  the  Yr  and  Kj-terms  we  have 


(2) 


(on  - s)?i  + a12Y2  = — yi(0)  - Gi(.s) 

fl21?l  + (fl22  — 5)?2  = — y2(0)  — G2(S). 


By  solving  this  system  algebraically  for  and  taking  the  inverse  transform  we 

obtain  the  solution  yq  = iE~1(Y]),  y2  = (£~l(Y2)  of  the  given  system  (1). 

Note  that  (1)  and  (2)  may  be  written  in  vector  form  (and  similarly  for  the  systems  in 
the  examples);  thus,  setting  y = [yx  y2]T,  A = [ajk\,  g = [gi  g2]T,  Y = [Tj  Y2]T, 
G = [Gi  G2]t  we  have 

y'  = Ay  + g and  (A  - sI)Y  = -y(0)  - G. 


EXAMPLE  1 Mixing  Problem  Involving  Two  Tanks 

Tank  7i  in  Fig.  144  initially  contains  100  gal  of  pure  water.  Tank  T2  initially  contains  100  gal  of  water  in  which 
150  lb  of  salt  are  dissolved.  The  inflow  into  7i  is  2 gal/min  from  T2  and  6 gal/min  containing  6 lb  of  salt  from 
the  outside.  The  inflow  into  T2  is  8 gal/min  from  7i.  The  outflow  from  T2  is  2 + 6 = 8 gal/min,  as  shown  in 
the  figure.  The  mixtures  are  kept  uniform  by  stirring.  Find  and  plot  the  salt  contents  yi(t)  and  in  7i  and 
7*2,  respectively. 
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EXAMPLE  2 


Solution.  The  model  is  obtained  in  the  form  of  two  equations 

Time  rate  of  change  = Inflow/ min  — Outflow/ min 
for  the  two  tanks  (see  Sec.  4.1).  Thus, 

f 8 2 / 8 8 

Yl  = — TOUJl  + 100^2  + 6.  y2  = loo}7!  — loo)^- 
The  initial  conditions  are  yi(0)  = 0,  ^(O)  = 150.  From  this  we  see  that  the  subsidiary  system  (2)  is 

6 

(-0.08  - s)Y1  + 0.02%  = -- 

0.08%  + (-0.08  - s)Y2  = -150. 

We  solve  this  algebraically  for  % and  Y2  by  elimination  (or  by  Cramer’s  rule  in  Sec.  7.7),  and  we  write  the 
solutions  in  terms  of  partial  fractions, 


li  = 

9s  + 0.48 

100  62.5 

37.5 

s(s  + 0.12)(s  + 0.04) 

s s + 0.12 

s + 0.04 

Y2  = 

150s2  + 12s  + 0.48 

100  , 125 

75 

s(s  + 0.12)(s  + 0.04) 

s s + 0.12 

s + 0.04' 

By  taking  the  inverse  transform  we  arrive  at  the  solution 

yi  = 100  - 62.5<T012t  - 37.5e~004t 
y2  = 100  + 125e~012t  - 75e_004t. 

Figure  144  shows  the  interesting  plot  of  these  functions.  Can  you  give  physical  explanations  for  their  main 
features?  Why  do  they  have  the  limit  100?  Why  is  y2  not  monotone,  whereas  yi  is?  Why  is  vy  from  some  time 
on  suddenly  larger  than  t’2?  Etc. 


6 gal/min 


y(t) 
150  - 

100 

50 


. Salt  content  in  77 


- Salt  content  in  T, 


50  100  150  200 


Fig.  144.  Mixing  problem  in  Example  1 


Other  systems  of  ODEs  of  practical  importance  can  be  solved  by  the  Laplace  transform 
method  in  a similar  way,  and  eigenvalues  and  eigenvectors,  as  we  had  to  determine  them 
in  Chap.  4,  will  come  out  automatically,  as  we  have  seen  in  Example  1. 

Electrical  Network 

Find  the  currents  ijff)  and  i2(t ) in  the  network  in  Fig.  145  with  L and  R measured  in  terms  of  the  usual  units 
(see  Sec.  2.9),  v(t)  = 100  volts  i f 0 = t = 0.5  sec  and  0 thereafter,  and  7(0)  = 0,  i'(0)  = 0. 

Solution.  The  model  of  the  network  is  obtained  from  Kirchhoff’s  Voltage  Law  as  in  Sec.  2.9.  For  the  lower 
circuit  we  obtain 


0.8;  i + l(;'i  - i 2)  + 1.4;i  = 100[1  - u(t  - |)] 
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l2  = l h 


Fig.  145.  Electrical  network  in  Example  2 


and  for  the  upper 

1 ■ 12  + 1(*2  - *i)  = 0. 

Division  by  0.8  and  ordering  gives  for  the  lower  circuit 

i i + 3ii  - 1.25/2  = 125[1  - M(r  - I)] 

and  for  the  upper 

*2  “ *1  + *2  = 0- 

With  f i(0)  = 0,  ^(O)  = 0 we  obtain  from  (1)  in  Sec.  6.2  and  the  second  shifting  theorem  the  subsidiary 
system 

(\  e~s'2 

(*  + 3 )h~  1.25 12=  125  [ — — — 

-h  + Cs  + 1 )h  = 0. 

Solving  algebraically  for  A and  /2  gives 

125(v  + 1) 


h = ■ 


h = 


+ l)(i  + z) 

125 


(1  - <Ts/2), 


(1  - <Ts/2). 


s(s  + !)(i  + |) 

The  right  sides,  without  the  factor  1 — e~s'2,  have  the  partial  fraction  expansions 

500  _ 125  _ 625 

7s  3 (s  + |)  21  (s  + 1) 

and 

500  250  250 

— + , 

ls  3(5  + |)  21(5  + ?) 

respectively.  The  inverse  transform  of  this  gives  the  solution  for  0 ^ t ^ 

✓A  125  -t/2  625  —7 1/2  , 500 

■ /A  _ 250  —t/2  , 250  — 7t/2  , 500 

W)---3~e  ' + 2T^  ' + -7" 


(0  g r £ |). 
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EXAMPLE  3 


According  to  the  second  shifting  theorem  the  solution  for  t > 2 is  ;'iO)  - ii(t  ~ |)  and  i2(f)  - i2(f  ~ 
«i(f)  = -¥(1  - e1/4)<Tf/2  - fP(l  - 

i2(t)  = -Sf8(l  - eV4)e-‘/2  + fP(l  - c7/4)e-7‘/2 


g),  that  is, 
(t  > 2). 


Can  you  explain  physically  why  both  currents  eventually  go  to  zero,  and  why  i\(t)  has  a sharp  cusp  whereas 
12(f)  has  a continuous  tangent  direction  at  t = 


Systems  of  ODEs  of  higher  order  can  be  solved  by  the  Laplace  transform  method  in  a 
similar  fashion.  As  an  important  application,  typical  of  many  similar  mechanical  systems, 
we  consider  coupled  vibrating  masses  on  springs. 


Fig.  146.  Example  3 


Model  of  Two  Masses  on  Springs  (Fig.  146) 

The  mechanical  system  in  Fig.  146  consists  of  two  bodies  of  mass  1 on  three  springs  of  the  same  spring  constant 
k and  of  negligibly  small  masses  of  the  springs.  Also  damping  is  assumed  to  be  practically  zero.  Then  the  model 
of  the  physical  system  is  the  system  of  ODEs 


(3) 


y"  = ~kyi  + k(y2  - yi) 
yf2  = ~k(y2  ~ yi)  ~ ky2. 


Here  yi  and  y2  are  the  displacements  of  the  bodies  from  their  positions  of  static  equilibrium.  These  ODEs 
follow  from  Newton’s  second  law,  Mass  X Acceleration  = Force,  as  in  Sec.  2.4  for  a single  body.  We  again 
regard  downward  forces  as  positive  and  upward  as  negative.  On  the  upper  body,  —kyi  is  the  force  of  the 
upper  spring  and  k(y2  — yi)  that  of  the  middle  spring,  y2  — yi  being  the  net  change  in  spring  length — think 
this  over  before  going  on.  On  the  lower  body,  ~k(y2  — yi)  is  the  force  of  the  middle  spring  and  — Ay2  that 
of  the  lower  spring. 

We  shall  determine  the  solution  corresponding  to  the  initial  conditions  yi(0)  = 1,  y2(0)  = 1,  yi(0)  = \/3 £, 
yi(0)  = ~\^3k.  Let  Yi  = ££(yi)  and  Y2  = !£(y2).  Then  from  (2)  in  Sec.  6.2  and  the  initial  conditions  we  obtain 
the  subsidiary  system 

s2Fi  - s - VVc  = -kYi  + k(Y2  ~ Fi) 
s2Y2  - s + V3 k = —k(Y2  - Fj)  - kY2. 


This  system  of  linear  algebraic  equations  in  the  unknowns  Y\  and  Y2  may  be  written 

U2  + 2k)Y,  - kY2  = s + Vvc 

—kyi  + ( s 2 + 2 k)Y2  = s - VVc. 
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Elimination  (or  Cramer’s  rule  in  Sec.  7.7)  yields  the  solution,  which  we  can  expand  in  terms  of  partial  fractions. 


(s  + Vvc)(s2  + 2k)  + k(s  - V3k) 

s Vu 

(sz  + 21) 2 - k2 

+ k + 3k 

( s 2 + 2 k)(s  - V3k)  + k(s  + VVc) 

s V3k 

02  + 2k) 2 - k2 

s2  + k s2  + 3k 

Hence  the  solution  of  our  initial  value  problem  is  (Fig.  147) 

yi(r)  = ) = cos  y/kt  + sin  V3 let 

y2(0  = ^f_1(^2)  — cos  y/kt  — sin  y/3kt. 

We  see  that  the  motion  of  each  mass  is  harmonic  (the  system  is  undamped!),  being  the  superposition  of  a “slow” 
oscillation  and  a “rapid”  oscillation. 


1.  TEAM  PROJECT.  Comparison  of  Methods  for 
Linear  Systems  of  ODEs 

(a)  Models.  Solve  the  models  in  Examples  1 and  2 of 
Sec.  4. 1 by  Laplace  transforms  and  compare  the  amount 
of  work  with  that  in  Sec.  4.1.  Show  the  details  of  your 
work. 

(b)  Homogeneous  Systems.  Solve  the  systems  (8), 
( 1 1) — ( 13)  in  Sec.  4.3  by  Laplace  transforms.  Show  the 
details. 

(c)  Nonhomogeneous  System.  Solve  the  system  (3)  in 
Sec.  4.6  by  Laplace  transforms.  Show  the  details. 

SYSTEMS  OF  ODES 

the  Laplace  transform  and  showing  the  details  of 
your  work,  solve  the  I VP: 

2.  y[  + V2  = 0,  yi  + y2  = 2 cos  t, 

yi(0)  = L y2(0)  = 0 
3-  y'l  = -yi  + 4y2,  y2  = 3yi  - 2y2, 
vr(0)  = 3,  y2(0)  = 4 

4.  y[  = 4y2  “ 8 cos  At,  y2  = — 3vi  — 9 sin  4t, 
vi(0)  = 0,  y2(0)  = 3 


5.  W = y2  + 1 - u(t  - 1),  v2  = -yi  + 1 - u{t  - 1), 
yi(0)  = 0,  y2(0)  = 0 

6-  yi  = 5yi  + y2,  y2  = yi  + 5y2, 
yi(0)  = 1,  y2(0)  = -3 

7.  yi  = 2yi  - 4v2  + u(t  - 1 )et, 

>'2  = >’t  “ 3y2  + u(t  - l)e‘,  yxCO)  = 3,  y2(0)  = 0 

8.  yi  = — 2yi  + 3y2,  y2  = 4y1  - y2, 
yi(0)  = 4,  v2(0)  = 3 

9.  vi  = 4yi  + y2,  y2  = -yi  + 2y2,  yi(0)  = 3, 

>2(0)  = 1 

10.  vi  = — y2,  v2  = -yi  + 2[1  - u(t  - 2n r)]  cos  t, 
yi(0)  = l,  y2(0)  = o 

11.  y"  = yi  + 3v2,  y2  = 4yi  - Ael, 

yi(0)  = 2,  yi(0)  = 3,  y2(0)  = 1.  y^(0)  = 2 

12.  y"  = — 2y!  + 2y2,  y2  = 2yy  - 5v2, 

yi(0)  = 1,  yx(0)  = 0,  y2(0)  = 3,  y2(0)  = 0 

13.  yi'  + y2  = —101  sin  lOf,  y2  + yi  = 101  sin  lOt, 

yi(0)  = 0,  yi(0)  = 6,  y2(0)  = 8,  y^(0)  = -6 


SEC.  6.7  Systems  of  ODEs 


247 


14.  4v{  + y2  ~ 2y3  — 0,  — 2v{  + y3  — 1, 

2y2  - 4y3  = - 1 6/ 

Vi(0)  = 2,  >’2(0)  = 0,  y3(0)  = 0 

15.  v{  + v2  = 2 sinh  t,  y2  + y2  = e*, 

vs  + y{  = 2c‘  + e“*,  yi(0)  = 1,  y2(0)  = 1, 
v3(0)  = 0 

FURTHER  APPLICATIONS 

16.  Forced  vibrations  of  two  masses.  Solve  the  model  in 
Example  3 with  k — 4 and  initial  conditions  yi(0)  = 1, 
yi(0)  = E y2(0)  = 1,  y2  = — 1 under  the  assumption 
that  the  force  1 1 sin  t is  acting  on  the  first  body  and  the 
force  — 1 1 sin  t on  the  second.  Graph  the  two  curves 
on  common  axes  and  explain  the  motion  physically. 

17.  CAS  Experiment.  Effect  of  Initial  Conditions.  In 
Prob.  16,  vary  the  initial  conditions  systematically, 
describe  and  explain  the  graphs  physically.  The  great 
variety  of  curves  will  surprise  you.  Are  they  always 
periodic?  Can  you  find  empirical  laws  for  the  changes 
in  terms  of  continuous  changes  of  those  conditions? 

18.  Mixing  problem.  What  will  happen  in  Example  1 if 
you  double  all  flows  (in  particular,  an  increase  to 
12  gal/min  containing  12  lb  of  salt  from  the  outside), 
leaving  the  size  of  the  tanks  and  the  initial  conditions 
as  before?  First  guess,  then  calculate.  Can  you  relate 
the  new  solution  to  the  old  one? 

19.  Electrical  network.  Using  Laplace  transforms, 
find  the  currents  i\(t)  and  i2(t)  in  Fig.  148,  where 
v{t)  = 390  cos  t and  i'i(0)  = 0,  /2(0)  = 0.  How  soon 


will  the  currents  practically  reach  their  steady  state? 


4n  8fi 


Fig.  148.  Electrical  network  and 
currents  in  Problem  19 


20.  Single  cosine  wave.  Solve  Prob.  19  when  the  EMF 
(electromotive  force)  is  acting  from  0 to  277  only.  Can 
you  do  this  just  by  looking  at  Prob.  19,  practically 
without  calculation? 
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6.8  Laplace  Transform:  General  Formulas 


Formula 

Name,  Comments 

Sec. 

F(s)  = ££{/«}  = 
fit ) = i£_1{ 

r 00 

e-st/(0  dt 

0 

F(J)} 

Definition  of  Transform 
Inverse  Transform 

6.1 

i£{af(t)  + bg(t) } = o2{/(f)}  + bX{g(J)} 

Linearity 

6.1 

■£  { eatf(t) ) = F(s  - fl) 
- «)}  = 

i-Shifting 

(First  Shifting  Theorem) 

6.1 

7(/')  = 
7(/")  = 
•2(/(K))  = 

*{ 

’2(/)  -/(0) 
s22(/)  - sf( 0)  -/(0) 

.s-”2(/)  - s(n~um)  - 

...  -/»->) 

/(T)rfrj  = 7-2(/) 

Jq  ^ 

Differentiation  of  Function 
Integration  of  Function 

6.2 

•t 

( f*g)(t)  = /(T)g(f  - T)  dT 

Jo 

rt 

= fit  ~ r)g(r)  dr 
7(/*g)  = 2(/)2(g) 

Convolution 

6.5 

2{/(t  — a)z<(r  — «)}  = e-asF(s) 
2 1{e~asF(i)}  = f(t  — a)u(t  — a) 

f-Shifting 

(Second  Shifting  Theorem) 

6.3 

2{f/(0J  = -f'W 

7«i  r 

—j  = 2(s)r/s 

s 

Differentiation  of  Transform 
Integration  of  Transform 

6.6 

2(/)  = - 

1 - e-psi 

/-/> 

e-st/(0  rff 

0 

/ Periodic  with  Period  p 

6.4 

Project 

16 

SEC.  6.9  Table  of  Laplace  Transforms 
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6 Table  of  Laplace  Transforms 

For  more  extensive  tables,  see  Ref.  [A9]  in  Appendix  1. 


1 

2 

3 

4 

5 

6 


9 

10 


11 

12 


13 

14 

15 

16 

17 

18 


19 

20 


F(s)  = i£{f(t)} 

m 

l/s 

1 

l/s2 

t 

\/sn  (n  = 1,  2,  ■ ■ • ) 

tn~1/(n  - 1)! 

1/Vs 

1/VttT 

l/s3/2 

iVt/lT 

l/sa  (a  > 0) 

TO1/  r(«) 

1 


s — a 
1 

(, s — a)2 

1 

0 - of 
1 


(s  — a)1 


k 


(n  = 1,2,--) 

(*>  0) 


1 


(s  — a)(s  — b) 
s 

(s  — a)(s  — b) 


{a  + b) 
(■ a + b) 


1 


2 , 2 
S ~r  CO 


2 , 2 
S + (O 


2 2 
s —a 


2 2 
5 — a 


(s  — a)2  + co2 


(s  — a)2  + co2 


s(j2  + co2) 


2/  2 i 2x 

J (S  + CO  ) 


fe 


1 

(n  - 1)! 


^n— l^at 


i 

TO) 


,k— 1 at 

- 1 e 


a — b 


(eat  - ebt) 


1 (aeat  - bebt ) 


a — b 


— sin  cot 

CO 


cos  cot 


— sinh  at 
a 


cosh  at 


—eat  sinh  cot 
co 


eat  cos  cot 


(1  — cos  cot) 


co 

1 


(cot  — sin  cot) 


Sec. 


6.1 


6.1 


6.1 


6.2 


( continued ) 
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Table  of  Laplace  Transforms  ( continued ) 


f(s)  = nm) 


m 


Sec. 


21 

22 

23 

24 


/ 2 I 2\2 

(S  + CO  ) 


(s2  + w2)2 


(, s 2 + CO2) 2 


(s2  + a2)(s  2 + b2) 


(sin  cot  — cot  cos  cot) 

2co3 


2 co 
2 co 


sin  cot 


(sin  cot  + cot  cos  cot) 


(a2  A b2) 


, 2 2 
b — a 


(cos  at  — cos  bt) 


6.6 


25 

26 

27 

28 


s4  + 4A4 


s4  + 4 A:4 


s4- A4 


,v4  - A'4 


4AJ 


2A2 


(sin  Af  cos  Af  — cos  kt  sinh  kt) 


- sin  kt  sinh  kt 


(sinh  kt  — sin  kt) 

2A3 

(cosh  kt  — cos  kt) 

2k2 


29 

30 

31 


Vs  — a — Vs  ~ b 
1 


Vs  + a Vs  + A 
1 


VsM^ 


(ebt  - eat) 


„-(a  + b)t/2/o(^_lf 


i0(ar) 


7 5.5 


7 5.4 


32 

33 


(s  - «)3/2 

1 

/ 2 2\/c 

(5  —a  ) 


(A  > 0) 


eat(l  + 2at) 


Vtt t 
Vtt  ( t 
T\k)  \2a)  /fc-1/2(fl° 


fc-l/2 


7 5.5 


34 

35 


e~as/s 


u(t  — a) 
8(t  — a) 


6.3 

6.4 


36 

37 

38 

39 


-e~k/s 

s 


Vs 


,3/2 


-k/s 


pk/s 


—k^/s 


(k>  0) 


70(2VAr) 

1 


Vtt t 

1 

Wk 

A 


cos  2 VAf 
sinh  2 VAf 

_-k2/4, 


7 5.4 


( continued ) 


Chapter  6 Review  Questions  and  Problems 


251 


Table  of  Laplace  Transforms  ( continued ) 


F(s)  = <£{f(t)) 

m 

Sec. 

40 

— In  s 
s 

-in  t - y (y  * 0.5772) 

y 5.5 

41 

, s — a 

In 

s — b 

-( ebt  - eat) 
t 

42 

2 . 2 
, S i (O 

In 

s1 2 

2 

— ( 1 — cos  cot) 

6.6 

43 

2 2 
in" 

2 

— ( 1 — cosh  at) 

s2 

44 

CO 

arctan  — 
s 

1 . 

— sin  cot 
t 

45 

i 

— arccot  s 
s 

Si  (0 

App. 

A3.1 

mp^ElFCTffl^QlJE^  T I O N S AND  PROBLEMS 


1.  State  the  Laplace  transforms  of  a few  simple  functions 
from  memory. 

2.  What  are  the  steps  of  solving  an  ODE  by  the  Laplace 
transform? 


3.  In  what  cases  of  solving  ODEs  is  the  present  method 
preferable  to  that  in  Chap.  2? 

4.  What  property  of  the  Laplace  transform  is  crucial  in 
solving  ODEs? 

5.  Is  2{f(t)  + g(f)}  = ££{/«}  + 2{«(f)}? 

<£{f(t)g(t)}  = nm}£{g(t)V.  Explain. 

6.  When  and  how  do  you  use  the  unit  step  function  and 
Dirac’s  delta? 

7.  If  you  know  /(f)  = i£-1{F(s)},  how  would  you  find 
,2_1{F(s)/s2}? 

8.  Explain  the  use  of  the  two  shifting  theorems  from  memory. 

9.  Can  a discontinuous  function  have  a Laplace  transform? 
Give  reason. 


10.  If  two  different  continuous  functions  have  transforms, 
the  latter  are  different.  Why  is  this  practically  important? 


11-19 


LAPLACE  TRANSFORMS 


Find  the  transform,  indicating  the  method  used  and  showing 
the  details. 

11.  5 cosh  2f  — 3 sinh  t 12.  e-t(cos  At  — 2 sin  At) 
13.  sin2(g7r0  14.  16 t2u(t  — \) 


15.  et^2u(t  — 3)  16.  u(t  — 2tt ) sin  t 

17.  t cos  t + sin  t 18.  (sin  cot)  * (cos  cot) 

19.  12f  * e~3t 


20-28  INVERSE  LAPLACE  TRANSFORM 

Find  the  inverse  transform,  indicating  the  method  used  and 
showing  the  details: 


20. 


7.5 


- 2s  - 


22. 


16 

i2  + S + \ 


24. 


s2  - 6.25 
(s2  + 6.25  f 


26.  — — — ^ e~5s 


28. 


3s 

sz  — 2s  + 2 


->1  5 + 1 - 

21.  — - — e 


23. 


co  cos  6 + s sin  6 


25. 


27. 


6 (s  + 1) 


s 

3s  + 4 
: + 4s  + 5 


29-37 


ODEs  AND  SYSTEMS 


Solve  by  the  Laplace  transform,  showing  the  details  and 
graphing  the  solution: 

29.  y"  + Ay'  + 5y  = 50f,  y(0)  = 5,  y'(0)  = -5 

30.  y"  + I6>>  = 4S(f  - it),  y(0)  = -1,  y'(0)  = 0 
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31.  y"  — y — 2y  = 12 u(t  — tt)  sin  ?,  y(0)  = 1, 
y'(0)  = - i 

32.  y"  + 4y  = S(t  - tt)  - 8(t  - 2ir),  y(0)  = 1, 
y'(0)  = 0 

33.  y"  + 3y'  + 2y  = 2 u(t  - 2),  y(0)  = 0,  y'(0)  = 0 

34.  y[  = y2,  y2  = -4vi  + S(f  - 7T),  yi(0)  = 0, 
y2(0)  = 0 

35.  y{  = 2_v,  - 4y2,  y2  = yi  - 3y2,  yi(0)  = 3, 
y2(0)  = 0 


36.  y{  = 2yi  + 4y2,  y2  = yi  + ly2,  yi(0)  = -4, 
y2(0)  = -4 

37.  y[  = y2  + u(t  - it),  y2  = -yi  + u{t  - 2i r), 
yi(0)  = l,  y2(0)  = 0 


38-45 


MASS-SPRING  SYSTEMS,  CIRCUITS, 
NETWORKS 


Model  and  solve  by  the  Laplace  transform: 

38.  Show  that  the  model  of  the  mechanical  system  in 
Fig.  149  (no  friction,  no  damping)  is 


m iyi  = ~k iyi  + k2(y2  - yi) 
m2y2  = ~k2(y2  - y,)  - k3y2). 


0 

I 


UTJ 


=WMr 


Fig.  149,  System  in  Problems  38  and  39 


39.  InProb.  SS.letwt!  = m2  — 10  kg,^  = k3  = 20  kg/sec2, 
k2  = 40kg/sec2.  Find  the  solution  satisfying  the  ini- 
tial conditions  yi(0)  = y2(0)  = 0,  y{(0)  = 1 meter/ sec, 
y2(0)  = — 1 meter/sec. 

40.  Find  the  model  (the  system  of  ODEs)  in  Prob.  38 
extended  by  adding  another  mass  m3  and  another  spring 
of  modulus  k4  in  series. 

41.  Find  the  current  i(t)  in  the  UC-circuit  in  Fig.  150, 
where  R = 10  11,  C = 0.1  F,  v(t)  = 10?  VifO  < ? < 4, 
v{t)  = 40  V if  t > 4,  and  the  initial  charge  on  the 
capacitor  is  0. 


v(t) 

Fig.  150.  RC-circuit 


42.  Find  and  graph  the  charge  q(t)  and  the  current  ?'(?)  in 
the  LC-circuit  in  Fig.  151,  assuming  L = 1 H,  C = 1 F, 
v{t)  = 1 — e~ 4 if  0 < t < tt,  v(t)  = 0 if  t > tt,  and 
zero  initial  current  and  charge. 

43.  Find  the  current ;(?)  in  the  RLC-circuit  in  Fig.  152,  where 
R = 160  11,  L = 20  H,  C = 0.002  F,  v{t)  = 37  sin  10?  V, 
and  current  and  charge  at  ? = 0 are  zero. 

C 


v(t)  v(t) 

Fig.  151.  LC-circuit  Fig.  152.  RLC-circuit 


44.  Show  that,  by  Kirchhoff  s Voltage  Law  (Sec.  2.9),  the 
currents  in  the  network  in  Fig.  153  are  obtained  from 
the  system 

Li[  + R(ii  - i2)  = v(t) 

R(‘2  ~ h ) + ^ <2  = 0. 

Solve  this  system,  assuming  that  R = 1012,  L = 20  H, 
C = 0.05  F,  v = 20  V,  ?i(0)  = 0,  ?2(0)  = 2 A. 


L 


Fig.  153.  Network  in  Problem  44 


45.  Set  up  the  model  of  the  network  in  Fig.  154  and  find 
the  solution,  assuming  that  all  charges  and  currents  are 
0 when  the  switch  is  closed  at  ? = 0.  Find  the  limits  of 
i jX?)  and  i2(?)  as  ? — » ®,  (i)  from  the  solution,  (ii)  directly 
from  the  given  network. 


L = 5 H 


y = 60V 


Switch 

Fig.  154.  Network  in  Problem  45 


Summary  of  Chapter  6 
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SUMMARY  OF  CHAPTER  6 

Laplace  Transforms 


The  main  purpose  of  Laplace  transforms  is  the  solution  of  differential  equations  and 
systems  of  such  equations,  as  well  as  corresponding  initial  value  problems.  The 
Laplace  transform  F(s)  = 2(f)  of  a function /(f)  is  defined  by 


(1) 


m = 2(f) 


(Sec.  6.1). 


This  definition  is  motivated  by  the  property  that  the  differentiation  of/ with  respect 
to  t corresponds  to  the  multiplication  of  the  transform  F by  s\  more  precisely, 


(2) 


2(f')  = s2(f)  -f(0) 

2(f")  = sz2(f)  - sf( 0)  -/'( 0) 


(Sec.  6.2) 


etc.  Hence  by  taking  the  transform  of  a given  differential  equation 

(3)  y"  + ay'  + by  = r(t)  (a,  b constant) 

and  writing  2(y)  = Y(s),  we  obtain  the  subsidiary  equation 

(4)  (s2  + as  + b)Y  = 2(r)  + sf(  0)  + /'( 0)  + af(  0). 

Here,  in  obtaining  the  transform  2(r)  we  can  get  help  from  the  small  table  in  Sec.  6. 1 
or  the  larger  table  in  Sec.  6.9.  This  is  the  first  step.  In  the  second  step  we  solve  the 
subsidiary  equation  algebraically  for  Y(s).  In  the  third  step  we  determine  the  inverse 
transform  y(f)  = 2~I(Y),  that  is,  the  solution  of  the  problem.  This  is  generally 
the  hardest  step,  and  in  it  we  may  again  use  one  of  those  two  tables.  Y(s)  will  often 
be  a rational  function,  so  that  we  can  obtain  the  inverse  2~1(Yn)  by  partial  fraction 
reduction  (Sec.  6.4)  if  we  see  no  simpler  way. 

The  Laplace  method  avoids  the  determination  of  a general  solution  of  the 
homogeneous  ODE,  and  we  also  need  not  determine  values  of  arbitrary  constants 
in  a general  solution  from  initial  conditions;  instead,  we  can  insert  the  latter  directly 
into  (4).  Two  further  facts  account  for  the  practical  importance  of  the  Laplace 
transform.  First,  it  has  some  basic  properties  and  resulting  techniques  that  simplify 
the  determination  of  transforms  and  inverses.  The  most  important  of  these  properties 
are  listed  in  Sec.  6.8,  together  with  references  to  the  corresponding  sections.  More 
on  the  use  of  unit  step  functions  and  Dirac’s  delta  can  be  found  in  Secs.  6.3  and 
6.4,  and  more  on  convolution  in  Sec.  6.5.  Second,  due  to  these  properties,  the  present 
method  is  particularly  suitable  for  handling  right  sides  r(t)  given  by  different 
expressions  over  different  intervals  of  time,  for  instance,  when  r(t ) is  a square  wave 
or  an  impulse  or  of  a form  such  as  r(t)  = cos  t if  0 i f i 477  and  0 elsewhere. 

The  application  of  the  Laplace  transform  to  systems  of  ODEs  is  shown  in  Sec.  6.7. 
(The  application  to  PDEs  follows  in  Sec.  12.12.) 


PART 


B 


Linear  Algebra. 
Vector  Calculus 


CHAPTER  7 
CHAPTER  8 
CHAPTER  9 
CHAPTER  10 


Linear  Algebra:  Matrices,  Vectors,  Determinants.  Linear  Systems 
Linear  Algebra:  Matrix  Eigenvalue  Problems 
Vector  Differential  Calculus.  Grad,  Div,  Curl 
Vector  Integral  Calculus.  Integral  Theorems 


Matrices  and  vectors,  which  underlie  linear  algebra  (Chaps.  7 and  8),  allow  us  to  represent 
numbers  or  functions  in  an  ordered  and  compact  form.  Matrices  can  hold  enormous  amounts 
of  data — think  of  a network  of  millions  of  computer  connections  or  cell  phone  connections — 
in  a form  that  can  be  rapidly  processed  by  computers.  The  main  topic  of  Chap.  7 is  how 
to  solve  systems  of  linear  equations  using  matrices.  Concepts  of  rank,  basis,  linear 
transformations,  and  vector  spaces  are  closely  related.  Chapter  8 deals  with  eigenvalue 
problems.  Linear  algebra  is  an  active  field  that  has  many  applications  in  engineering 
physics,  numerics  (see  Chaps.  20-22),  economics,  and  others. 


Chapters  9 and  10  extend  calculus  to  vector  calculus.  We  start  with  vectors  from  linear 
algebra  and  develop  vector  differential  calculus.  We  differentiate  functions  of  several 
variables  and  discuss  vector  differential  operations  such  as  grad,  div,  and  curl.  Chapter  10 
extends  regular  integration  to  integration  over  curves,  surfaces,  and  solids,  thereby 
obtaining  new  types  of  integrals.  Ingenious  theorems  by  Gauss,  Green,  and  Stokes  allow 
us  to  transform  these  integrals  into  one  another. 


Software  suitable  for  linear  algebra  (Lapack,  Maple,  Mathematica,  Matlab)  can  be  found 
in  the  list  at  the  opening  of  Part  E of  the  book  if  needed. 


Numeric  linear  algebra  (Chap.  20)  can  be  studied  directly  after  Chap.  7 or  8 because 
Chap.  20  is  independent  of  the  other  chapters  in  Part  E on  numerics. 
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CHAPTER  7 


Linear  Algebra:  Matrices, 
Vectors,  Determinants. 
Linear  Systems 


Linear  algebra  is  a fairly  extensive  subject  that  covers  vectors  and  matrices,  determinants, 
systems  of  linear  equations,  vector  spaces  and  linear  transformations,  eigenvalue  problems, 
and  other  topics.  As  an  area  of  study  it  has  a broad  appeal  in  that  it  has  many  applications 
in  engineering,  physics,  geometry,  computer  science,  economics,  and  other  areas.  It  also 
contributes  to  a deeper  understanding  of  mathematics  itself. 

Matrices,  which  are  rectangular  arrays  of  numbers  or  functions,  and  vectors  are  the 
main  tools  of  linear  algebra.  Matrices  are  important  because  they  let  us  express  large 
amounts  of  data  and  functions  in  an  organized  and  concise  form.  Furthermore,  since 
matrices  are  single  objects,  we  denote  them  by  single  letters  and  calculate  with  them 
directly.  All  these  features  have  made  matrices  and  vectors  very  popular  for  expressing 
scientific  and  mathematical  ideas. 

The  chapter  keeps  a good  mix  between  applications  (electric  networks,  Markov 
processes,  traffic  flow,  etc.)  and  theory.  Chapter  7 is  structured  as  follows:  Sections  7.1 
and  7.2  provide  an  intuitive  introduction  to  matrices  and  vectors  and  their  operations, 
including  matrix  multiplication.  The  next  block  of  sections,  that  is,  Secs.  7. 3-7. 5 provide 
the  most  important  method  for  solving  systems  of  linear  equations  by  the  Gauss 
elimination  method.  This  method  is  a cornerstone  of  linear  algebra,  and  the  method 
itself  and  variants  of  it  appear  in  different  areas  of  mathematics  and  in  many  applications. 
It  leads  to  a consideration  of  the  behavior  of  solutions  and  concepts  such  as  rank  of  a 
matrix,  linear  independence,  and  bases.  We  shift  to  determinants,  a topic  that  has 
declined  in  importance,  in  Secs.  7.6  and  7.7.  Section  7.8  covers  inverses  of  matrices. 
The  chapter  ends  with  vector  spaces,  inner  product  spaces,  linear  transformations,  and 
composition  of  linear  transformations.  Eigenvalue  problems  follow  in  Chap.  8. 

COMMENT.  Numeric  linear  algebra  (Secs.  20.1-20.5)  can  be  studied  immediately 
after  this  chapter. 

Prerequisite:  None. 

Sections  that  may  be  omitted  in  a short  course:  7.5,  7.9. 

References  and  Answers  to  Problems:  App.  1 Part  B,  and  App.  2. 
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7.1  Matrices,  Vectors: 

Addition  and  Scalar  Multiplication 

The  basic  concepts  and  rules  of  matrix  and  vector  algebra  are  introduced  in  Secs.  7.1  and 
7.2  and  are  followed  by  linear  systems  (systems  of  linear  equations),  a main  application, 
in  Sec.  7.3. 

Let  us  first  take  a leisurely  look  at  matrices  before  we  formalize  our  discussion.  A matrix 
is  a rectangular  array  of  numbers  or  functions  which  we  will  enclose  in  brackets.  For  example, 


Oil 

012 

«13 

0.3 

1 

-5' 

1 

021 

022 

a23 

0 

-0.2 

16 

_a31 

O32 

a33 

'e-x 

2xz 

"4" 

6x 

e 

4x  _ 

* 

[fli 

02 

<33], 

1 

.2. 

are  matrices.  The  numbers  (or  functions)  are  called  entries  or,  less  commonly,  elements 
of  the  matrix.  The  first  matrix  in  ( 1)  has  two  rows,  which  are  the  horizontal  lines  of  entries. 
Furthermore,  it  has  three  columns,  which  are  the  vertical  lines  of  entries.  The  second  and 
third  matrices  are  square  matrices,  which  means  that  each  has  as  many  rows  as  columns — 
3 and  2,  respectively.  The  entries  of  the  second  matrix  have  two  indices,  signifying  their 
location  within  the  matrix.  The  first  index  is  the  number  of  the  row  and  the  second  is  the 
number  of  the  column,  so  that  together  the  entry’s  position  is  uniquely  identified.  For 
example,  023  (read  a two  three ) is  in  Row  2 and  Column  3,  etc.  The  notation  is  standard 
and  applies  to  all  matrices,  including  those  that  are  not  square. 

Matrices  having  just  a single  row  or  column  are  called  vectors.  Thus,  the  fourth  matrix 
in  (1)  has  just  one  row  and  is  called  a row  vector.  The  last  matrix  in  (1)  has  just  one 
column  and  is  called  a column  vector.  Because  the  goal  of  the  indexing  of  entries  was 
to  uniquely  identify  the  position  of  an  element  within  a matrix,  one  index  suffices  for 
vectors,  whether  they  are  row  or  column  vectors.  Thus,  the  third  entry  of  the  row  vector 
in  (1)  is  denoted  by  03. 

Matrices  are  handy  for  storing  and  processing  data  in  applications.  Consider  the 
following  two  common  examples. 

EXAMPLE  1 Linear  Systems,  a Major  Application  of  Matrices 

We  are  given  a system  of  linear  equations,  briefly  a linear  system,  such  as 

4*!  + 6x2  + 9*3  = 6 
6x1  - 2x3  = 20 

5*!  — 8x2  + x3  = 10 

where  xi,  X2,  X3  are  the  unknowns.  We  form  the  coefficient  matrix,  call  it  A,  by  listing  the  coefficients  of  the 
unknowns  in  the  position  in  which  they  appear  in  the  linear  equations.  In  the  second  equation,  there  is  no 
unknown  X2,  which  means  that  the  coefficient  of  X2  is  0 and  hence  in  matrix  A,  ^22  = 0,  Thus, 
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"4 

6 

9" 

4 

6 

9 

6~ 

6 

0 

-2 

We  form  another  matrix  A = 

6 

0 

-2 

20 

_5 

-8 

1 

_5 

-8 

1 

10 

by  augmenting  A with  the  right  sides  of  the  linear  system  and  call  it  the  augmented  matrix  of  the  system. 

Since  we  can  go  back  and  recapture  the  system  of  linear  equations  directly  from  the  augmented  matrix  A,  A 
contains  all  the  information  of  the  system  and  can  thus  be  used  to  solve  the  linear  system.  This  means  that  we 
can  just  use  the  augmented  matrix  to  do  the  calculations  needed  to  solve  the  system.  We  shall  explain  this  in 
detail  in  Sec.  7.3.  Meanwhile  you  may  verify  by  substitution  that  the  solution  is  xi  = 3,  *2  = 2>  x3  = ~ 1- 
The  notation  xi,  X2,  *3  for  the  unknowns  is  practical  but  not  essential;  we  could  choose  x,  y,  z or  some  other 
letters. 

Sales  Figures  in  Matrix  Form 

Sales  figures  for  three  products  I,  II,  III  in  a store  on  Monday  (Mon),  Tuesday  (Tues),  • • • may  for  each  week 
be  arranged  in  a matrix 


Mon 

Tues 

Wed 

Thur 

Fri 

Sat 

Sun 

~40 

33 

81 

0 

21 

47 

33" 

I 

A = 

0 

12 

78 

50 

50 

96 

90 

■ II 

10 

0 

0 

27 

43 

78 

56 

III 

If  the  company  has  10  stores,  we  can  set  up  10  such  matrices,  one  for  each  store.  Then,  by  adding  corresponding 
entries  of  these  matrices,  we  can  get  a matrix  showing  the  total  sales  of  each  product  on  each  day.  Can  you  think 
of  other  data  which  can  be  stored  in  matrix  form?  For  instance,  in  transportation  or  storage  problems?  Or  in 
listing  distances  in  a network  of  roads? 

General  Concepts  and  Notations 

Let  us  formalize  what  we  just  have  discussed.  We  shall  denote  matrices  by  capital  boldface 
letters  A,  B,  C,  ■ ■ • , or  by  writing  the  general  entry  in  brackets;  thus  A = [a^],  and  so 
on.  By  an  m X n matrix  (read  m by  n matrix ) we  mean  a matrix  with  m rows  and  n 
columns — rows  always  come  first!  m X n is  called  the  size  of  the  matrix.  Thus  an  m X n 
matrix  is  of  the  form 


(2) 


A [ctjk] 


all 

a12 

d\n 

a21 

®22 

a2n 

tlm.l 

am2 

amn 

, 3 X 

3,  2 X 2,  1 

X 3,  and  2X1,  respectively 

Each  entry  in  (2)  has  two  subscripts.  The  first  is  the  row  number  and  the  second  is  the 
column  number.  Thus  a2 1 is  the  entry  in  Row  2 and  Column  1. 

If  m = n,  we  call  A an  n X n square  matrix.  Then  its  diagonal  containing  the  entries 
An,  a2 2,  • ■ ■ , ann  is  called  the  main  diagonal  of  A.  Thus  the  main  diagonals  of  the  two 
square  matrices  in  (1)  are  an,  a2 2,  O33  and  e~x,  4x,  respectively. 

Square  matrices  are  particularly  important,  as  we  shall  see.  A matrix  of  any  size  m X n 
is  called  a rectangular  matrix;  this  includes  square  matrices  as  a special  case. 
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DEFINITION 


EXAMPLE  3 


Vectors 

A vector  is  a matrix  with  only  one  row  or  column.  Its  entries  are  called  the  components 
of  the  vector.  We  shall  denote  vectors  by  lowercase  boldface  letters  a,  b,  • • • or  by  its 
general  component  in  brackets,  a = [aj\,  and  so  on.  Our  special  vectors  in  (1)  suggest 
that  a (general)  row  vector  is  of  the  form 


a = «2  ' ' ' an\. 

A column  vector  is  of  the  form 

b\ 
b>2 


Forinstance,  a=  [ — 2 5 0.8  0 1]. 


For  instance. 


Addition  and  Scalar  Multiplication 
of  Matrices  and  Vectors 

What  makes  matrices  and  vectors  really  useful  and  particularly  suitable  for  computers  is 
the  fact  that  we  can  calculate  with  them  almost  as  easily  as  with  numbers.  Indeed,  we 
now  introduce  rules  for  addition  and  for  scalar  multiplication  (multiplication  by  numbers) 
that  were  suggested  by  practical  applications.  (Multiplication  of  matrices  by  matrices 
follows  in  the  next  section.)  We  first  need  the  concept  of  equality. 


Equality  of  Matrices 

Two  matrices  A = [ajk]  and  B = [bj^]  are  equal,  written  A = B,  if  and  only  if 
they  have  the  same  size  and  the  corresponding  entries  are  equal,  that  is,  an  = £>n, 
«12  = ^12>  and  so  on.  Matrices  that  are  not  equal  are  called  different.  Thus,  matrices 
of  different  sizes  are  always  different. 


Equality  of  Matrices 

Let 


an 

a12 

4 O' 

A = 

«21 

a22. 

and 

B = 

3 -1 

Then 

«n  = 4,  a12  = 0, 

A = B if  and  only  if 

«21  — 3,  CI22  = — 1. 

The  following  matrices  are  all  different.  Explain! 


1 3' 

4 2 

4 T 

1 3 

o' 

0 1 

3' 

4 2 

i 3 

2 3 

4 2 

0 

0 4 

2 
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DEFINITION 


EXAMPLE  5 


Addition  of  Matrices 

The  sum  of  two  matrices  A = [ajk}  and  B = [bjk]  of  the  same  size  is  written 
A + B and  has  the  entries  alk  + bjk  obtained  by  adding  the  corresponding  entries 
of  A and  B.  Matrices  of  different  sizes  cannot  be  added. 


As  a special  case,  the  sum  a + b of  two  row  vectors  or  two  column  vectors,  which 
must  have  the  same  number  of  components,  is  obtained  by  adding  the  corresponding 
components. 

Addition  of  Matrices  and  Vectors 


-4  6 3' 

5 -1  0 

15  3' 

and  B = 

, then  A + B = 

0 ! 2 

3 1 0 

3 2 2. 

A in  Example  3 and  our  present  A cannot  be  added.  If  a = [5  7 2]  and  b = [—6  2 0],  then 

a + b = [ — 1 9 2], 

An  application  of  matrix  addition  was  suggested  in  Example  2.  Many  others  will  follow. 


Scalar  Multiplication  (Multiplication  by  a Number) 

The  product  of  any  m X n matrix  A = \alk\  and  any  scalar  c (number  c)  is  written 
cA  and  is  the  m X n matrix  cA  = \cajk\  obtained  by  multiplying  each  entry  of  A 
by  c. 


Here  ( — 1 )A  is  simply  written  —A  and  is  called  the  negative  of  A.  Similarly,  (— k)A  is 
written  — kA.  Also,  A + (-B)  is  written  A — B and  is  called  the  difference  of  A and  B 
(which  must  have  the  same  size!). 

Scalar  Multiplication 


~2.7 

-1.8" 

"-2.7 

1.8" 

, “a- 

" 3 

-2 

"o 

o” 

0 

0.9 

, then  —A  = 

0 

-0.9 

0 

1 

. 0A  = 

0 

0 

9 

_9.° 

— 4.5_ 

-9.0 

4.5_ 

10 

— 5_ 

0 

0 

If  a matrix  B shows  the  distances  between  some  cities  in  miles,  1.609B  gives  these  distances  in  kilometers. 


Rules  for  Matrix  Addition  and  Scalar  Multiplication.  From  the  familiar  laws  for  the 
addition  of  numbers  we  obtain  similar  laws  for  the  addition  of  matrices  of  the  same  size 
m X n,  namely. 


(3) 


(a)  A + B = B + A 

(b)  (A  + B)  + C = A + (B  + C)  (written  A + B + C) 

(c)  A + 0 = A 

(d)  A + (—A)  = 0. 


Here  0 denotes  the  zero  matrix  (of  size  m X n),  that  is,  the  m X n matrix  with  all  entries 
zero.  If  m = 1 or  n = 1,  this  is  a vector,  called  a zero  vector. 
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Hence  matrix  addition  is  commutative  and  associative  [by  (3a)  and  (3b)]. 
Similarly,  for  scalar  multiplication  we  obtain  the  rules 


(4) 


(a)  c(A  + B)  = cA  + cB 

(b)  (c  + k)A  = cA  + kA 

(c)  c(kA)  = (ck)A  (written  ckA) 

(d)  1A  = A. 


frqble^£E:T::7.i 


GENERAL  QUESTIONS 

1.  Equality.  Give  reasons  why  the  five  matrices  in 
Example  3 are  all  different. 

2.  Double  subscript  notation.  If  you  write  the  matrix  in 
Example  2 in  the  form  A = [a^],  what  is  a31?  a13? 

C'26  ->  «33? 

3.  Sizes.  What  sizes  do  the  matrices  in  Examples  1,  2,  3, 
and  5 have? 

4.  Main  diagonal.  What  is  the  main  diagonal  of  A in 
Example  1?  Of  A and  B in  Example  3? 

5.  Scalar  multiplication.  If  A in  Example  2 shows  the 
number  of  items  sold,  what  is  the  matrix  B of  units  sold 
if  a unit  consists  of  (a)  5 items  and  (b)  10  items? 

6.  If  a 12  X 12  matrix  A shows  the  distances  between 
12  cities  in  kilometers,  how  can  you  obtain  from  A the 
matrix  B showing  these  distances  in  miles? 

7.  Addition  of  vectors.  Can  you  add:  A row  and 
a column  vector  with  different  numbers  of  compo- 
nents? With  the  same  number  of  components?  Two 
row  vectors  with  the  same  number  of  components 
but  different  numbers  of  zeros?  A vector  and  a 
scalar?  A vector  with  four  components  and  a 2 X 2 
matrix? 

ADDITION  AND  SCALAR 
MULTIPLICATION  OF  MATRICES 
AND  VECTORS 

Let 
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3 

4 

3 - 
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> 
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3 

, w = 

—30 

-3.0 

2 

10 

Find  the  following  expressions,  indicating  which  of  the 

rules  in  (3)  or  (4)  they  illustrate,  or  give  reasons  why  they 

are  not  defined. 

8.  2A  + 4B,  4B  + 2A,  0A  + B,  0.4B  - 4.2A 

9.  3 A,  0.5B,  3A  + 0.5B,  3A  + 0.5B  + C 

10.  (4  • 3)A,  4(3A),  14B  — 3B,  11B 

11.  8C  + 10D,  2(5D  + 4C),  0.6C  - 0.6D, 

0.6(C  - D) 

12.  (C  + D)  + E,  (D  + E)  + C,  0(C  - E)  + 4D, 

A - 0C 

13.  (2  • 7)C,  2(7C),  -D  + 0E,  E - D + C + u 

14.  (5u  + 5v)  - gw,  — 20(u  + v)  + 2w, 

E — (u  + v),  10(u  + v)  + w 

15.  (u  + v)  — w,  u + (v  — w),  C + Ow, 

0E  + u - v 

16.  15v  — 3w  - Ou,  ~3w  + 15v,  D — u + 3C, 

8.5w  - 11. lu  + 0.4v 

17.  Resultant  of  forces.  If  the  above  vectors  u,  v,  w 
represent  forces  in  space,  their  sum  is  called  their 
resultant.  Calculate  it. 

18.  Equilibrium.  By  definition,  forces  are  in  equilibrium 
if  their  resultant  is  the  zero  vector.  Find  a force  p such 
that  the  above  u,  v,  w,  and  p are  in  equilibrium. 

19.  General  rules.  Prove  (3)  and  (4)  for  general  2X3 
matrices  and  scalars  c and  k. 
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20.  TEAM  PROJECT.  Matrices  for  Networks.  Matrices 
have  various  engineering  applications,  as  we  shall  see. 
For  instance,  they  can  be  used  to  characterize  connections 
in  electrical  networks,  in  nets  of  roads,  in  production 
processes,  etc.,  as  follows. 

(a)  Nodal  Incidence  Matrix.  The  network  in  Fig.  155 

consists  of  six  branches  (connections)  and  four  nodes 
(points  where  two  or  more  branches  come  together). 
One  node  is  the  reference  node  (grounded  node,  whose 
voltage  is  zero).  We  number  the  other  nodes  and 
number  and  direct  the  branches.  This  we  do  arbitrarily. 
The  network  can  now  be  described  by  a matrix 
A = [rtjjc],  where 

! + 1 if  branch  k leaves  node  (J) 

— 1 if  branch  k enters  node 

0 if  branch  k does  not  touch  node  . 

A is  called  the  nodal  incidence  matrix  of  the  network. 
Show  that  for  the  network  in  Fig.  155  the  matrix  A has 
the  given  form. 


3 


Branch 


1 2 3 4 5 6 


Node  (T) 
Node  (2) 
Node  (5) 


1 -1 

0 1 

0 0 


-1  0 

0 1 

1 0 


0 0 

1 0 

-1  -1 


Fig.  155.  Network  and  nodal  incidence 
matrix  in  Team  Project  20(a) 


(c)  Sketch  the  three  networks  corresponding  to  the 
nodal  incidence  matrices 
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(d)  Mesh  Incidence  Matrix.  A network  can  also  be 
characterized  by  the  mesh  incidence  matrix  M = [nijk], 
where 


+ 1 if  branch  k is  in  mesh  j 
and  has  the  same  orientation 


mjk  = < 


— 1 if  branch  k is  in  mesh  j 

and  has  the  opposite  orientation 


, 0 if  branch  k is  not  in  mesh  j 


and  a mesh  is  a loop  with  no  branch  in  its  interior  (or 
in  its  exterior).  Here,  the  meshes  are  numbered  and 
directed  (oriented)  in  an  arbitrary  fashion.  Show  that 
for  the  network  in  Fig.  157,  the  matrix  M has  the  given 
form,  where  Row  1 corresponds  to  mesh  1,  etc. 


(b)  Find  the  nodal  incidence  matrices  of  the  networks 
in  Fig.  156. 
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0-10 
0 1-1 
10  1 
10  0 
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Fig.  157.  Network  and  matrix  M in 
Team  Project  20(d) 
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7.2  Matrix  Multiplication 

Matrix  multiplication  means  that  one  multiplies  matrices  by  matrices.  Its  definition  is 
standard  but  it  looks  artificial.  Thus  you  have  to  study  matrix  multiplication  carefully, 
multiply  a few  matrices  together  for  practice  until  you  can  understand  how  to  do  it.  Here 
then  is  the  definition.  (Motivation  follows  later.) 


DEFINITION 


Multiplication  of  a Matrix  by  a Matrix 

The  product  C = AB  (in  this  order)  of  an  m X n matrix  A = [a.^J  times  an  r X p 
matrix  B = [bjif]  is  defined  if  and  only  if  r = n and  is  then  the  m X p matrix 
C = [c'jk]  with  entries 

n j =[,■■■,  m 

(1)  cjk  a:jlhk  tij \b\k  T aj2b2k  T ■ ■ * T ajnbnj c 

l=i  k = ■ ,p. 


The  condition  r = n means  that  the  second  factor,  B,  must  have  as  many  rows  as  the  first 
factor  has  columns,  namely  n.  A diagram  of  sizes  that  shows  when  matrix  multiplication 
is  possible  is  as  follows: 


A B = C 

[m  X n]  [n  X p]  = [ m X p]. 

The  entry  cj k in  (1)  is  obtained  by  multiplying  each  entry  in  the  yth  row  of  A by  the 
corresponding  entry  in  the  kth  column  of  B and  then  adding  these  n products.  For  instance, 
c2i  = a2i^n  + a22^2 1 + ■ • • + a27lhn\,  and  so  on.  One  calls  this  briefly  a multiplication 
of  rows  into  columns.  For  n = 3,  this  is  illustrated  by 


72  = 3 p = 2 p = 2 


Notations  in  a product  AB  = C 


where  we  shaded  the  entries  that  contribute  to  the  calculation  of  entry  C21  just  discussed. 

Matrix  multiplication  will  be  motivated  by  its  use  in  linear  transformations  in  this 
section  and  more  fully  in  Sec.  7.9. 

Let  us  illustrate  the  main  points  of  matrix  multiplication  by  some  examples.  Note  that 
matrix  multiplication  also  includes  multiplying  a matrix  by  a vector,  since,  after  all, 
a vector  is  a special  matrix. 


Matrix  Multiplication 


3 5-1 

2-2  3 1 

22  -2  43  42 

4 0 2 

5 0 7 8 

= 

26  -16  14  6 

1 

1 

OS 

1 

to 

1 

9-411 

-9  4 -37  -28 

Herein  = 3-  2 + 5-  5 + (—  1)  • 9 = 22,  and  so  on.  The  entry  in  the  box  is  C23  = 4-  3 + 0-  7 + 2-  l = 14. 
The  product  BA  is  not  defined. 
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EXAMPLE  2 


EXAMPLE  3 


EXAMPLE  4 


Multiplication  of  a Matrix  and  a Vector 


4 ■ 3 + 2 ■ 5 

_ 

22 

whereas 

3 

4 2 

1 ■ 3 + 8 ■ 5 

43 

5 

1 8 

is  undefined. 


Products  of  Row  and  Column  Vectors 


l" 

1 

"3  6 l" 

2 

= [19], 

2 

[3  6 1]  = 

6 12  2 

4 

4 

12  24  4 

CAUTION!  Matrix  Multiplication  Is  Not  Commutative,  AB  + BA  in  General 

This  is  illustrated  by  Examples  1 and  2,  where  one  of  the  two  products  is  not  even  defined,  and  by  Example  3, 
where  the  two  products  have  different  sizes.  But  it  also  holds  for  square  matrices.  For  instance, 


1 l" 

"-1 

1 

0 

o' 

but 

-1 

r 

1 r 

100  100 

1 

-1 

0 

0 

1 

-1 

100  100 

It  is  interesting  that  this  also  shows  that  AB  = 0 does  not  necessarily  imply  BA  = 0 or  A = 0 or  B = 0.  We 
shall  discuss  this  further  in  Sec.  7.8,  along  with  reasons  when  this  happens. 


Our  examples  show  that  in  matrix  products  the  order  of  factors  must  always  be  observed 
very  carefully.  Otherwise  matrix  multiplication  satisfies  rules  similar  to  those  for  numbers, 
namely. 

(a)  (AA)B  = £(AB)  = A(£B)  written  AAB  or  AAB 

(b)  A(BC)  = (AB)C  written  ABC 

(2) 

(c)  (A  + B)C  = AC  + BC 

(d)  C(A  + B)  = CA  + CB 

provided  A,  B.  and  C are  such  that  the  expressions  on  the  left  are  defined;  here,  k is  any 
scalar.  (2b)  is  called  the  associative  law.  (2c)  and  (2d)  are  called  the  distributive  laws. 

Since  matrix  multiplication  is  a multiplication  of  rows  into  columns,  we  can  write  the 
defining  formula  (1)  more  compactly  as 

(3)  cjk  = ajbk,  j = 1,  — , m;  k = 


where  a j is  the  jth  row  vector  of  A and  is  the  kth  column  vector  of  B,  so  that  in 
agreement  with  (1), 


bik 


ttjbfc  [ Uj  1 Clj2  ' ’ * &jn\ 


Qjlbl k #j2^2/c  * * • + Cljnbnk- 


byik 
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EXAMPLE  5 


EXAMPLE  6 


Product  in  Terms  of  Row  and  Column  Vectors 

If  A = [ a.jk  is  of  size  3X3  and  B = [ kji-  | is  of  size  3X4,  then 


(4) 


AB 


aibi  aib2  aib3  aib4 

a2bi  a2b2  a 2b;:  a2b4  . 

a3bi  a3b2  a3b3  a3b4 


Taking  a4  = [3  5 — 1 ] , a2 


[4  0 2],  etc.,  verify  (4)  for  the  product  in  Example  1. 


Parallel  processing  of  products  on  the  computer  is  facilitated  by  a variant  of  (3)  for 
computing  C = AB,  which  is  used  by  standard  algorithms  (such  as  in  Lapack).  In  this 
method,  A is  used  as  given,  B is  taken  in  terms  of  its  column  vectors,  and  the  product  is 
computed  columnwise;  thus, 


(5)  AB  = A[br  b2  •••  bp]  = [Ab,  Ab2  Abp]. 

Columns  of  B are  then  assigned  to  different  processors  (individually  or  several  to 
each  processor),  which  simultaneously  compute  the  columns  of  the  product  matrix 
Abi,  Ab2,  etc. 


Computing  Products  Columnwise  by  (5) 

To  obtain 


AB  = 

4 f 

-5  2_ 

3 0 7 

-14  6 

= 

11  4 34" 

-17  8 -23 

from  (5),  calculate  the  columns 

4 'll"3" 

ll" 

4 ii  r ° 

4]  \ 4 ll  [7 

-5 


-5  2 6 


34 

-23 


of  AB  and  then  write  them  as  a single  matrix,  as  shown  in  the  first  formula  on  the  right. 


Motivation  of  Multiplication 
by  Linear  Transformations 

Let  us  now  motivate  the  “unnatural”  matrix  multiplication  by  its  use  in  linear 
transformations.  For  n = 2 variables  these  transformations  are  of  the  form 


(6*) 


y l = an*i  + ai2-*2 

>’2  = a2\x  1 + 022*2 


and  suffice  to  explain  the  idea.  (For  general  n they  will  be  discussed  in  Sec.  7.9.)  For 
instance,  (6*)  may  relate  an  xi*2-coordinate  system  to  a >'iy2-coordinate  system  in  the 
plane.  In  vectorial  form  we  can  write  (6*)  as 


yi 

On  Oi2 

*1 

On*  1 + Oi2*2 

= Ax  = 

= 

y2_ 

021  o22 

.*2. 

«2i*  1 + o22.r2 
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EXAMPLE  7 


Now  suppose  further  that  the  x ix2“System  is  related  to  a wivi^-system  by  another  linear 
transformation,  say, 


X1 

= Bw  = 

bn 

b\2 

Wi 



fonWi  + b12w2 

.*2. 

b2\ 

b22_ 

_w2 

b2\W\  + b22w2_ 

Then  the  y^-system  is  related  to  the  vtqvt^-system  indirectly  via  the  .r1.r2-system,  and 
we  wish  to  express  this  relation  directly.  Substitution  will  show  that  this  direct  relation  is 
a linear  transformation,  too,  say, 


(8) 


C11  c12 

Wi 

C11W1  + c12w2 

<N 

<N 

<N 

1 

w2 

_C21W!  + C22W2_ 

Indeed,  substituting  (7)  into  (6),  we  obtain 


Vi  = anOiiWr  + b12w2)  + a12(b21Wi  + b22w2) 

= (fln^ii  + ai2^2i)v*T  + («11^12  + a12b22)w2 
V2  = + b12w2)  + a22(b21Wi  + ^22^2) 

= (#21^11  + t?22^’2l)vvl  + (fl21^12  + 022b22)W2. 


Comparing  this  with  (8),  we  see  that 

C11  = flll^ll  + fl12^21  c12  = a11^12  + fl12^22 

f -2 1 = 021^11  + Ct22b2 1 C22  = a2\b\2  + n22^22- 

This  proves  that  C = AB  with  the  product  defined  as  in  (1).  For  larger  matrix  sizes  the 
idea  and  result  are  exactly  the  same.  Only  the  number  of  variables  changes.  We  then  have 
m variables  y and  n variables  x and  p variables  w.  The  matrices  A,  B,  and  C = AB  then 
have  sizes  m X n,  n X p,  and  m X p,  respectively.  And  the  requirement  that  C be  the 
product  AB  leads  to  formula  (1)  in  its  general  form.  This  motivates  matrix  multiplication. 

Transposition 

We  obtain  the  transpose  of  a matrix  by  writing  its  rows  as  columns  (or  equivalently  its 
columns  as  rows).  This  also  applies  to  the  transpose  of  vectors.  Thus,  a row  vector  becomes 
a column  vector  and  vice  versa.  In  addition,  for  square  matrices,  we  can  also  “reflect” 
the  elements  along  the  main  diagonal,  that  is,  interchange  entries  that  are  symmetrically 
positioned  with  respect  to  the  main  diagonal  to  obtain  the  transpose.  Hence  a\2  becomes 
a2i,  a?>  1 becomes  013,  and  so  forth.  Example  7 illustrates  these  ideas.  Also  note  that,  if  A 
is  the  given  matrix,  then  we  denote  its  transpose  by  AT. 

Transposition  of  Matrices  and  Vectors 


5 

4” 

Ui 

1 

00 

1 

, then 

at  = 

-8 

0 

4 0 

0 

1 

0 

If 


A = 
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A little  more  compactly,  we  can  write 


5 

4~ 

~3  0 

7 

5 -8  1 

T 

= 

-8 

0 

8 -1 

5 

4 0 0 

1 

0 

1 -9 

4 

the  transpose  [6  2 

3]T 

of  the  row 

vector  [6  2 3]  is 

6 

V 

[6  2 3]t  = 

2 

Conversely, 

2 

3 

3 

3 8 1 

0-1  -9 

7 5 4 


= [6  2 3], 


DEFINITION 


Transposition  of  Matrices  and  Vectors 

The  transpose  of  an  m X n matrix  A = [a^]  is  the  n X m matrix  AT  (read  A 
transpose)  that  has  the  first  row  of  A as  its  first  column , the  second  row  of  A as  its 
second  column , and  so  on.  Thus  the  transpose  of  A in  (2)  is  AT  = \cip-:j],  written  out 


On 

a21 

Cl  ml 

(9) 

II 

i 

II 

a\2 

a22 

am2 

n 

a2  n 

Q-mn 

As  a special  case,  transposition  converts  row  vectors  to  column  vectors  and  conversely. 


Transposition  gives  us  a choice  in  that  we  can  work  either  with  the  matrix  or  its 
transpose,  whichever  is  more  convenient. 

Rules  for  transposition  are 


(10) 


(a)  (At)t  = A 

(b)  (A  + B)t  = At  + Bt 

(c)  (cA)t  = cAt 

(d)  (AB)t  = BtAt. 


CAUTION!  Note  that  in  (lOd)  the  transposed  matrices  are  in  reversed  order.  We  leave 
the  proofs  as  an  exercise  in  Probs.  9 and  10. 


Special  Matrices 

Certain  kinds  of  matrices  will  occur  quite  frequently  in  our  work,  and  we  now  list  the 
most  important  ones  of  them. 


Symmetric  and  Skew-Symmetric  Matrices.  Transposition  gives  rise  to  two  useful 
classes  of  matrices.  Symmetric  matrices  are  square  matrices  whose  transpose  equals  the 
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EXAMPLE  8 


EXAMPLE  9 


EXAMPLE  10 


matrix  itself.  Skew-symmetric  matrices  are  square  matrices  whose  transpose  equals 
minus  the  matrix.  Both  cases  are  defined  in  (11)  and  illustrated  by  Example  8. 

(11)  AT  = A (thus  a^j  = cijk),  At  = —A  (thus  ci^j  = —ctjk,  hence  cijj  = 0). 

Symmetric  Matrix  Skew-Symmetric  Matrix 

Symmetric  and  Skew-Symmetric  Matrices 


20 

120 

200 

0 

1 

-3_ 

A = 

120 

10 

150 

is  symmetric,  and 

B = 

-1 

0 

-2 

is  skew- symmetric 

200 

150 

30 

3 

2 

0 

For  instance,  if  a company  has  three  building  supply  centers  C±,  C2,  C3,  then  A could  show  costs,  say,  djj  for 
handling  1000  bags  of  cement  at  center  Cj , and  djk  (J  =£  k)  the  cost  of  shipping  1000  bags  from  Cj  to  C^.  Clearly, 
cijk  — dk j if  we  assume  shipping  in  the  opposite  direction  will  cost  the  same. 

Symmetric  matrices  have  several  general  properties  which  make  them  important.  This  will  be  seen  as  we 
proceed. 


Triangular  Matrices.  Upper  triangular  matrices  are  square  matrices  that  can  have  nonzero 
entries  only  on  and  above  the  main  diagonal,  whereas  any  entry  below  the  diagonal  must  be 
zero.  Similarly,  lower  triangular  matrices  can  have  nonzero  entries  only  on  and  below  the 
main  diagonal.  Any  entry  on  the  main  diagonal  of  a triangular  matrix  may  be  zero  or  not. 


Upper  and  Lower  Triangular  Matrices 


1 

4 

2 

T 

3 

0 

3 

2 

0 

2 

0 

0 

6 

Upper  triangular 


"3  0 

0 

o’ 

2 

0 

o’ 

9 -3 

0 

0 

8 

-1 

0 

■ 

1 0 

2 

0 

7 

6 

8 

1 9 

3 

6 

Lower  triangular 

Diagonal  Matrices.  These  are  square  matrices  that  can  have  nonzero  entries  only  on 
the  main  diagonal.  Any  entry  above  or  below  the  main  diagonal  must  be  zero. 

If  all  the  diagonal  entries  of  a diagonal  matrix  S are  equal,  say,  c,  we  call  S a scalar 
matrix  because  multiplication  of  any  square  matrix  A of  the  same  size  by  S has  the  same 
effect  as  the  multiplication  by  a scalar,  that  is, 

(12)  AS  = SA  = cA. 

In  particular,  a scalar  matrix,  whose  entries  on  the  main  diagonal  are  all  1 , is  called  a unit 
matrix  (or  identity  matrix)  and  is  denoted  by  In  or  simply  by  I.  For  I,  formula  (12)  becomes 

(13)  AI  = IA  = A. 

Diagonal  Matrix  D.  Scalar  Matrix  S.  Unit  Matrix  I 


_2 

0 

0" 

c 

0 

0" 

’l 

0 

0’ 

0 

-3 

0 

s = 

0 

c 

0 

, I = 

0 

1 

0 

0 

0 

0 

0 

0 

c 

0 

0 

1 

D = 
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Some  Applications  of  Matrix  Multiplication 

EXAMPLE  11  Computer  Production.  Matrix  Times  Matrix 

Supercomp  Ltd  produces  two  computer  models  PC  1086  and  PCI  186.  The  matrix  A shows  the  cost  per  computer 
(in  thousands  of  dollars)  and  B the  production  figures  for  the  year  2010  (in  multiples  of  10,000  units.)  Find  a 
matrix  C that  shows  the  shareholders  the  cost  per  quarter  (in  millions  of  dollars)  for  raw  material,  labor,  and 
miscellaneous. 


PC  1086  PCI  186 


1.2 

1.6" 

Raw  Components 

A = 

0.3 

0.4 

Labor 

0.5 

0.6 

Miscellaneous 

Solution. 


Quarter 

2 3 

4 

8 6 

9 

PC  1086 

2 4 

3 

PCI  186 

Quarter 


1 

2 

3 

4 

13.2 

12.8 

13.6 

15. 6~ 

Raw  Components 

3.3 

3.2 

3.4 

3.9 

Labor 

5.1 

5.2 

5.4 

6.3 

Miscellaneous 

Since  cost  is  given  in  multiples  of  $1000  and  production  in  multiples  of  10,000  units,  the  entries  of  C are 
multiples  of  $10  millions;  thus  c\\  = 13.2  means  $132  million,  etc. 


EXAMPLE  12  Weight  Watching.  Matrix  Times  Vector 

Suppose  that  in  a weight-watching  program,  a person  of  185  lb  burns  350  cal/hr  in  walking  (3  mph),  500  in 
bicycling  (13  mph),  and  950  in  jogging  (5.5  mph).  Bill,  weighing  185  lb,  plans  to  exercise  according  to  the 
matrix  shown.  Verify  the  calculations  (W  = Walking,  B = Bicycling,  J = Jogging). 


W B J 


MON 

1.0 

0 

0.5 

825 

MON 

~350~ 

WED 

1.0 

1.0 

0.5 

500 

— 

1325 

WED 

FRI 

1.5 

0 

0.5 

_950_ 

1000 

FRI 

SAT 

2.0 

1.5 

1.0 

2400 

SAT 

EXAMPLE  13  Markov  Process.  Powers  of  a Matrix.  Stochastic  Matrix 

Suppose  that  the  2004  state  of  land  use  in  a city  of  60  mi2  of  built-up  area  is 

C:  Commercially  Used  25%  I:  Industrially  Used  20%  R:  Residentially  Used  55%. 

Find  the  states  in  2009,  2014,  and  2019,  assuming  that  the  transition  probabilities  for  5-year  intervals  are  given 
by  the  matrix  A and  remain  practically  the  same  over  the  time  considered. 


From  C 

From  I 

From  R 

0.7 

0.1 

0 

ToC 

A = 

0.2 

0.9 

0.2 

To  I 

0.1 

0 

0.8 

ToR 
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A is  a stochastic  matrix,  that  is,  a square  matrix  with  all  entries  nonnegative  and  all  column  sums  equal  to  1 . 
Our  example  concerns  a Markov  process,1  that  is,  a process  for  which  the  probability  of  entering  a certain  state 
depends  only  on  the  last  state  occupied  (and  the  matrix  A),  not  on  any  earlier  state. 

Solution.  From  the  matrix  A and  the  2004  state  we  can  compute  the  2009  state, 


0.7  ■ 25  + 0.1  ■ 20  + 0 ■ 55 

0.7 

0.1 

0 

~25~ 

19. 5~ 

0.2  ■ 25  + 0.9  ■ 20  + 0.2  • 55 

= 

0.2 

0.9 

0.2 

20 

= 

34.0 

0.1  ■ 25  + 0 ■ 20  + 0.8  ■ 55 

0.1 

0 

0.8_ 

_55_ 

46. 5_ 

To  explain:  The  2009  figure  for  C equals  25%  times  the  probability  0.7  that  C goes  into  C,  plus  20%  times  the 
probability  0.1  that  I goes  into  C,  plus  55%  times  the  probability  0 that  R goes  into  C.  Together, 

25  • 0.7  + 20  • 0.1  + 55  • 0 = 19.5  [%].  Also  25  • 0.2  + 20  • 0.9  + 55  • 0.2  = 34  [%]. 

Similarly,  the  new  R is  46.5%.  We  see  that  the  2009  state  vector  is  the  column  vector 

y = [19.5  34.0  46.5]T  = Ax  = A [25  20  55]T 

where  the  column  vector  x = [25  20  55  ]T  is  the  given  2004  state  vector.  Note  that  the  sum  of  the  entries  of 

y is  100  [%].  Similarly,  you  may  verify  that  for  2014  and  2019  we  get  the  state  vectors 

z — Ay  = A(Ax)  = A2x  = [17.05  43.80  39.15]T 
u = Az  — A2y  — A3x  = [16.315  50.660  33.025]7. 

Answer.  In  2009  the  commercial  area  will  be  19.5%  (11.7  mi2),  the  industrial  34%  (20.4  mi2),  and  the 
residential  46.5%  (27.9  mi2).  For  2014  the  corresponding  figures  are  17.05%,  43.80%,  and  39.15%.  For  2019 
they  are  16.315%,  50.660%,  and  33.025%.  (In  Sec.  8.2  we  shall  see  what  happens  in  the  limit,  assuming  that 
those  probabilities  remain  the  same.  In  the  meantime,  can  you  experiment  or  guess?) 


yHFQ-B^L-E^^ST-T— T— 2 


1-10 


GENERAL  QUESTIONS 


1.  Multiplication.  Why  is  multiplication  of  matrices 
restricted  by  conditions  on  the  factors? 


2.  Square  matrix.  What  form  does  a 3 X 3 matrix  have 
if  it  is  symmetric  as  well  as  skew-symmetric? 


3.  Product  of  vectors.  Can  every  3X3  matrix  be 
represented  by  two  vectors  as  in  Example  3? 


4.  Skew-symmetric  matrix.  How  many  different  entries 
can  a 4 X 4 skew-symmetric  matrix  have?  An  n X n 
skew-symmetric  matrix? 


5.  Same  questions  as  in  Prob.  4 for  symmetric  matrices. 

6.  Triangular  matrix.  If  Ui,  U2  are  upper  triangular  and 
Lj,  L2  are  lower  triangular,  which  of  the  following  are 
triangular? 


Ui  + U2,  U1U2,  uf,  Ui  + L1;  UiLi, 
Li  + L2 


7.  Idempotent  matrix,  defined  by  A2  = A.  Can  you  find 
four  2X2  idempotent  matrices? 


8.  Nilpotent  matrix,  defined  by  Bm  = 0 for  some  m. 
Can  you  find  three  2X2  nilpotent  matrices? 

9.  Transposition.  Can  you  prove  (10a)-(10c)  for  3 X 3 
matrices?  For  m X n matrices? 

10.  Transposition,  (a)  Illustrate  (lOd)  by  simple  examples, 
(b)  Prove  (lOd). 


11-20 


MULTIPLICATION,  ADDITION,  AND 
TRANSPOSITION  OF  MATRICES  AND 
VECTORS 


Let 


4 

-2 

3" 

1 - 

-3 

0" 

A = 

-2 

1 

6 

, B = 

-3 

1 

0 

1 

2 

2 

0 

0 - 

2 

0 

1 

3~ 

C = 

3 

2 

. 

a = [1  - 

2 0], 

b = 

1 

-2 

0 

-1 

1ANDREI  ANDREJEVITCH  MARKOV  (1856-1922),  Russian  mathematician,  known  for  his  work  in 

probability  theory. 
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Showing  all  intermediate  results,  calculate  the  following 
expressions  or  give  reasons  why  they  are  undefined: 


11. 

AB, 

abt. 

BA, 

BtA 

12. 

AAt, 

A2, 

BBt 

B2 

13. 

CCT, 

BC, 

CB, 

CtB 

14. 

3A  - 

2B, 

(3A  - 

2B)t, 

(3A  - 

- 2B)T 

aT 

15. 

Aa, 

AaT, 

(Ab)T 

, bTAT 

16. 

BC, 

BCt, 

Bb, 

bTB 

17.  ABC,  ABa,  ABb,  CaT 

18.  ab,  ba.  aA,  Bb 

19.  1.5a  + 3.0b,  1.5aT  + 3.0b,  (A  - B)b,  Ab  - Bb 

20.  bTAb,  aBaT,  aCCT,  CTba 

21.  General  rules.  Prove  (2)  for  2 X 2 matrices  A = [o^], 
B = [bjjf  ],  C = [cjk\,  and  a general  scalar. 

22.  Product.  Write  AB  in  Prob.  1 1 in  terms  of  row  and 
column  vectors. 

23.  Product.  Calculate  AB  in  Prob.  11  columnwise.  See 
Example  1. 

24.  Commutativity.  Find  all  2 X 2 matrices  A = 
that  commute  with  B = [bj^\,  where  bj ^ = j + k. 

25.  TEAM  PROJECT.  Symmetric  and  Skew-Symmetric 
Matrices.  These  matrices  occur  quite  frequently  in 
applications,  so  it  is  worthwhile  to  study  some  of  their 
most  important  properties. 

(a)  Verify  the  claims  in  (11)  that  a^j  = ajk  for  a 
symmetric  matrix,  and  a^j  = — for  a skew- 
symmetric  matrix.  Give  examples. 

(b)  Show  that  for  every  square  matrix  C the  matrix 
C + CT  is  symmetric  and  C — CT  is  skew-symmetric. 
Write  C in  the  form  C = S + T,  where  S is  symmetric 
and  T is  skew-symmetric  and  find  S and  T in  terms 
of  C.  Represent  A and  B in  Probs.  1 1-20  in  this  form. 

(c)  A linear  combination  of  matrices  A,  B,  C,  ■ • • , M 
of  the  same  size  is  an  expression  of  the  form 

(14)  uA  + bB  + cC  + ■ ■ ■ + mM, 


where  a,  ■ ■ ■ , m are  any  scalars.  Show  that  if  these 
matrices  are  square  and  symmetric,  so  is  (14);  similarly, 
if  they  are  skew-symmetric,  so  is  (14). 

(d)  Show  that  AB  with  symmetric  A and  B is  symmetric 
if  and  only  if  A and  B commute,  that  is,  AB  = BA. 

(e)  Under  what  condition  is  the  product  of  skew- 
symmetric  matrices  skew-symmetric? 


26-30 


FURTHER  APPLICATIONS 


26.  Production.  In  a production  process,  let  N mean  “no 
trouble”  and  T “trouble.”  Let  the  transition  probabilities 
from  one  day  to  the  next  be  0.8  for  N —*  N,  hence  0.2 
for  N ->  T,  and  0.5  for  T -»  N,  hence  0.5  for  T -»  T. 


If  today  there  is  no  trouble,  what  is  the  probability  of 
N two  days  after  today?  Three  days  after  today? 

27.  CAS  Experiment.  Markov  Process.  Write  a program 
for  a Markov  process.  Use  it  to  calculate  further  steps 
in  Example  13  of  the  text.  Experiment  with  other 
stochastic  3X3  matrices,  also  using  different  starting 
values. 

28.  Concert  subscription.  In  a community  of  100,000 
adults,  subscribers  to  a concert  series  tend  to  renew  their 
subscription  with  probability  90%  and  persons  presently 
not  subscribing  will  subscribe  for  the  next  season  with 
probability  0.2%.  If  the  present  number  of  subscribers 
is  1200,  can  one  predict  an  increase,  decrease,  or  no 
change  over  each  of  the  next  three  seasons? 

29.  Profit  vector.  Two  factory  outlets  F\  and  F 2 in  New 
York  and  Los  Angeles  sell  sofas  (S),  chairs  (C),  and 
tables  (T)  with  a profit  of  $35,  $62,  and  $30,  respectively. 
Let  the  sales  in  a certain  week  be  given  by  the  matrix 


S 

c 

T 

400 

60 

240 

Fi 

100 

120 

500 

f2 

Introduce  a “profit  vector”  p such  that  the  components 
of  v = Ap  give  the  total  profits  of  Fi  and  F 2. 

30.  TEAM  PROJECT.  Special  Linear  Transformations. 
Rotations  have  various  applications.  We  show  in  this 
project  how  they  can  be  handled  by  matrices. 

(a)  Rotation  in  the  plane.  Show  that  the  linear 
transformation  y = Ax  with 


cos  0 

— sin  0 

Xi 

yi 

A = 

sin  0 

cos  0 

, X = 

*2. 

> y = 

L2. 

is  a counterclockwise  rotation  of  the  Cartesian  x in- 
coordinate system  in  the  plane  about  the  origin,  where 
0 is  the  angle  of  rotation. 

(b)  Rotation  through  nO.  Show  that  in  (a) 


An 


cos  nO  — sinnf? 

Sin  770  COS  770 


Is  this  plausible?  Explain  this  in  words. 

(c)  Addition  formulas  for  cosine  and  sine.  By 

geometry  we  should  have 


cos  a 

— sin  a 

cos  fi 

— sin  fi 

sin  a 

cos  a 

sin  fi 

cos  fi 

cos  (a  + fi)  —sin  (a  + /3) 

sin  (a  + fi)  cos  ( a + fi) 


Derive  from  this  the  addition  formulas  (6)  in  App.  A3.1. 
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(d)  Computer  graphics.  To  visualize  a three- 
dimensional  object  with  plane  faces  (e.g.,  a cube),  we 
may  store  the  position  vectors  of  the  vertices  with 
respect  to  a suitable  x1x2x3-coordinate  system  (and  a 
list  of  the  connecting  edges)  and  then  obtain  a two- 
dimensional  image  on  a video  screen  by  projecting 
the  object  onto  a coordinate  plane,  for  instance,  onto 
the  X]X2-plane  by  setting  x3  = 0.  To  change  the 
appearance  of  the  image,  we  can  impose  a linear 
transformation  on  the  position  vectors  stored.  Show 
that  a diagonal  matrix  D with  main  diagonal  entries  3, 
1,  g gives  from  an  x = [xj]  the  new  position  vector 
y = Dx,  where  y1  = 3xi  (stretch  in  the  Xi-direction 
by  a factor  3),  y2  = x2  (unchanged),  y3  = |x3  (con- 
traction in  the  x3-direction).  What  effect  would  a scalar 
matrix  have? 


(e)  Rotations  in  space.  Explain  y = Ax  geometrically 
when  A is  one  of  the  three  matrices 


1 0 0 

0 cos  9 —sin  9 , 

0 sin  9 cos  6 


cos  9 

0 

— sin  9 

COS  l/f 

— sin  i// 

0 

0 

1 

0 

> 

sin  i jj 

COS  i fj 

0 

sin  9 

0 

cos  9 

0 

0 

1 

What  effect  would  these  transformations  have  in  situations 
such  as  that  described  in  (d)? 


7.3  Linear  Systems  of  Equations. 
Gauss  Elimination 


We  now  come  to  one  of  the  most  important  use  of  matrices,  that  is,  using  matrices  to 
solve  systems  of  linear  equations.  We  showed  informally,  in  Example  1 of  Sec.  7.1,  how 
to  represent  the  information  contained  in  a system  of  linear  equations  by  a matrix,  called 
the  augmented  matrix.  This  matrix  will  then  be  used  in  solving  the  linear  system  of 
equations.  Our  approach  to  solving  linear  systems  is  called  the  Gauss  elimination  method. 
Since  this  method  is  so  fundamental  to  linear  algebra,  the  student  should  be  alert. 

A shorter  term  for  systems  of  linear  equations  is  just  linear  systems.  Linear  systems 
model  many  applications  in  engineering,  economics,  statistics,  and  many  other  areas. 
Electrical  networks,  traffic  flow,  and  commodity  markets  may  serve  as  specific  examples 
of  applications. 

Linear  System,  Coefficient  Matrix,  Augmented  Matrix 

A linear  system  of  m equations  in  n unknowns  x±,  ■■  ■ ,xn  is  a set  of  equations  of 
the  form 


(1) 


flnxi  + ■ • • + alnxn  = bi 

fl2i*i  + ' ■ • + a2nxn  = b2 


£ZmlXi  T * ‘ * T amnXn  brn  ■ 

The  system  is  called  linear  because  each  variable  Xj  appears  in  the  first  power  only,  just 
as  in  the  equation  of  a straight  line,  an,  • • • , amn  are  given  numbers,  called  the  coefficients 
of  the  system.  b\,  ■ ■ ■ , bm  on  the  right  are  also  given  numbers.  If  all  the  bj  are  zero,  then 
(1)  is  called  a homogeneous  system.  If  at  least  one  bj  is  not  zero,  then  (1)  is  called  a 

nonhomogeneous  system. 
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A solution  of  (1)  is  a set  of  numbers  X\,---,xn  that  satisfies  all  the  m equations. 
A solution  vector  of  (1)  is  a vector  x whose  components  form  a solution  of  (1).  If  the 
system  (1)  is  homogeneous,  it  always  has  at  least  the  trivial  solution  x\  = 0,  • • • , xn  = 0. 

Matrix  Form  of  the  Linear  System  (1).  From  the  definition  of  matrix  multiplication 
we  see  that  the  m equations  of  (1)  may  be  written  as  a single  vector  equation 

(2)  Ax  = b 


where  the  coefficient  matrix  A = 

[fljfc]  is  the  m 

X n matrix 

an 

fll2 

a\n 

xi 

A = 

a21 

«22 

a2n 

, and 

X = 

am2 

amn 

h 


b = 


h 

um 


are  column  vectors.  We  assume  that  the  coefficients  are  not  all  zero,  so  that  A is 
not  a zero  matrix.  Note  that  x has  n components,  whereas  b has  m components.  The 
matrix 


an 

aim  1 bl 
1 
1 

a mi 

1 

®mn  1 Pm 

is  called  the  augmented  matrix  of  the  system  (1).  The  dashed  vertical  line  could  be 
omitted,  as  we  shall  do  later.  It  is  merely  a reminder  that  the  last  column  of  A did  not 
come  from  matrix  A but  came  from  vector  b.  Thus,  we  augmented  the  matrix  A. 

Note  that  the  augmented  matrix  A determines  the  system  (1)  completely  because  it 
contains  all  the  given  numbers  appearing  in  (1). 


Geometric  Interpretation.  Existence  and  Uniqueness  of  Solutions 

If  m = n — 2,  we  have  two  equations  in  two  unknowns  xi,X2 


011*1  + 012*2  = h 
021*  1 + 022*2  = h- 


If  we  interpret  xi,  X2  as  coordinates  in  the  x i* 2-plane,  then  each  of  the  two  equations  represents  a straight  line, 
and  (jti,  X2 ) is  a solution  if  and  only  if  the  point  P with  coordinates  x\9  X2  lies  on  both  lines.  Hence  there  are 
three  possible  cases  (see  Fig.  158  on  next  page): 

(a)  Precisely  one  solution  if  the  lines  intersect 

( b ) Infinitely  many  solutions  if  the  lines  coincide 

(c)  No  solution  if  the  lines  are  parallel 
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Infinitely 
many  solutions 


For  instance, 


If  the  system  is  homogenous.  Case  (c)  cannot  happen,  because  then  those  two  straight  lines  pass  through  the 
origin,  whose  coordinates  (0,  0)  constitute  the  trivial  solution.  Similarly,  our  present  discussion  can  be  extended 
from  two  equations  in  two  unknowns  to  three  equations  in  three  unknowns.  We  give  the  geometric  interpretation 
of  three  possible  cases  concerning  solutions  in  Fig.  158.  Instead  of  straight  lines  we  have  planes  and  the  solution 
depends  on  the  positioning  of  these  planes  in  space  relative  to  each  other.  The  student  may  wish  to  come  up 
with  some  specific  examples. 


Fig.  158.  Three 
equations  in 
three  unknowns 
interpreted  as 
planes  in  space 


Our  simple  example  illustrated  that  a system  (1)  may  have  no  solution.  This  leads  to  such 
questions  as:  Does  a given  system  (1)  have  a solution?  Under  what  conditions  does  it  have 
precisely  one  solution?  If  it  has  more  than  one  solution,  how  can  we  characterize  the  set 
of  all  solutions?  We  shall  consider  such  questions  in  Sec.  7.5. 

First,  however,  let  us  discuss  an  important  systematic  method  for  solving  linear  systems. 

Gauss  Elimination  and  Back  Substitution 

The  Gauss  elimination  method  can  be  motivated  as  follows.  Consider  a linear  system  that 
is  in  triangular  form  (in  full,  upper  triangular  form)  such  as 

2x\  + 5x2  = 2 

13x2  = -26 


(Triangular  means  that  all  the  nonzero  entries  of  the  corresponding  coefficient  matrix  lie 
above  the  diagonal  and  form  an  upside-down  90°  triangle.)  Then  we  can  solve  the  system 
by  back  substitution,  that  is,  we  solve  the  last  equation  for  the  variable,  x 2 = —26/13  = —2, 
and  then  work  backward,  substituting  X2  = — 2 into  the  first  equation  and  solving  it  for  xi, 
obtainingxi  = -2  <2  — 5x2)  = \ (2  — 5 • (—2))  = 6.  This  gives  us  the  idea  of  first  reducing 
a general  system  to  triangular  form.  For  instance,  let  the  given  system  be 


2x1  + 5x2  = 2 

Its  augmented  matrix  is 

— 4x1  + 3x2  = —30. 


2 5 2 

-4  3 -30 


We  leave  the  first  equation  as  it  is.  We  eliminate  xi  from  the  second  equation,  to  get  a 
triangular  system.  For  this  we  add  twice  the  first  equation  to  the  second,  and  we  do  the  same 
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operation  on  the  rows  of  the  augmented  matrix.  This  gives  —4x1  + 4x  ] + 3^2  + 10x2  = 
-30  + 2-2,  that  is, 

2xi  + 5x2  = 2 2 5 2 

13x2  = “26  Row  2 + 2 Row  1013  —26 

where  Row  2 + 2 Row  1 means  “Add  twice  Row  1 to  Row  2“  in  the  original  matrix.  This 
is  the  Gauss  elimination  (for  2 equations  in  2 unknowns)  giving  the  triangular  form,  from 
which  back  substitution  now  yields  X2  = —2  and  xi  = 6,  as  before. 

Since  a linear  system  is  completely  determined  by  its  augmented  matrix.  Gauss 
elimination  can  be  done  by  merely  considering  the  matrices,  as  we  have  just  indicated. 
We  do  this  again  in  the  next  example,  emphasizing  the  matrices  by  writing  them  first  and 
the  equations  behind  them,  just  as  a help  in  order  not  to  lose  track. 


Gauss  Elimination.  Electrical  Network 

Solve  the  linear  system 


+ 

N 

H 

1 

H 

o 

II 

eo 

* 

-*1  + X2  - 

x3  = 0 

10X2  + 

25x3  = 90 

20x i + 10x2 

= 80. 

Derivation  from  the  circuit  in  Fig.  159  {Optional).  This  is  the  system  for  the  unknown  currents 
Xi  = i i,  *2  = *2>  x3  = *3  in  the  electrical  network  in  Fig.  159.  To  obtain  it,  we  label  the  currents  as  shown, 
choosing  directions  arbitrarily;  if  a current  will  come  out  negative,  this  will  simply  mean  that  the  current  flows 
against  the  direction  of  our  arrow.  The  current  entering  each  battery  will  be  the  same  as  the  current  leaving  it. 
The  equations  for  the  currents  result  from  Kirchhoff  s laws: 

Kirchhoff’s  Current  Law  (KCL).  At  any  point  of  a circuit,  the  sum  of  the  inflowing  currents  equals  the  sum 
of  the  outflowing  currents. 

Kirchhoff’s  Voltage  Law  (KVL).  In  any  closed  loop,  the  sum  of  all  voltage  drops  equals  the  impressed 
electromotive  force. 

Node  P gives  the  first  equation,  node  Q the  second,  the  right  loop  the  third,  and  the  left  loop  the  fourth,  as 
indicated  in  the  figure. 


— VW 


80  V 


20  Q,  „ ion 

AAA — 


10  n 


90  v 


Node  P: 
Node  Q: 
Right  loop: 


i,  - i0  + i0  = 0 


10  + 25L  = 90 


AW — 

15  Q. 


Left  loop:  20^  + 10^  =80 


Fig.  159.  Network  in  Example  2 and  equations  relating  the  currents 


Solution  by  Guuss  Elimination.  This  system  could  be  solved  rather  quickly  by  noticing  its  particular 
form.  But  this  is  not  the  point.  The  point  is  that  the  Gauss  elimination  is  systematic  and  will  work  in  general. 
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also  for  large  systems.  We  apply  it  to  our  system  and  then  do  back  substitution.  As  indicated,  let  us  write  the 
augmented  matrix  of  the  system  first  and  then  the  system  itself: 


Augmented  Matrix  A 


Pivot  1 » 

T 

0 

1 

I 0 
1 

-1 

1 

-1 

1 0 

Eliminate > 

0 

10 

25 

1 90 

20 

10 

0 

' 80 

Pivot  1 


Eliminate » 


Equations 

( *i)~  *2  + *3  = 0 

+ *2  “ *3=0 
10*2  + 25*3  = 90 
+ 10*2  = 80. 


20*i 


Step  1.  Elimination  of  x i 

Call  the  first  row  of  A the  pivot  row  and  the  first  equation  the  pivot  equation.  Call  the  coefficient  1 of  its 
*i-term  the  pivot  in  this  step.  Use  this  equation  to  eliminate  *i  (get  rid  of  *i)  in  the  other  equations.  For  this,  do: 

Add  1 times  the  pivot  equation  to  the  second  equation. 

Add  —20  times  the  pivot  equation  to  the  fourth  equation. 

This  corresponds  to  row  operations  on  the  augmented  matrix  as  indicated  in  BLUE  behind  the  new  matrix  in 
(3).  So  the  operations  are  performed  on  the  preceding  matrix.  The  result  is 


’l 

-1 

1 

1 

1 

o" 

*1  - *2  + *3=0 

0 

0 

0 

1 

1 

0 

Row  2 + Row  1 

0=0 

0 

10 

25 

1 

1 

1 

90 

10*2  + 25*3  = 90 

0 

30 

-20 

1 

1 

80 

Row  4 — 20  Row  1 

30*2  - 20*3  = 80 

Step  2.  Elimination  of  *2 

The  first  equation  remains  as  it  is.  We  want  the  new  second  equation  to  serve  as  the  next  pivot  equation.  But 
since  it  has  no  *2-term  (in  fact,  it  is  0 = 0),  we  must  first  change  the  order  of  the  equations  and  the  corresponding 
rows  of  the  new  matrix.  We  put  0 = 0 at  the  end  and  move  the  third  equation  and  the  fourth  equation  one  place 
up.  This  is  called  partial  pivoting  (as  opposed  to  the  rarely  used  total  pivoting , in  which  the  order  of  the 
unknowns  is  also  changed).  It  gives 


1 

-1 

1 l 
1 

0 

*1  - 

*2 

+ *3=0 

Pivot  10 » 

0 

0 

25  l 
1 

90 

Pivot  10 * 

+ 25*3  = 90 

nate  30 > 

0 

10 

-20  1 

80 

Eliminate  30*2 * 

30*2 

- 20*3  = 80 

0 

0 

0 1 

0 

0=0. 

To  eliminate  *2,  do: 

Add  —3  times  the  pivot  equation  to  the  third  equation. 
The  result  is 


1 

-1 

1 1 
1 

o” 

X 

1 

£ 

+ 

£ 

II 

0 

0 

10 

25  1 

1 

90 

10*2  + 25*3  = 

90 

0 

0 

-95  1 
1 

-190 

Row  3 — 3 Row  2 

- 95*3  = 

-190 

0 

0 

0 1 

0 

0 = 

0. 

Bcick  Substitution.  Determination  of  x 3,  *2,  *1  (in  this  order) 

Working  backward  from  the  last  to  the  first  equation  of  this  “triangular”  system  (4),  we  can  now  readily  find 
*3,  then  *2,  and  then  x\\ 


-95*3  =~190  *3  = *3  = 2 [A] 

10;t2  + 25*3  =90  *2  = ro (90  - 25*3)  = i2  = 4 [A] 

*1  - *2  + *3  = 0 *1  = *2  — *3  = 'l  = 2 [A] 


where  A stands  for  “amperes.”  This  is  the  answer  to  our  problem.  The  solution  is  unique. 
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Elementary  Row  Operations.  Row-Equivalent  Systems 

Example  2 illustrates  the  operations  of  the  Gauss  elimination.  These  are  the  first  two  of 
three  operations,  which  are  called 

Elementary  Row  Operations  for  Matrices: 

Interchange  of  two  rows 

Addition  of  a constant  multiple  of  one  row  to  another  row 
Multiplication  of  a row  by  a nonzero  constant  c 

CAUTION!  These  operations  are  for  rows,  not  for  columns ! They  correspond  to  the 
following 

Elementary  Operations  for  Equations: 

Interchange  of  two  equations 

Addition  of  a constant  multiple  of  one  equation  to  another  equation 
Multiplication  of  an  equation  by  a nonzero  constant  c 

Clearly,  the  interchange  of  two  equations  does  not  alter  the  solution  set.  Neither  does  their 
addition  because  we  can  undo  it  by  a corresponding  subtraction.  Similarly  for  their 
multiplication,  which  we  can  undo  by  multiplying  the  new  equation  by  1/c  (since  c A 0), 
producing  the  original  equation. 

We  now  call  a linear  system  S±  row-equivalent  to  a linear  system  .S'2  if  Si  can  be 
obtained  from  S2  by  (finitely  many!)  row  operations.  This  justifies  Gauss  elimination  and 
establishes  the  following  result. 


THEOREM  1 


Row-Equivalent  Systems 

Row -equivalent  linear  systems  have  the  same  set  of  solutions. 


Because  of  this  theorem,  systems  having  the  same  solution  sets  are  often  called 
equivalent  systems.  But  note  well  that  we  are  dealing  with  row  operations.  No  column 
operations  on  the  augmented  matrix  are  permitted  in  this  context  because  they  would 
generally  alter  the  solution  set. 

A linear  system  (1)  is  called  overdetermined  if  it  has  more  equations  than  unknowns, 
as  in  Example  2,  determined  if  m = n,  as  in  Example  1,  and  underdetermined  if  it  has 
fewer  equations  than  unknowns. 

Furthermore,  a system  (1)  is  called  consistent  if  it  has  at  least  one  solution  (thus,  one 
solution  or  infinitely  many  solutions),  but  inconsistent  if  it  has  no  solutions  at  all,  as 
Xi+  x2=  1,X!  + x2  = 0 in  Example  1,  Case  (c). 


Gauss  Elimination:  The  Three  Possible 
Cases  of  Systems 

We  have  seen,  in  Example  2,  that  Gauss  elimination  can  solve  linear  systems  that  have  a 
unique  solution.  This  leaves  us  to  apply  Gauss  elimination  to  a system  with  infinitely 
many  solutions  (in  Example  3)  and  one  with  no  solution  (in  Example  4). 
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EXAMPLE  3 


EXAMPLE  4 


Gauss  Elimination  if  Infinitely  Many  Solutions  Exist 

Solve  the  following  linear  system  of  three  equations  in  four  unknowns  whose  augmented  matrix  is 


3.0 

2.0 

2.0 

-5.0 

I 8.0 
1 

^.0*  j) 

+ 2.0x2 

+ 2.0*3 

- 5.0*4 = 

= 8.0 

0.6 

1.5 

1.5 

-5.4 

1 2.7 

Thus, 

0.6*  i 

+ 1.5*2 

+ 1.5*3 

- 5.4*4 = 

= 2.7 

1.2 

-0.3 

-0.3 

2.4 

1 2.1 

1.2*1 

- 0.3*2 

- 0.3*3 

+ 2.4x4  ~ 

= 2.1 

Solution.  As  in  the  previous  example,  we  circle  pivots  and  box  terms  of  equations  and  corresponding 
entries  to  be  eliminated.  We  indicate  the  operations  in  terms  of  equations  and  operate  on  both  equations  and 
matrices. 

Step  1.  Elimination  of  x i from  the  second  and  third  equations  by  adding 


—0. 6/3.0  = —0.2  times  the  first  equation  to  the  second  equation, 
— 1. 2/3.0  = —0.4  times  the  first  equation  to  the  third  equation. 


This  gives  the  following,  in  which  the  pivot  of  the  next  step  is  circled. 


3.0 

2.0 

2.0 

-5.0 1 

8.0 

3.0*i  + 2.0*2 

+ 2.0x3  — 5.0x4  — 8.0 

0 

1.1 

1.1 

-4.4  1 
1 

1.1 

Row  2 

- 0.2  Row  1 

+ 1.1*3  - 4.4*4  = 1.1 

0 

-1.1 

-1.1 

4.4 1 

-1.1 

Row  3 

— 0.4  Row  1 

-1.1*2 

- 1.1*3  + 4.4*4  = -11 

Step  2.  Elimination  of  x 2 from  the  third  equation  of  (6)  by  adding 

1.1/ 1.1  = 1 times  the  second  equation  to  the  third  equation. 


This  gives 


3.0 

2.0 

2.0 

-5.0 

1 8.0 

1 

3.0*1  + 2.0*2  + 2.0*3 

- 5.0*4  = 8.0 

0 

1.1 

1.1 

-4.4 

1 1.1 

1 

1.1*2  + 1.1*3 

- 4.4*4  =1.1 

0 

0 

0 

0 

1 0 

Row  3 + Row  2 

0 = 0. 

Back  Substitution.  From  the  second  equation,  x2  = 1 — *3  + 4^4.  From  this  and  the  first  equation, 
Xi  = 2 — X4.  Since  X3  and  X4  remain  arbitrary,  we  have  infinitely  many  solutions.  If  we  choose  a value  of  X3 
and  a value  of  X4,  then  the  corresponding  values  of  Xi  and  X2  are  uniquely  determined. 

Oft  Notation.  If  unknowns  remain  arbitrary,  it  is  also  customary  to  denote  them  by  other  letters  1 1,  t2, • • • . 
In  this  example  we  may  thus  write  Xi  = 2 — X4  = 2 — /2,  x2  = 1 — X3  + 4x4  = 1 — ?i  + 4f2,  *3  — h (first 
arbitrary  unknown),  X4  = f2  (second  arbitrary  unknown). 


Gauss  Elimination  if  no  Solution  Exists 

What  will  happen  if  we  apply  the  Gauss  elimination  to  a linear  system  that  has  no  solution?  The  answer  is  that 
in  this  case  the  method  will  show  this  fact  by  producing  a contradiction.  For  instance,  consider 


3 2 113 

I 

2 1 110 

I 

6 2 4 16 


(3*i)  + 2*2  + x3  = 3 
2xi  + x2  + X3  = 0 
6x1  + 2x2  + 4x3  — 6. 


Step  1.  Elimination  of  x 1 from  the  second  and  third  equations  by  adding 

— § times  the  first  equation  to  the  second  equation, 

— | = — 2 times  the  first  equation  to  the  third  equation. 


SEC.  7.3  Linear  Systems  of  Equations.  Gauss  Elimination 


279 


This  gives 


3 

2 

1 

l 3 

3*!  + 2x2  + x3  = 

3 

0 

1 

3 

1 

3 

l -2 
1 

Row  2 

— | Row  1 

(r  5*a)+  3*3  = 

-2 

0 

-2 

2 

1 0 

Row  3 

— 2 Row  1 

— 2x2|+  2x3  = 

0. 

Step  2.  Elimination  of  x 2 from  the  third  equation  gives 


’3 

2 

1 

1 3’ 

3*i  + 2x2  + x3  = 

3 

0 

1 

3 

1 

3 

1 -2 

II 

CO 

i-HM 

+ 

(N 

H 

>Hieo 

1 

- 2 

0 

0 

0 

1 12 

Row  3 — 6 Row  2 

0 = 

12 

The  false  statement  0 = 12  shows  that  the  system  has  no  solution. 

Row  Echelon  Form  and  Information  From  It 

At  the  end  of  the  Gauss  elimination  the  form  of  the  coefficient  matrix,  the  augmented 
matrix,  and  the  system  itself  are  called  the  row  echelon  form.  In  it,  rows  of  zeros,  if 
present,  are  the  last  rows,  and,  in  each  nonzero  row,  the  leftmost  nonzero  entry  is  farther 
to  the  right  than  in  the  previous  row.  For  instance,  in  Example  4 the  coefficient  matrix 
and  its  augmented  in  row  echelon  form  are 


_3 

2 

1 

3 

2 

1 

1 3 

| 

0 

1 

3 

1 

3 

and 

0 

1 

3 

1 

3 

-2 

0 

0 

0 

0 

0 

0 

i 12 

Note  that  we  do  not  require  that  the  leftmost  nonzero  entries  be  1 since  this  would  have 
no  theoretic  or  numeric  advantage.  (The  so-called  reduced  echelon  form,  in  which  those 
entries  are  1,  will  be  discussed  in  Sec.  7.8.) 

The  original  system  of  m equations  in  n unknowns  has  augmented  matrix  [Alb].  This 
is  to  be  row  reduced  to  matrix  [Rif].  The  two  systems  Ax  = b and  Rx  = f are  equivalent: 
if  either  one  has  a solution,  so  does  the  other,  and  the  solutions  are  identical. 

At  the  end  of  the  Gauss  elimination  (before  the  back  substitution),  the  row  echelon  form 
of  the  augmented  matrix  will  be 


(9) 


fi 

h 

fr 
fr+ 1 


Here,  r Si  m,  rn  A 0,  and  all  entries  in  the  blue  triangle  and  blue  rectangle  are  zero. 
The  number  of  nonzero  rows,  r,  in  the  row-reduced  coefficient  matrix  R is  called  the 
rank  of  R and  also  the  rank  of  A.  Here  is  the  method  for  determining  whether  Ax  = b 
has  solutions  and  what  they  are: 

(a)  No  solution.  If  r is  less  than  m (meaning  that  R actually  has  at  least  one  row  of 
all  Os)  and  at  least  one  of  the  numbers /r+i,/r+2>  • ' ' ,/m  is  not  zero,  then  the  system 
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Rx  = f is  inconsistent:  No  solution  is  possible.  Therefore  the  system  Ax  = b is 
inconsistent  as  well.  See  Example  4,  where  r = 2 < m = 3 and/r+i  = /3  = 12. 

If  the  system  is  consistent  (either  r = m,  or  r < m and  all  the  numbers /r+1,/r+2,  ■ ■ ■ , fm 
are  zero),  then  there  are  solutions. 

(b)  Unique  solution.  If  the  system  is  consistent  and  r = n,  there  is  exactly  one 
solution,  which  can  be  found  by  back  substitution.  See  Example  2,  where  r = n = 3 
and  m = 4. 

(c)  Infinitely  many  solutions.  To  obtain  any  of  these  solutions,  choose  values  of 
xr+i,  ■■■  ,xn  arbitrarily.  Then  solve  the  rth  equation  for  xr  (in  terms  of  those 
arbitrary  values),  then  the  (r  — l)st  equation  for  xr-\,  and  so  on  up  the  line.  See 
Example  3. 

Orientation.  Gauss  elimination  is  reasonable  in  computing  time  and  storage  demand. 
We  shall  consider  those  aspects  in  Sec.  20.1  in  the  chapter  on  numeric  linear  algebra. 
Section  7.4  develops  fundamental  concepts  of  linear  algebra  such  as  linear  independence 
and  rank  of  a matrix.  These  in  turn  will  be  used  in  Sec.  7.5  to  fully  characterize  the 
behavior  of  linear  systems  in  terms  of  existence  and  uniqueness  of  solutions. 


PROBLEM  SET71 


1-14 


GAUSS  ELIMINATION 


Solve  the  linear  system  given  explicitly  or  by  its  augmented 
matrix.  Show  details. 


1.  Ax  — 6y  = — 1 1 


—3x  + 8y  = 10 

3. 

X + 

y ~ z = 

9 

8y  + 6c  = - 

-6 

— 2x  + Ay  — 6z  — 40 

5. 

" 13 

12  -6 

-4 

7 -73 

11  - 

-13  157 

7. 

2 

4 1 

o' 

-1 

1 -2 

0 

4 

0 6 

0 

9. 

— 2y 

1 

N> 

II 

1 

00 

3x  + 4y 

- 5c  = 13 

11. 


0 5 5 -10 

2-3-3  6 

4 11-2 


2. 


4. 


6. 


8. 


10. 

o" 

2 

4 


3.0  -0.5 


1.5 

4 

5 
-9 

4 

-1 

3 


4.5 

1 

-3 

2 


2 

-6 

4y  + 3c  = 8 
2x  — z = 2 
3x  + 2y  =5 


0.6 
6.0 

0 4 

1 2 

-1  5 

3 16 

-5  -21 

1 7 


5 

-15 


-7 


21  -9 


17 

50 


12. 


13. 


2 -2  4 0 0 

-3  3 -6  5 15 

1 -1  2 0 0_ 

I Ox  + 4v  — 2 z—  —4 
— 3w  — 17x  + y + 2 z = 2 
w + x + y =6 

8w  — 34.x  + 16y  — 10c  = 4 


14. 


3 

-2 

-1 

4 


1 

5 

3 

-7 


-11  1 
-4  5 

-3  3 

2 -7 


15.  Equivalence  relation.  By  definition,  an  equivalence 
relation  on  a set  is  a relation  satisfying  three  conditions: 
(named  as  indicated) 

(i)  Each  element  A of  the  set  is  equivalent  to  itself 
(Reflexivity). 

(ii)  If  A is  equivalent  to  B,  then  B is  equivalent  to  A 
(Symmetry). 

(ill)  If  A is  equivalent  to  B and  B is  equivalent  to  C, 
then  A is  equivalent  to  C (Transitivity). 


Show  that  row  equivalence  of  matrices  satisfies  these 
three  conditions.  Hint.  Show  that  for  each  of  the  three 
elementary  row  operations  these  conditions  hold. 
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16.  CAS  PROJECT.  Gauss  Elimination  and  Back 
Substitution.  Write  a program  for  Gauss  elimination 
and  back  substitution  (a)  that  does  not  include  pivoting 
and  (b)  that  does  include  pivoting.  Apply  the  programs 
to  Probs.  11-14  and  to  some  larger  systems  of  your 
choice. 


17-21 


MODELS  OF  NETWORKS 


In  Probs.  17-19,  using  Kirchhoff  s laws  (see  Example  2) 
and  showing  the  details,  find  the  currents: 


17.  r 16  V 


32  V 


20.  Wheatstone  bridge.  Show  that  if  Rx/R3  = R1/R2  in 
the  figure,  then  1 = 0.  (Rq  is  the  resistance  of  the 
instrument  by  which  I is  measured.)  This  bridge  is  a 
method  for  determining  Rx.  R R2 , R3  are  known.  R3 
is  variable.  To  get  Rx,  make  / = 0 by  varying  R3.  Then 
calculate  Rx  = R3Ri/R2. 


Wheatstone  bridge 


Net  of  one-way  streets 


Problem  20 


Problem  21 


21.  Traffic  flow.  Methods  of  electrical  circuit  analysis 
have  applications  to  other  fields.  For  instance,  applying 


the  analog  of  Kirchhoff  s Current  Law,  find  the  traffic 
flow  (cars  per  hour)  in  the  net  of  one-way  streets  (in 
the  directions  indicated  by  the  arrows)  shown  in  the 
figure.  Is  the  solution  unique? 

22.  Models  of  markets.  Determine  the  equilibrium 
solution  (£>1  = Si,  D2  = S2 ) of  the  two-commodity 
market  with  linear  model  ( D , S,  P = demand,  supply, 
price;  index  1 = first  commodity,  index  2 = second 
commodity) 


£>!  = 40  - 2P1  - P2,  Si  = 4 P1  ~ P2  + 4, 

D2  = 5 Pi  - 2 P2  + 16,  S2  = 3 P2  - 4. 


23.  Balancing  a chemical  equation  viCaHg  + x202 
x3C02  + X4H2O  means  finding  integer  x4,  x2.  x3,  x4 
such  that  the  numbers  of  atoms  of  carbon  (C),  hydrogen 
(H),  and  oxygen  (O)  are  the  same  on  both  sides  of  this 
reaction,  in  which  propane  C3H8  and  02  give  carbon 
dioxide  and  water.  Find  the  smallest  positive  integers 
*1,  ■ ■ ■ ,x4. 

24.  PROJECT.  Elementary  Matrices.  The  idea  is  that 
elementary  operations  can  be  accomplished  by  matrix 
multiplication.  If  A is  an  m X n matrix  on  which  we 
want  to  do  an  elementary  operation,  then  there  is  a 
matrix  E such  that  EA  is  the  new  matrix  after  the 
operation.  Such  an  E is  called  an  elementary  matrix. 
This  idea  can  be  helpful,  for  instance,  in  the  design 
of  algorithms.  ( Computationally , it  is  generally  prefer- 
able to  do  row  operations  directly,  rather  than  by 
multiplication  by  E.) 

(a)  Show  that  the  following  are  elementary  matrices, 
for  interchanging  Rows  2 and  3,  for  adding  —5  times 
the  first  row  to  the  third,  and  for  multiplying  the  fourth 
row  by  8. 


Ei 


E2 


10  0 0 
0 0 10 
0 1 0 0 
0 0 0 1 _ 

1 000” 

0 10  0 
-5010’ 


E 


3 — 


0 0 0 1 
10  0 0 
0 10  0 
0 0 10 


0 0 0 8 
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Apply  Ei,  E2,  E3  to  a vector  and  to  a 4 X 3 matrix  of 
your  choice.  Find  B = E3E2E1A,  where  A = [fitj/c]  is 
the  general  4X2  matrix.  Is  B equal  to  C = EiE2E3A? 

(b)  Conclude  that  Ei,  E2,  E3  are  obtained  by  doing 
the  corresponding  elementary  operations  on  the  4X4 


unit  matrix.  Prove  that  ifM  is  obtained  from  A by  an 
elementary  row  operation,  then 

M = EA, 

where  E is  obtained  from  the  n X n unit  matrix  In  by 
the  same  row  operation. 


7.4  Linear  Independence.  Rank  of  a Matrix. 
Vector  Space 

Since  our  next  goal  is  to  fully  characterize  the  behavior  of  linear  systems  in  terms 
of  existence  and  uniqueness  of  solutions  (Sec.  7.5),  we  have  to  introduce  new 
fundamental  linear  algebraic  concepts  that  will  aid  us  in  doing  so.  Foremost  among 
these  are  linear  independence  and  the  rank  of  a matrix.  Keep  in  mind  that  these 
concepts  are  intimately  linked  with  the  important  Gauss  elimination  method  and  how 
it  works. 


Linear  Independence  and  Dependence  of  Vectors 

Given  any  set  of  m vectors  aQ),  • • • , a(m)  (with  the  same  number  of  components),  a linear 
combination  of  these  vectors  is  an  expression  of  the  form 


ciaci)  + c2^(2)  + • ■ ■ + cm  a(m) 


where  c1;  c2,  ■ ■ • , cm  are  any  scalars.  Now  consider  the  equation 


(1) 


ciaa)  + c2a(2)  + • ■ • + cma(m)  - 0. 


Clearly,  this  vector  equation  (1)  holds  if  we  choose  all  cfs  zero,  because  then  it  becomes 
0 = 0.  If  this  is  the  only  /72-tuple  of  scalars  for  which  (1)  holds,  then  our  vectors 
aQ),  • • ■ , a(m)  are  said  to  form  a linearly  independent  set  or,  more  briefly,  we  call  them 
linearly  independent.  Otherwise,  if  (1)  also  holds  with  scalars  not  all  zero,  we  call  these 
vectors  linearly  dependent.  This  means  that  we  can  express  at  least  one  of  the  vectors 
as  a linear  combination  of  the  other  vectors.  For  instance,  if  (1)  holds  with,  say, 
Ci  f=  0,  we  can  solve  (1)  for  a(i): 


*t(i ) ^2*1(2)  T • ■ • + krnti(m)  where  kj  cj/ Ci. 

(Some  kf  s may  be  zero.  Or  even  all  of  them,  namely,  if  a(i)  = 0.) 

Why  is  linear  independence  important?  Well,  if  a set  of  vectors  is  linearly 
dependent,  then  we  can  get  rid  of  at  least  one  or  perhaps  more  of  the  vectors  until  we 
get  a linearly  independent  set.  This  set  is  then  the  smallest  “truly  essential”  set  with 
which  we  can  work.  Thus,  we  cannot  express  any  of  the  vectors,  of  this  set,  linearly 
in  terms  of  the  others. 
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EXAMPLE  1 


DEFINITION 


EXAMPLE  2 


THEOREM  1 


Linear  Independence  and  Dependence 

The  three  vectors 

a<i,  = [ 3 0 2 2] 

a<2)  = [-6  42  24  54] 

aC3)  = [21  -21  0 -15] 


are  linearly  dependent  because 


68(1)  2 8(2)  8(3)  — 0. 


Although  this  is  easily  checked  by  vector  arithmetic  (do  it!),  it  is  not  so  easy  to  discover.  However,  a systematic 
method  for  finding  out  about  linear  independence  and  dependence  follows  below. 

The  first  two  of  the  three  vectors  are  linearly  independent  because  Cia(i)  + c2a(2)  = 0 implies  c2  = 0 (from 
the  second  components)  and  then  Cy  = 0 (from  any  other  component  of  a(i). 

Rank  of  a Matrix 


The  rank  of  a matrix  A is  the  maximum  number  of  linearly  independent  row  vectors 
of  A.  It  is  denoted  by  rank  A. 


Our  further  discussion  will  show  that  the  rank  of  a matrix  is  an  important  key  concept  for 
understanding  general  properties  of  matrices  and  linear  systems  of  equations. 

Rank 

The  matrix 


3 

0 

2 

2 

(2) 

A = 

-6 

42 

24 

54 

21 

-21 

0 

-15 

has  rank  2,  because  Example  1 shows  that  the  first  two  row  vectors  are  linearly  independent,  whereas  all  three 
row  vectors  are  linearly  dependent. 

Note  further  that  rank  A = 0 if  and  only  if  A = 0.  This  follows  directly  from  the  definition. 


We  call  a matrix  Ay  row-equivalent  to  a matrix  A2  if  Ay  can  be  obtained  from  A2  by 
(finitely  many!)  elementary  row  operations. 

Now  the  maximum  number  of  linearly  independent  row  vectors  of  a matrix  does  not 
change  if  we  change  the  order  of  rows  or  multiply  a row  by  a nonzero  c or  take  a linear 
combination  by  adding  a multiple  of  a row  to  another  row.  This  shows  that  rank  is 
invariant  under  elementary  row  operations: 


Row-Equivalent  Matrices 

Row -equivalent  matrices  have  the  same  rank. 


Hence  we  can  determine  the  rank  of  a matrix  by  reducing  the  matrix  to  row-echelon 
form,  as  was  done  in  Sec.  7.3.  Once  the  matrix  is  in  row-echelon  form,  we  count  the 
number  of  nonzero  rows,  which  is  precisely  the  rank  of  the  matrix. 
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EXAMPLE  3 


THEOREM  2 


THEOREM  3 


PROOF 


Determination  of  Rank 

For  the  matrix  in  Example  2 we  obtain  successively 


3 

0 

2 

2 

-6 

42 

24 

54 

(given) 

21 

-21 

0 

-15  _ 

3 

0 

2 

2~ 

0 

42 

28 

58 

Row  2 + 2 Row  1 

0 

-21 

-14 

-29 

Row  3 — 7 Row  1 

3 

0 

2 

2~ 

0 

42 

28 

58 

0 

0 

0 

0 

Row  3+2  Row  2 

The  last  matrix  is  in  row-echelon  form  and  has  two  nonzero  rows.  Hence  rank  A = 2,  as  before. 


Examples  1-3  illustrate  the  following  useful  theorem  (with  p = 3,  n = 3,  and  the  rank  of 
the  matrix  = 2). 


Linear  Independence  and  Dependence  of  Vectors 

Consider  p vectors  that  each  have  n components.  Then  these  vectors  are  linearly 
independent  if  the  matrix  formed,  with  these  vectors  as  row  vectors,  has  rank  p. 
However,  these  vectors  are  linearly  dependent  if  that  matrix  has  rank  less  than  p. 


Further  important  properties  will  result  from  the  basic 


Rank  in  Terms  of  Column  Vectors 

The  rank  r of  a matrix  A equals  the  maximum  number  of  linearly  independent 
column  vectors  of  A. 

Hence  A and  its  transpose  AT  have  the  same  rank. 


In  this  proof  we  write  simply  “rows”  and  “columns”  for  row  and  column  vectors.  Let  A 
be  an  m X n matrix  of  rank  A = r.  Then  by  definition  of  rank,  A has  r linearly  independent 
rows  which  we  denote  by  V(d,  • • • , v(r)  (regardless  of  their  position  in  A),  and  all  the  rows 
a(i),  • • • , a(TO)  of  A are  linear  combinations  of  those,  say, 


3(1)  - CllV(l)  + Ci2V(2)  + • • • + ClrV(r) 
3(2)  = C2lV(l)  + C22V(2)  + • ■ ■ + C2rV(r) 


3 (m)  ('m  A(1 ) T"  Cm2V(2)  + • ' ' + Crrlr\(ri. 
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EXAMPLE  4 


THEOREM  4 


PROOF 


These  are  vector  equations  for  rows.  To  switch  to  columns,  we  write  (3)  in  terms  of 
components  as  n such  systems,  with  k = !,•••,«, 


(4) 


At fc  — CnUife  + Ci2U2fc  + ' ’ ' + CirUrk 
a2k  — c21^1k  c22^2k  + ' ' ' + C2 r^rk 


Qmk  fmlTlfc  T ^m2^2k  T • * • + CmrUrk 


and  collect  components  in  columns.  Indeed,  we  can  write  (4)  as 


Olk 

Cll 

c12 

Cir 

«2  fc 

= vlk 

C21 

+ t>2k 

c22 

+ • ' ■ + Vrk 

c2r 

ttmk 

Cml 

cm2 

CjYir 

where  k = 1,  ■ • • , n.  Now  the  vector  on  the  left  is  the  Ath  column  vector  of  A.  We  see  that 
each  of  these  n columns  is  a linear  combination  of  the  same  r columns  on  the  right.  Hence 
A cannot  have  more  linearly  independent  columns  than  rows,  whose  number  is  rank  A = r. 
Now  rows  of  A are  columns  of  the  transpose  AT.  For  AT  our  conclusion  is  that  AT  cannot 
have  more  linearly  independent  columns  than  rows,  so  that  A cannot  have  more  linearly 
independent  rows  than  columns.  Together,  the  number  of  linearly  independent  columns 
of  A must  be  r,  the  rank  of  A.  This  completes  the  proof. 

Illustration  of  Theorem  3 

The  matrix  in  (2)  has  rank  2.  From  Example  3 we  see  that  the  first  two  row  vectors  are  linearly  independent 
and  by  “working  backward"  we  can  verify  that  Row  3 = 6 Row  1 — j Row  2.  Similarly,  the  first  two  columns 
are  linearly  independent,  and  by  reducing  the  last  matrix  in  Example  3 by  columns  we  find  that 

Column  3 = | Column  1 + § Column  2 and  Column  4 = § Column  1 + §f  Column  2. 

Combining  Theorems  2 and  3 we  obtain 


Linear  Dependence  of  Vectors 

Consider  p vectors  each  having  n components.  If  n < p,  then  these  vectors  are 
linearly  dependent. 


The  matrix  A with  those  p vectors  as  row  vectors  has  p rows  and  n < p columns;  hence 
by  Theorem  3 it  has  rank  A Si  n < p,  which  implies  linear  dependence  by  Theorem  2. 

Vector  Space 

The  following  related  concepts  are  of  general  interest  in  linear  algebra.  In  the  present 
context  they  provide  a clarification  of  essential  properties  of  matrices  and  their  role  in 
connection  with  linear  systems. 
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EXAMPLE  5 


THEOREM  5 


PROOF 


Consider  a nonempty  set  V of  vectors  where  each  vector  has  the  same  number  of 
components.  If,  for  any  two  vectors  a and  b in  V,  we  have  that  all  their  linear  combinations 
era  + /3b  ( a , [3  any  real  numbers)  are  also  elements  of  V,  and  if,  furthermore,  a and  b satisfy 
the  laws  (3a),  (3c),  (3d),  and  (4)  in  Sec.  7.1,  as  well  as  any  vectors  a,  b,  c in  V satisfy  (3b) 
then  V is  a vector  space.  Note  that  here  we  wrote  laws  (3)  and  (4)  of  Sec.  7.1  in  lowercase 
letters  a,  b,  c,  which  is  our  notation  for  vectors.  More  on  vector  spaces  in  Sec.  7.9. 

The  maximum  number  of  linearly  independent  vectors  in  V is  called  the  dimension  of 
V and  is  denoted  by  dim  V.  Here  we  assume  the  dimension  to  be  finite;  infinite  dimension 
will  be  defined  in  Sec.  7.9. 

A linearly  independent  set  in  V consisting  of  a maximum  possible  number  of  vectors 
in  V is  called  a basis  for  V.  In  other  words,  any  largest  possible  set  of  independent  vectors 
in  V forms  basis  for  V.  That  means,  if  we  add  one  or  more  vector  to  that  set,  the  set  will 
be  linearly  dependent.  (See  also  the  beginning  of  Sec.  7.4  on  linear  independence  and 
dependence  of  vectors.)  Thus,  the  number  of  vectors  of  a basis  for  V equals  dim  V. 

The  set  of  all  linear  combinations  of  given  vectors  aQ),  • • • , a(p)  with  the  same  number 
of  components  is  called  the  span  of  these  vectors.  Obviously,  a span  is  a vector  space.  If 
in  addition,  the  given  vectors  aQ),  ■ • • , a(p)  are  linearly  independent,  then  they  form  a basis 
for  that  vector  space. 

This  then  leads  to  another  equivalent  definition  of  basis.  A set  of  vectors  is  a basis  for 
a vector  space  V if  (1)  the  vectors  in  the  set  are  linearly  independent,  and  if  (2)  any  vector 
in  V can  be  expressed  as  a linear  combination  of  the  vectors  in  the  set.  If  (2)  holds,  we 
also  say  that  the  set  of  vectors  spans  the  vector  space  V. 

By  a subspace  of  a vector  space  V we  mean  a nonempty  subset  of  V (including  V itself) 
that  forms  a vector  space  with  respect  to  the  two  algebraic  operations  (addition  and  scalar 
multiplication)  defined  for  the  vectors  of  V. 

Vector  Space,  Dimension,  Basis 

The  span  of  the  three  vectors  in  Example  1 is  a vector  space  of  dimension  2.  A basis  of  this  vector  space  consists 
of  any  two  of  those  three  vectors,  for  instance,  a(1),  a(2,,  or  a(D,  a(3),  etc. 

We  further  note  the  simple 


Vector  Space  R” 

The  vector  space  Rn  consisting  of  all  vectors  with  n components  ( n real  numbers) 
has  dimension  n. 

A basis  of  n vectors  is  aQ)  = [1  0 •••  0],  a^  = [0  1 0 •••  0],  •••, 

a(„)  = [0  •••  0 1], 

For  a matrix  A,  we  call  the  span  of  the  row  vectors  the  row  space  of  A.  Similarly,  the 
span  of  the  column  vectors  of  A is  called  the  column  space  of  A. 

Now,  Theorem  3 shows  that  a matrix  A has  as  many  linearly  independent  rows  as 
columns.  By  the  definition  of  dimension,  their  number  is  the  dimension  of  the  row  space 
or  the  column  space  of  A.  This  proves 


Row  Space  and  Column  Space 

The  row  space  and  the  column  space  of  a matrix  A have  the  same  dimension,  equal 
to  rank  A. 


THEOREM  6 


SEC.  7.4 
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Finally,  for  a given  matrix  A the  solution  set  of  the  homogeneous  system  Ax  = 0 is  a 
vector  space,  called  the  null  space  of  A,  and  its  dimension  is  called  the  nullity  of  A.  In 
the  next  section  we  motivate  and  prove  the  basic  relation 

(6)  rank  A + nullity  A = Number  of  columns  of  A. 


PROBLEM  S E T 7 74 


1-10 


RANK,  ROW  SPACE,  COLUMN  SPACE 


Find  the  rank.  Find  a basis  for  the  row  space.  Find  a basis 
for  the  column  space.  Hint.  Row-reduce  the  matrix  and  its 
transpose.  (You  may  omit  obvious  factors  from  the  vectors 
of  these  bases.) 


4 

-2 

6 

a 

b 

1. 

-2 

1 

-3 

2. 

b 

a 

~0 

3 

5~ 

6 

-4 

0" 

3. 

3 

5 

0 

4. 

-4 

0 

2 

5 

0 

10 

0 

2 

6 

~0.2 

0.1 

0.4" 

0 

1 

0" 

5. 

0 

1.1 

-0.3 

6. 

-1 

0 

-4 

0.1 

0 

-2.1 

0 

4 

0 

2 

4 

8 

16 

7. 


0 4 0 
2 0 4 
0 2 0 


8. 


9. 


9 0 10 

0 0 10 
1111 
0 0 10 


10. 


16  8 4 

4 8 16 

2 16  8 
5 -2  1 

-2  0 -4 

1 -4  -11 

0 1 2 


11.  CAS  Experiment.  Rank,  (a)  Show  experimentally 
that  the  n X n matrix  A = [a^]  with  = j + k — 1 
has  rank  2 for  any  n.  (Problem  20  shows  n = 4.)  Try 
to  prove  it. 

(b)  Do  the  same  when  — j + k + c,  where  c is  any 
positive  integer. 

(c)  What  is  rank  A if  Oj^  = 2J+fc-2?  Try  to  find  other 
large  matrices  of  low  rank  independent  of  n. 


12-16 


GENERAL  PROPERTIES  OF  RANK 


Show  the  following: 

12.  rank  BTAT  = rank  AB.  (Note  the  order!) 

13.  rank  A = rank  B does  not  imply  rank  A2  = rank  B2. 
(Give  a counterexample.) 


14.  If  A is  not  square,  either  the  row  vectors  or  the  column 
vectors  of  A are  linearly  dependent. 


15.  If  the  row  vectors  of  a square  matrix  are  linearly 
independent,  so  are  the  column  vectors,  and  conversely. 

16.  Give  examples  showing  that  the  rank  of  a product  of 
matrices  cannot  exceed  the  rank  of  either  factor. 


17-25 


LINEAR  INDEPENDENCE 


Are  the  following  sets  of  vectors  linearly  independent? 
Show  the  details  of  your  work. 


17.  [3 

4 

0 2],  [2  -1  3 7], 

[1 

16 

-12  -22] 

18.  [1 

1 

2 

1 li  rl  1 1 li  rl  1 1 

q 4.  , 9 q a k , q a r: 

b], 

pi 

1 

1 *1 

u 

5 

6 7J 

19.  [0 

1 

i].  [i  1 i].  [0  0 1] 

20.  [1 

2 

3 4],  [2  3 4 5],  [3  4 5 

6], 

[4 

5 

6 7] 

21.  [2 

0 

0 7],  [2  0 0 8],  [2  0 0 

9], 

[2 

0 

1 0] 

22.  [0.4 

-0.2  0.2],  [0  0 0],  [3.0  -0.6  1.5] 

23.  [9 

8 

7 6 5],  [9  7 5 3 1] 

24.  [4 

-1 

3],  [0  8 1],  [1  3 -5], 

[2 

6 

1] 

25.  [6 

0 

-1  3],  [2  2 5 0], 

[-4 

-4  -4  -4] 

26.  Linearly  independent  subset.  Beginning 

with  the 

last 

of 

the  vectors  [3  0 1 2],  [6  1 

0 0], 

[12 

1 

2 4],  [6  0 2 4],  and  [9  0 

1 2], 

omit  one  after  another  until  you  get  a linearly 
independent  set. 
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VECTOR  SPACE 

Is  the  given  set  of  vectors  a vector  space?  Give  reasons.  If 
your  answer  is  yes,  determine  the  dimension  and  find  a 
basis,  (iq,  v2,  - ■ ■ denote  components.) 

27.  All  vectors  in  R3  with  v1  — v2  + 2v3  = 0 

28.  All  vectors  in  R3  with  3v2  + v3  = k 

29.  All  vectors  in  R2  with  iq  £ v2 

30.  All  vectors  in  Rn  with  the  first  n — 2 components  zero 


31.  All  vectors  in  R5  with  positive  components 

32.  All  vectors  in  R3  with  3tq  — 2v2  + v3  = 0, 
4iq  + 5v2  = 0 

33.  All  vectors  in  R3  with  3tq  — V3  = 0, 

2iq  + 3v2  - 4-V3  = 0 

34.  All  vectors  in  Rn  with  |uj|  = 1 for)  = 1,  ■ ■ ■ , n 

35.  All  vectors  in  R^  with  iq  — 2v2  = 3v3  = 4iq 


7.5  Solutions  of  Linear  Systems: 

Existence,  Uniqueness 

Rank,  as  just  defined,  gives  complete  information  about  existence,  uniqueness,  and  general 
structure  of  the  solution  set  of  linear  systems  as  follows. 

A linear  system  of  equations  in  n unknowns  has  a unique  solution  if  the  coefficient 
matrix  and  the  augmented  matrix  have  the  same  rank  n,  and  infinitely  many  solutions  if 
that  common  rank  is  less  than  n.  The  system  has  no  solution  if  those  two  matrices  have 
different  rank. 

To  state  this  precisely  and  prove  it,  we  shall  use  the  generally  important  concept  of  a 
submatrix  of  A.  By  this  we  mean  any  matrix  obtained  from  A by  omitting  some  rows  or 
columns  (or  both).  By  definition  this  includes  A itself  (as  the  matrix  obtained  by  omitting 
no  rows  or  columns);  this  is  practical. 


THEOREM  1 


Fundamental  Theorem  for  Linear  Systems 

(a)  Existence.  A linear  system  of  m equations  in  n unknowns  x1;  • • • , xn 

a±iXi  + ai2x2  + ■ ■ ■ + alnxn  = bx 
®2iAi  tz22x2  + • ■ • + a2nxn  — b2 

am2x2  T ■ ■ • + arnnXrl  bm 

is  consistent,  that  is,  has  solutions,  if  and  only  if  the  coefficient  matrix  A and  the 
augmented  matrix  A have  the  same  rank.  Here, 


an 

a\n 

and  A = 

an 

@ln 

^1 

aml 

@mn 

a ml 

&mn 

(b)  Uniqueness.  The  system  (1)  has  precisely  one  solution  if  and  only  if  this 
common  rank  r of  A and  A equals  n. 
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(c)  Infinitely  many  solutions.  If  this  common  rank  r is  less  than  n,  the  system 
(1)  has  infinitely  many  solutions.  All  of  these  solutions  are  obtained  by  determining 
r suitable  unknowns  ( whose  submatrix  of  coefficients  must  have  rank  r)  in  terms  of 
the  remaining  n — r unknowns,  to  which  arbitrary  values  can  be  assigned.  (See 
Example  3 in  Sec.  7.3.) 

(d)  Gauss  elimination  (Sec.  7.3).  If  solutions  exist,  they  can  all  be  obtained  by 
the  Gauss  elimination.  (This  method  will  automatically  reveal  whether  or  not 
solutions  exist;  see  Sec.  7.3.) 


PROOF  (a)  We  can  write  the  system  (1)  in  vector  form  Ax  = b or  in  terms  of  column  vectors 
C(l>  ‘ » C(n)  of  A- 

(2)  ca)xi  + C(2)X2  + • • • + c (n)xn  = b. 

A is  obtained  by  augmenting  A by  a single  column  b.  Hence,  by  Theorem  3 in  Sec.  7.4, 
rank  A equals  rank  A or  rank  A + 1.  Now  if  (1)  has  a solution  x,  then  (2)  shows  that  b 
must  be  a linear  combination  of  those  column  vectors,  so  that  A and  A have  the  same 
maximum  number  of  linearly  independent  column  vectors  and  thus  the  same  rank. 

Conversely,  if  rank  A = rank  A,  then  b must  be  a linear  combination  of  the  column 
vectors  of  A,  say, 

(2*)  b = aic(i)  + • • • + an  c(n) 

since  otherwise  rank  A = rank  A 4-  1.  But  (2*)  means  that  (1)  has  a solution,  namely, 
X\  = a i,  • ■ • , xn  = an,  as  can  be  seen  by  comparing  (2*)  and  (2). 

(b)  If  rank  A = n,  the  n column  vectors  in  (2)  are  linearly  independent  by  Theorem  3 
in  Sec.  7.4.  We  claim  that  then  the  representation  (2)  of  b is  unique  because  otherwise 


^(1)-*T  T ■ • • + C(n)Xn  Cq).Y;l  T ■ ■ • + C cri)Xn. 

This  would  imply  (take  all  terms  to  the  left,  with  a minus  sign) 

(xi  - Xi)C(d  + • • • + (xn  - xn)c(n)  = 0 

and*!  — ~X\  = xn  — xn  = 0 by  linear  independence.  But  this  means  that  the  scalars 

Xi,  ■ ■ ■ , xn  in  (2)  are  uniquely  determined,  that  is,  the  solution  of  (1)  is  unique. 

(c)  If  rank  A = rank  A = r < n,  then  by  Theorem  3 in  Sec.  7.4  there  is  a linearly 
independent  set  K of  r column  vectors  of  A such  that  the  other  n — r column  vectors  of 
A are  linear  combinations  of  those  vectors.  We  renumber  the  columns  and  unknowns, 
denoting  the  renumbered  quantities  by  ",  so  that  {c(1),  • • • , C(r) } is  that  linearly  independent 
set  K.  Then  (2)  becomes 

C(l)^T  + ‘ ‘ ‘ + C(r,.Vr  T ‘ - b, 

C(r+i),  ■ • • , C(n)  are  linear  combinations  of  the  vectors  of  K,  and  so  are  the  vectors 
xr+  iC(r+1),  • • ■ , xnc(n).  Expressing  these  vectors  in  terms  of  the  vectors  of  K and  collect- 
ing terms,  we  can  thus  write  the  system  in  the  form 


(3) 


^(1)34  T • * • + c (j.yyr  b 
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THEOREM  2 


PROOF 


with  y;  = Xj  + /3j , where  [3j  results  from  the  n — r terms  c(r+1)jcr+1,  ■ • • , c(n)xn.  here, 
j = 1,  • ■ ■ , r.  Since  the  system  has  a solution,  there  are  y i , ■ • ■ , yr  satisfying  (3).  These 
scalars  are  unique  since  K is  linearly  independent.  Choosing  xr+1,  ■■  ■ ,xn  fixes  the  /3;  and 
corresponding  x,-,  = y7  — f3j,  where  j = 1 , • • • , r. 

(d)  This  was  discussed  in  Sec.  7.3  and  is  restated  here  as  a reminder. 

The  theorem  is  illustrated  in  Sec.  7.3.  In  Example  2 there  is  a unique  solution  since  rank 
A = rank  A = n = 3 (as  can  be  seen  from  the  last  matrix  in  the  example).  In  Example  3 
we  have  rank  A = rank  A = 2 < n = 4 and  can  choose  X3  and  x4  arbitrarily.  In 
Example  4 there  is  no  solution  because  rank  A = 2 < rank  A = 3. 


Homogeneous  Linear  System 

Recall  from  Sec.  7.3  that  a linear  system  (1)  is  called  homogeneous  if  all  the  bj’s  are 
zero,  and  nonhomogeneous  if  one  or  several  bj’s  are  not  zero.  For  the  homogeneous 
system  we  obtain  from  the  Fundamental  Theorem  the  following  results. 


Homogeneous  Linear  System 

A homogeneous  linear  system 

allxl  + + ' 

“f"  0 

«21«1  + «22-<2  + ' 
(4) 

“h  Cl2n^n  0 

~h  drnnXn  0 

always  has  the  trivial  solution  X\  = 0,  ■ ■ • , xn  = 0.  Nontrivial  solutions  exist  if  and 
only  if  rank  A < n.  If  rank  A = r < n,  these  solutions,  together  with  x = 0,  form  a 
vector  space  (see  Sec.  7.4)  of  dimension  n — r called  the  solution  space  of  (4). 

In  particular,  ifx q)  and  x(2)  are  solution  vectors  of  (A),  then  x = ciXq)  + c2x(2) 
with  any  scalars  ci  and  c2  is  a solution  vector  of  (4).  (This  does  not  hold  for 
nonhomogeneous  systems.  Also,  the  term  solution  space  is  used  for  homogeneous 
systems  only.) 

The  first  proposition  can  be  seen  directly  from  the  system.  It  agrees  with  the  fact  that 
b = 0 implies  that  rank  A = rank  A,  so  that  a homogeneous  system  is  always  consistent. 
If  rank  A = n,  the  trivial  solution  is  the  unique  solution  according  to  (b)  in  Theorem  1 . 
If  rank  A < n,  there  are  nontrivial  solutions  according  to  (c)  in  Theorem  1 . The  solutions 
form  a vector  space  because  if  x(1)  and  x(2)  are  any  of  them,  then  Ax(1)  = 0,  Ax(2)  = 0, 
and  this  implies  A(x(1)  + x(2))  = Ax(1)  + Ax(2)  = 0 as  well  as  A(cx(1))  = rAx(1)  = 0, 
where  c is  arbitrary.  If  rank  A = r < n,  Theorem  1 (c)  implies  that  we  can  choose  n — r 
suitable  unknowns,  call  them  xr+i,  ■ • • , xn,  in  an  arbitrary  fashion,  and  every  solution  is 
obtained  in  this  way.  Hence  a basis  for  the  solution  space,  briefly  called  a basis  of 
solutions  of  (4),  is  y(1),  • • ■ , y(n_r),  where  the  basis  vector  y(j}  is  obtained  by  choosing 
xr+j  = 1 and  the  other  xr+1,  • ■ • , xn  zero;  the  corresponding  first  r components  of  this 
solution  vector  are  then  determined.  Thus  the  solution  space  of  (4)  has  dimension  n — r. 
This  proves  Theorem  2. 
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The  solution  space  of  (4)  is  also  called  the  null  space  of  A because  Ax  = 0 for  every  x in 
the  solution  space  of  (4).  Its  dimension  is  called  the  nullity  of  A.  Hence  Theorem  2 states  that 

(5)  rank  A + nullity  A = n 

where  n is  the  number  of  unknowns  (number  of  columns  of  A). 

Furthermore,  by  the  definition  of  rank  we  have  rank  A 2§  in  in  (4).  Hence  if  in  < n, 
then  rank  A < n.  By  Theorem  2 this  gives  the  practically  important 


THEOREM  3 


Homogeneous  Linear  System  with  Fewer  Equations  Than  Unknowns 

A homogeneous  linear  system  with  fewer  equations  than  unknowns  always  has 
nontrivial  solutions. 


Nonhomogeneous  Linear  Systems 

The  characterization  of  all  solutions  of  the  linear  system  (1)  is  now  quite  simple,  as  follows. 


THEOREM  4 


Nonhomogeneous  Linear  System 

If  a nonhomogeneous  linear  system  (1)  is  consistent,  then  all  of  its  solutions  are 
obtained  as 

(6)  x = x0  + xh 

where  x0  is  any  (fixed)  solution  of  { 1)  and  x^  runs  through  all  the  solutions  of  the 
corresponding  homogeneous  system  (4). 


PROOF  The  difference  xjj  = x - Xo  of  any  two  solutions  of  (1)  is  a solution  of  (4)  because 
A Xh  = A(x  — Xo)  = Ax  — Axo  = b — b = 0.  Since  x is  any  solution  of  (1),  we  get  all 
the  solutions  of  (1)  if  in  (6)  we  take  any  solution  x0  of  (1)  and  let  xh  vary  throughout  the 
solution  space  of  (4).  ■ 

This  covers  a main  part  of  our  discussion  of  characterizing  the  solutions  of  systems  of 
linear  equations.  Our  next  main  topic  is  determinants  and  their  role  in  linear  equations. 


7.6  For  Reference: 

Second-  and  Third-Order  Determinants 


We  created  this  section  as  a quick  general  reference  section  on  second-  and  third-order 
determinants.  It  is  completely  independent  of  the  theory  in  Sec.  7.7  and  suffices  as  a 
reference  for  many  of  our  examples  and  problems.  Since  this  section  is  for  reference,  go 
on  to  the  next  section,  consulting  this  material  only  when  needed. 

A determinant  of  second  order  is  denoted  and  defined  by 


(1) 


D 


detA  = 


an 


021 


fl12 

a22 


— al\a22  a12a21- 


So  here  we  have  bars  (whereas  a matrix  has  brackets ). 
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EXAMPLE  1 


(2) 


Cramer’s  rule  for  solving  linear  systems  of  two  equations  in  two  unknowns 

(a)  anXi  + a12x2  = by 

(b)  a2\x  i + a22x2  = b2 


x1  = 


(3) 


*2  = 


with  D as  in  (1),  provided 


by  fli2 

b-2  a22 

b\a22  — a\2b2 

D 

D 

«n  b\ 

«2i  b2 

a\\b2  — b\a2\ 

D 

D 

D ¥=  0. 

The  value  D = 0 appears  for  homogeneous  systems  with  nontrivial  solutions. 


We  prove  (3).  To  eliminate  x2  multiply  (2a)  by  a22  and  (2b)  by  — «12  and  add, 

(fllla22  — a12a‘2l)x  l = b\Cl22  — U\2h2. 

Similarly,  to  eliminate  X\  multiply  (2a)  by  — a2 \ and  (2b)  by  ay  1 and  add, 

(allfl22  ~ ^12^21)^2  = a11^2  ~ b\0.2\. 

Assuming  that  D = a 1 1 a22  — a y 2a2  y A 0,  dividing,  and  writing  the  right  sides  of  these 
two  equations  as  determinants,  we  obtain  (3). 


Cramer’s  Rule  for  Two  Equations 


12  3 


4 12 


If 


4*1  + 3x2  — 12 
2xi  + 5x2  = —8 


-8  5 

84  a 

= =6,  19  = 

14 

2 -8 

-56 

4 

3 

4 

3 

14 

2 

5 

2 

5 

Third-Order  Determinants 

A determinant  of  third  order  can  be  defined  by 


(4)  D 


flll 

fl12 

«13 

«21 

«22 

a23 

fl31 

a32 

a33 

a22 

a23 

fl12 

Ol3 

fl12 

a\3 

«11 

~ 021 

+ «31 

a32 

a33 

a32 

033 

a22 

a23 
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Note  the  following.  The  signs  on  the  right  are  + — +.  Each  of  the  three  terms  on  the 
right  is  an  entry  in  the  first  column  of  D times  its  minor,  that  is,  the  second-order 
determinant  obtained  from  D by  deleting  the  row  and  column  of  that  entry;  thus,  for  an 
delete  the  first  row  and  first  column,  and  so  on. 

If  we  write  out  the  minors  in  (4),  we  obtain 

(4*)  D = Cl\  1«22«33  ~ U\ 1O23O32  + «21fl13,<:,.32  ~ a21a12<^33  + (l3\d\2a23  — d31a13a22- 

Cramer’s  Rule  for  Linear  Systems  of  Three  Equations 

on*  i + a12*2  + o13x3  = h 

(5)  021*1  + (1-22X2  + «23*3  = h2 

031*1  + «32*2  + 033*3  = 

is 

D i Z^2  Z^3 

(6)  *1  = — , *2  = — > *3  = — (D  =h  0) 

with  the  determinant  D of  the  system  given  by  (4)  and 


b\ 

a12 

013 

Oil 

b\ 

Ol3 

On 

012 

b\ 

Di  = 

l>2 

a22 

023 

, d2  = 

021 

bz 

023 

, d3  = 

021 

022 

b3 

b3 

a32 

«33 

031 

b3 

O33 

031 

032 

b3 

Note  that  Di,  D%,  are  obtained  by  replacing  Columns  1,  2,  3,  respectively,  by  the 
column  of  the  right  sides  of  (5). 

Cramer’s  rule  (6)  can  be  derived  by  eliminations  similar  to  those  for  (3),  but  it  also 
follows  from  the  general  case  (Theorem  4)  in  the  next  section. 

7.7  Determinants.  Cramers  Rule 

Determinants  were  originally  introduced  for  solving  linear  systems.  Although  impractical 
in  computations,  they  have  important  engineering  applications  in  eigenvalue  problems 
(Sec.  8.1),  differential  equations,  vector  algebra  (Sec.  9.3),  and  in  other  areas.  They  can 
be  introduced  in  several  equivalent  ways.  Our  definition  is  particularly  for  dealing  with 
linear  systems. 

A determinant  of  order  n is  a scalar  associated  with  an  n X n (hence  square !)  matrix 
A = [ajk],  and  is  denoted  by 

flll  a12  ' ’ ' aln 

a21  a22  ' ’ ' a2n 


anl  (t-n2  * * ' (t-nn 


0) 


D = det  A = 
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For  n = 1,  this  determinant  is  defined  by 
(2)  D = an. 

For  « g 2 by 

(3a)  D cij ] C'j  ] F dj2Cj2  "F  * * ■ T cijYiCjyi  0 1»  2,  * * * , or  n) 

or 

(3b)  D flifcC'ifc  "F  ^2k^2k  + ' ‘ + cinkCnk  (k  1?  2, * * ■ , or  n ). 

Flere, 

Cjk  = (-1  )j+kMjk 

and  Mjfc  is  a determinant  of  order  n — 1,  namely,  the  determinant  of  the  submatrix  of  A 
obtained  from  A by  omitting  the  row  and  column  of  the  entry  ajk,  that  is,  the  /'th  row  and 
the  Mi  column. 

In  this  way,  D is  defined  in  terms  of  n determinants  of  order  n — 1 , each  of  which  is, 
in  turn,  defined  in  terms  of  n — 1 determinants  of  order  n — 2,  and  so  on — until  we 
finally  arrive  at  second-order  determinants,  in  which  those  submatrices  consist  of  single 
entries  whose  determinant  is  defined  to  be  the  entry  itself. 

From  the  definition  it  follows  that  we  may  expand  D by  any  row  or  column,  that  is,  choose 
in  (3)  the  entries  in  any  row  or  column,  similarly  when  expanding  the  C.jp-’s  in  (3),  and  so  on. 

This  definition  is  unambiguous,  that  is,  it  yields  the  same  value  for  D no  matter  which 
columns  or  rows  we  choose  in  expanding.  A proof  is  given  in  App.  4. 

Terms  used  in  connection  with  determinants  are  taken  from  matrices.  In  D we  have  n2 
entries  a^-,  also  n rows  and  n columns,  and  a main  diagonal  on  which  an,  a22,  • • ■ , ann 
stand.  Two  terms  are  new: 

Mjp-  is  called  the  minor  of  a:jp  in  D,  and  Cjk  the  cofactor  of  a:jp  in  D. 

For  later  use  we  note  that  (3)  may  also  be  written  in  terms  of  minors 


n 


(4a) 

D=  2(“ Vj+kajkMjk 

k= 1 

U = 1,2," 

■ ,ov  n) 

(4b) 

n 

D = 2(“D  j + \kMjk 
3= 1 

II 

to 

■ , or  n). 

Minors  and  Cofactors  of  a Third-Order  Determinant 

In  (4)  of  the  previous  section  the  minors  and  cofactors  of  the  entries  in  the  first  column  can  be  seen  directly. 
For  the  entries  in  the  second  row  the  minors  are 


a12 

a13 

«11 

a13 

all 

a12 

M2 1 - 

, M22  ~ 

, M2  3 — 

a32 

a33 

a31 

a33 

a31 

a32 

and  the  cofactors  are  C21  = — M21,  C22  = +M22,  and  C23  = — A/23.  Similarly  for  the  third  row — write  these 
down  yourself.  And  verify  that  the  signs  in  Cjk  form  a checkerboard  pattern 
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EXAMPLE  2 


EXAMPLE  3 


THEOREM  1 


PROOF 


Expansions  of  a Third-Order  Determinant 


1 3 0 

6 4 

2 4 

2 6 

2 6 4 

= 1 

- 3 

+ 0 

0 2 

-1  2 

-1  0 

-10  2 

= 1(12  - 0)  - 3(4  + 4)  + 0(0  + 6)  = -12. 


This  is  the  expansion  by  the  first  row.  The  expansion  by  the  third  column  is 


2 6 

1 3 

1 3 

D = 0 

- 4 

+ 2 

-1  0 

-1  0 

2 6 

Verify  that  the  other  four  expansions  also  give  the  value  — 12. 


Determinant  of  a Triangular  Matrix 


-3 

0 

0 

4 

0 

6 

4 

0 

= -3 

2 

5 

-1 

2 

5 

Inspired  by  this,  can  you  formulate  a little  theorem  on  determinants  of  triangular  matrices?  Of  diagonal 
matrices? 


General  Properties  of  Determinants 

There  is  an  attractive  way  of  finding  determinants  (1)  that  consists  of  applying  elementary 
row  operations  to  (1).  By  doing  so  we  obtain  an  “upper  triangular"  determinant  (see 
Sec.  7.1,  for  definition  with  “matrix”  replaced  by  “determinant”)  whose  value  is  then  very 
easy  to  compute,  being  just  the  product  of  its  diagonal  entries.  This  approach  is  similar 
( but  not  the  samel)  to  what  we  did  to  matrices  in  Sec.  7.3.  In  particular,  be  aware  that 
interchanging  two  rows  in  a determinant  introduces  a multiplicative  factor  of  —l  to  the 
value  of  the  determinant^.  Details  are  as  follows. 


Behavior  of  an  nth-Order  Determinant  under  Elementary  Row  Operations 

(a)  Interchange  of  two  rows  multiplies  the  value  of  the  determinant  by  —1. 

(b)  Addition  of  a multiple  of  a row  to  another  row  does  not  alter  the  value  of  the 
determinant. 

(c)  Multiplication  of  a row  by  a nonzero  constant  c multiplies  the  value  of  the 
determinant  by  c.  (This  holds  also  when  c = 0,  but  no  longer  gives  an  elementary 
row  operation.) 


(a)  By  induction.  The  statement  holds  for  n = 2 because 


a 

b 

= ad  — be. 

but 

c 

d 

c 

d 

a 

b 

= be  — ad. 
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We  now  make  the  induction  hypothesis  that  (a)  holds  for  determinants  of  order  n — 1 § 2 
and  show  that  it  then  holds  for  determinants  of  order  n.  Let  D be  of  order  n.  Let  E be 
obtained  from  D by  the  interchange  of  two  rows.  Expand  D and  £ by  a row  that  is  not 
one  of  those  interchanged,  call  it  the  jth  row.  Then  by  (4a), 


(5)  D=  2(“1  y + kajkMjk,  E=  2(“1  )i+kajkNjk 

k = 1 k= 1 

where  Nj^  is  obtained  from  the  minor  Mjk  of  in  I)  by  the  interchange  of  those  two 
rows  which  have  been  interchanged  in  D (and  which  Njk  must  both  contain  because  we 
expand  by  another  row!).  Now  these  minors  are  of  order  n — 1.  Hence  the  induction 
hypothesis  applies  and  gives  Njk  = — Thus  E = —D  by  (5). 

(b)  Add  c times  Row  i to  Row  j.  Let  D be  the  new  determinant.  Its  entries  in  Row  j 
are  ajk  + ca.^,.  If  we  expand  D by  this  Row  j , we  see  that  we  can  write  it  as 
D = Di  + cD2,  where  D\  = D has  in  Row  j the  cilk,  whereas  D2  has  in  that  Row  j the 
a,jp-  from  the  addition.  Hence  I)2  has  a3p  in  both  Row  i and  Row  j.  Interchanging  these 
two  rows  gives  D2  back,  but  on  the  other  hand  it  gives  —D2  by  (a).  Together 
D2  = ~D2  = 0,  so  that  D = Di  = D. 

(c)  Expand  the  determinant  by  the  row  that  has  been  multiplied. 

CAUTION!  det  (cA)  = cn  det  A (not  c det  A).  Explain  why. 


Evaluation  of  Determinants  by  Reduction  to  Triangular  Form 

Because  of  Theorem  1 we  may  evaluate  determinants  by  reduction  to  triangular  form,  as  in  the  Gauss  elimination 
for  a matrix.  For  instance  (with  the  blue  explanations  always  referring  to  the  preceding  determinant) 


2 

0 

-4 

6 

4 

5 

1 

0 

0 

2 

6 

-1 

3 

8 

9 

1 

2 

0 

-4 

6 

0 

5 

9 

-12 

Row  2 — 2 Row  1 

0 

2 

6 

-1 

0 

8 

3 

10 

Row  4+1.5  Row  1 

2 

0 

-4 

6 

0 

5 

9 

-12 

0 

0 

2.4 

3.8 

Row  3 — 0.4  Row  2 

0 

0 

-11.4 

29.2 

Row  4—1.6  Row  2 

2 

0 

-4 

6 

0 

5 

9 

-12 

0 

0 

2.4 

3.8 

0 

0 

-0 

47.25 

Row  4 + 4.75  Row  3 

= 2 • 5 • 2.4  • 47.25  = 1134. 
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THEOREM  2 


PROOF 


THEOREM  3 


PROOF 


Further  Properties  of  nth-Order  Determinants 

(a)-(c)  in  Theorem  1 hold  also  for  columns. 

(d)  Transposition  leaves  the  value  of  a determinant  unaltered. 

(e)  A zero  row  or  column  renders  the  value  of  a determinant  zero. 

(f ) Proportional  rows  or  columns  render  the  value  of  a determinant  zero.  In 
particular,  a determinant  with  two  identical  rows  or  columns  has  the  value  zero. 


(a)-(e)  follow  directly  from  the  fact  that  a determinant  can  be  expanded  by  any  row 
column.  In  (d),  transposition  is  defined  as  for  matrices,  that  is,  the  /'th  row  becomes  the 
jth  column  of  the  transpose. 

(f)  If  Row  j = c times  Row  i,  then  D = cZ)1;  where  If  has  Row  j = Row  i.  Hence 
an  interchange  of  these  rows  reproduces  Dj_,  but  it  also  gives  —If  by  Theorem  1(a). 
Hence  Di  = 0 and  D = cD\  = 0.  Similarly  for  columns.  ■ 

It  is  quite  remarkable  that  the  important  concept  of  the  rank  of  a matrix  A,  which  is  the 
maximum  number  of  linearly  independent  row  or  column  vectors  of  A (see  Sec.  7.4),  can 
be  related  to  determinants.  Here  we  may  assume  that  rank  A > 0 because  the  only  matrices 
with  rank  0 are  the  zero  matrices  (see  Sec.  7.4). 


Rank  in  Terms  of  Determinants 

Consider  an  m X n matrix  A = [djk]'. 

(1)  A has  rank  1 if  and  only  if  A has  an  r X r submatrix  with  a nonzero 
determinant. 

(2)  The  determinant  of  any  square  submatrix  with  more  than  r rows,  contained 
in  A (if  such  a matrix  exists!)  has  a value  equal  to  zero. 

Furthermore,  if  m = n,  we  have: 

(3)  An  n X n square  matrix  A has  rank  n if  and  only  if 

det  A ¥=  0. 


The  key  idea  is  that  elementary  row  operations  (Sec.  7.3)  alter  neither  rank  (by  Theorem 
1 in  Sec.  7.4)  nor  the  property  of  a determinant  being  nonzero  (by  Theorem  1 in  this 
section).  The  echelon  form  A of  A (see  Sec.  7.3)  has  r nonzero  row  vectors  (which  are 
the  first  r row  vectors)  if  and  only  if  rank  A = r.  Without  loss  of  generality,  we  can 
assume  that  rg  1.  Let  R be  the  r X r submatrix  in  the  left  upper  corner  of  A (so  that 
the  entries  of  R are  in  both  the  first  r rows  and  r columns  of  A).  Now  R is  triangular, 
with  all  diagonal  entries  r:n  nonzero.  Thus,  det  R = rn  • ■ • r„  !=■  0.  Also  det  R C 0 for 
the  corresponding  r X r submatrix  R of  A because  R results  from  R by  elementary  row 
operations.  This  proves  part  (1). 

Similarly,  det  S = 0 for  any  square  submatrix  S of  r + 1 or  more  rows  perhaps 
contained  in  A because  the  corresponding  submatrix  S of  A must  contain  a row  of  zeros 
(otherwise  we  would  have  rank  A §S  r + 1),  so  that  det  S = 0 by  Theorem  2.  This  proves 
part  (2).  Furthermore,  we  have  proven  the  theorem  for  an  m X n matrix. 
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For  an  n X n square  matrix  A we  proceed  as  follows.  To  prove  (3),  we  apply  part  (1) 
(already  proven!).  This  gives  us  that  rank  A = n §=  1 if  and  only  if  A contains  an  n X n 
submatrix  with  nonzero  determinant.  But  the  only  such  submatrix  contained  in  our  square 
matrix  A,  is  A itself,  hence  det  A # 0.  This  proves  part  (3). 

Cramer’s  Rule 

Theorem  3 opens  the  way  to  the  classical  solution  formula  for  linear  systems  known  as 
Cramer’s  rule,2  which  gives  solutions  as  quotients  of  determinants.  Cramer’s  rule  is  not 
practical  in  computations  for  which  the  methods  in  Secs.  7.3  and  20.1-20.3  are  suitable. 
However,  Cramer’s  rule  is  of  theoretical  interest  in  differential  equations  (Secs.  2.10  and 
3.3)  and  in  other  theoretical  work  that  has  engineering  applications. 


Cramer’s  Theorem  (Solution  of  Linear  Systems  by  Determinants) 

(a)  If  a linear  system  of  n equations  in  the  same  number  of  unknowns  X\,  ■ • • , xn 

fln*i  + 012*2  + ' • • + a\nxn  = b\ 

021*1  T 022*2  T * T ^2n*n  ^2 

(6) 


On  1*  1 + an 2*2  T ’ T Onn*  n bn 

has  a nonzero  coefficient  determinant  D = det  A,  the  system  has  precisely  one 
solution.  This  solution  is  given  by  the  formulas 

Dx  Dq.  T)n 

(7)  *i  = — , *2  = —,•••,  *n  = — (Cramer’s  rule) 

where  is  the  determinant  obtained  from  D by  replacing  m D the  kth  column  by 
the  column  with  the  entries  b\,  ■ • ■ , bn. 

(b)  Hence  if  the  system  (6)  is  homogeneous  and  I)  T 0,  it  has  only  the  trivial 
solution  Xi  = 0,  x2  = 0,  • • • , xn  = 0.  If  D = 0,  the  homogeneous  system  also  has 
nontrivial  solutions. 


The  augmented  matrix  A of  the  system  (6)  is  of  size  n X (n  + 1).  Hence  its  rank  can  be 
at  most  n.  Now  if 


on  ‘ ' ‘ oi n 


(8) 


D = det  A = 


* 0, 


Ortl  * * * 0,m 


2GABRIEL  CRAMER  (1704-1752),  Swiss  mathematician. 
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then  rank  A = n by  Theorem  3.  Thus  rank  A = rank  A.  Hence,  by  the  Fundamental 
Theorem  in  Sec.  7.5,  the  system  (6)  has  a unique  solution. 

Let  us  now  prove  (7).  Expanding  D by  its  kth  column,  we  obtain 

(9)  D = alkClk  + a2k,C2k  + ‘ ' ' + ankCnk, 

where  is  the  cofactor  of  entry  alk  in  I).  If  we  replace  the  entries  in  the  kth  column  of 
D by  any  other  numbers,  we  obtain  a new  determinant,  say,  D.  Clearly,  its  expansion  by 
the  Arth  column  will  be  of  the  form  (9),  with  aik,  • • • , ank  replaced  by  those  new  numbers 
and  the  cofactors  Clk  as  before.  In  particular,  if  we  choose  as  new  numbers  the  entries 
an,  • • • , ani  of  the  Zth  column  of  D (where  / # k),  we  have  a new  determinant  D which 
has  the  column  [an  ■ ■ ■ ani]T  twice,  once  as  its  /th  column,  and  once  as  its  kth  because 
of  the  replacement.  Hence  D = 0 by  Theorem  2(f).  If  we  now  expand  D by  the  column 
that  has  been  replaced  (the  kth  column),  we  thus  obtain 

(10)  auCik  + a2iC2k  + ■ ■ • + aniCnk  = 0 (/  + k). 

We  now  multiply  the  first  equation  in  (6)  by  C ik  on  both  sides,  the  second  by  C2k,  ■ • ■ , 
the  last  by  Cnk,  and  add  the  resulting  equations.  This  gives 


(ID 


f^ifc(^ii^i  T • ■ ■ T a~LnX'yi)  + ■ ■ • + ^nki^ni^-i  F * * ■ T annXYi) 

— b\Cik  + • ■ • + bnCnk- 


Collecting  terms  with  the  same  x-p  we  can  write  the  left  side  as 


X\{ai\C\k  T Q21C2 k "F  * ‘ ' T an\Cnk ) F ■ * ' F Xn{a±nCik  F d2n^-2k  F ■ ■ • F dnnCnk)- 


From  this  we  see  that  xk  is  multiplied  by 


alk(-lk  F d2kC2k  + ' ' ' + ank^nk- 

Equation  (9)  shows  that  this  equals  D.  Similarly,  x j is  multiplied  by 


ailCik  F a2iC2k  F • ■ • F aniCnk- 


Equation  (10)  shows  that  this  is  zero  when  l k.  Accordingly,  the  left  side  of  (1 1)  equals 
simply  X};D,  so  that  (11)  becomes 


XkP  — hClk  F b2C2k  F • ■ • F bnCnk. 

Now  the  right  side  of  this  is  Dk  as  defined  in  the  theorem,  expanded  by  its  kth  column, 
so  that  division  by  D gives  (7).  This  proves  Cramer’s  rule. 

If  (6)  is  homogeneous  and  D ¥=  0,  then  each  Dk  has  a column  of  zeros,  so  that  Dk  = 0 
by  Theorem  2(e),  and  (7)  gives  the  trivial  solution. 

Finally,  if  (6)  is  homogeneous  and  D = 0,  then  rank  A < n by  Theorem  3,  so  that 
nontrivial  solutions  exist  by  Theorem  2 in  Sec.  7.5. 

Illustration  of  Cramer’s  Rule  (Theorem  4) 

For  n = 2,  see  Example  1 of  Sec.  7.6.  Also,  at  the  end  of  that  section,  we  give  Cramer’s  rule  for  a general 
linear  system  of  three  equations. 
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Finally,  an  important  application  for  Cramer’s  rule  dealing  with  inverse  matrices  will 
be  given  in  the  next  section. 


FRQBLrEM~SET-7~7 


1-6 


GENERAL  PROBLEMS 


1.  General  Properties  of  Determinants.  Illustrate  each 
statement  in  Theorems  1 and  2 with  an  example  of 
your  choice. 


2.  Second-Order  Determinant.  Expand  a general 
second-order  determinant  in  four  possible  ways  and 
show  that  the  results  agree. 


3.  Third-Order  Determinant.  Do  the  task  indicated  in 
Theorem  2.  Also  evaluate  D by  reduction  to  triangular 
form. 


4.  Expansion  Numerically  Impractical.  Show  that  the 
computation  of  an  nth-order  determinant  by  expansion 
involves  n\  multiplications,  which  if  a multiplication 
takes  10-9  sec  would  take  these  times: 


1 

2 


15. 


0 

0 


2 

4 

2 

0 


0 0 
2 0 
9 2 

2 16 


16.  CAS  EXPERIMENT.  Determinant  of  Zeros  and 
Ones.  Find  the  value  of  the  determinant  of  the  n X n 
matrix  An  with  main  diagonal  entries  all  0 and  all 
others  1.  Try  to  find  a formula  for  this.  Try  to  prove  it 
by  induction.  Interpret  A3  and  A 4 as  incidence  matrices 
(as  in  Problem  Set  7.1  but  without  the  minuses)  of  a 
triangle  and  a tetrahedron,  respectively;  similarly  for  an 
n-simplex,  having  n vertices  and  n(n  — l)/2  edges  (and 
spanning  Rn~1,  n = 5,  6,  ■ ■ ■ ). 


n 

10 

15 

20 

25 

Time 

0.004 

22 

77 

0.5  • 109 

sec 

min 

years 

years 

5.  Multiplication  by  Scalar.  Show  that  det  (kA)  — 
kn  det  A (not  k det  A).  Give  an  example. 

6.  Minors,  cofactors.  Complete  the  list  in  Example  1. 


7-15 


EVALUATION  OF  DETERMINANTS 


Showing  the  details,  evaluate: 


cos  a 

sin  a 

0.4  4.9 

7. 

sin  (3 

cos  /3 

8. 

1.5  -1.3 

cos  nd 

sin  nd 

cosh  t sinh  t 

9. 

10. 

—sin  nd 

cos  nd 

sinh  t cosh  t 

4 -1 

8 

a b 

c 

11. 


13. 


0 

0 

0 

-4 

1 

-5 


2 

0 

4 

0 

-3 

2 


3 

5 

-1 

3 

0 

-1 


12. 


5 

-2 

1 

0 


14. 


c 

7 

8 
0 
0 


b 

a 

0 

0 

1 

-2 


17-19 


RANK  BY  DETERMINANTS 


Find  the  rank  by  Theorem  3 (which  is  not  very  practical) 
and  check  by  row  reduction.  Show  details. 


4 

9" 

0 

4 

—6~ 

17. 

-8 

-6 

18. 

4 

0 

10 

16 

12 

-6 

10 

0 

1 

5 

2 

2 

19. 

1 

3 

2 

6 

4 

0 

8 

48  _ 

20.  TEAM  PROJECT.  Geometric  Applications:  Curves 
and  Surfaces  Through  Given  Points.  The  idea  is  to 
get  an  equation  from  the  vanishing  of  the  determinant 
of  a homogeneous  linear  system  as  the  condition  for  a 
nontrivial  solution  in  Cramer’s  theorem.  We  explain 
the  trick  for  obtaining  such  a system  for  the  case  of 
a line  L through  two  given  points  P\\  (x\,  yi)  and 
P2:  (x2,  y2).  The  unknown  line  is  ax  + by  = — c, 
say.  We  write  it  as  ax  + by  + c ■ 1 = 0.  To  get  a 
nontrivial  solution  a,  b,  c,  the  determinant  of  the 
“coefficients”  x,  y,  1 must  be  zero.  The  system  is 


ax  + by  + c ■ 1 = 0 (Line  L ) 
(12)  ax  1 + by  1 + c ■ 1 = 0 (Pi  oni) 

ax2  + by 2 + c • 1 = 0 (P2  on  L). 
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(a)  Line  through  two  points.  Derive  from  D = 0 in 
(12)  the  familiar  formula 


21-25 


CRAMER’S  RULE 


x — X\ 


y i 


Solve  by  Cramer’s  rule.  Check  by  Gauss  elimination  and 
back  substitution.  Show  details. 


Xi  - X2  yi  - y2' 

21. 

3x  - 5 y = 15.5 

22.  2x  — 4y  = - 

-24 

(b)  Plane.  Find  the  analog  of  (12)  for  a plane  through 
three  given  points.  Apply  it  when  the  points  are 

6x  + 16v  = 5.0 

5x  + 2y  = 

0 

(1,  1,  1),  (3,2,  6),  (5,0,  5). 

(c)  Circle.  Find  a similar  formula  for  a circle  in  the 

23. 

3y  - 4 z = 

16 

24.  3x  - 2y  + 

z = 

13 

plane  through  three  given  points.  Find  and  sketch  the 

2x  — 5y  + Iz  = - 

■27 

—2x  + y + 

4 z = 

11 

circle  through  (2,  6),  (6,  4),  (7,  1). 

(d)  Sphere.  Find  the  analog  of  the  formula  in  (c)  for 

— x — 9z  = 

9 

x + 4y  — 

5 z = 

-31 

a sphere  through  four  given  points.  Find  the  sphere 
through  (0,  0,  5),  (4,  0,  1),  (0,  4,  1),  (0,  0,  -3)  by  this 

25. 

—4  w + x + y 

= 

-10 

formula  or  by  inspection. 

5 

1 

£ 

+ 

z = 

1 

(e)  General  conic  section.  Find  a formula  for  a 
general  conic  section  (the  vanishing  of  a determinant 

w — 4y  + 

z = 

-7 

of  6th  order).  Try  it  out  for  a quadratic  parabola  and 

x + y — 

4 z = 

10 

for  a more  general  conic  section  of  your  own  choice. 


7.8  Inverse  of  a Matrix. 

Gauss-Jordan  Elimination 

In  this  section  we  consider  square  matrices  exclusively . 

The  inverse  of  an  n X n matrix  A = [ajk]  is  denoted  by  A-1  and  is  an  n X n matrix 
such  that 

(!)  A A-1  = A-1  A = I 

where  I is  the  n X n unit  matrix  (see  Sec.  7.2). 

If  A has  an  inverse,  then  A is  called  a nonsingular  matrix.  If  A has  no  inverse,  then 
A is  called  a singular  matrix. 

If  A has  an  inverse,  the  inverse  is  unique. 

Indeed,  if  both  B and  C are  inverses  of  A,  then  AB  = I and  CA  = I,  so  that  we  obtain 
the  uniqueness  from 


B = IB  = (CA)B  = C(AB)  = Cl  = C. 


We  prove  next  that  A has  an  inverse  (is  nonsingular)  if  and  only  if  it  has  maximum 
possible  rank  n.  The  proof  will  also  show  that  Ax  = b implies  x = A-1b  provided  A-1 
exists,  and  will  thus  give  a motivation  for  the  inverse  as  well  as  a relation  to  linear  systems. 
(But  this  will  not  give  a good  method  of  solving  Ax  = b numerically  because  the  Gauss 
elimination  in  Sec.  7.3  requires  fewer  computations.) 


THEOREM  1 


Existence  of  the  Inverse 

The  inverse  A-1  of  an  n X n matrix  A exists  if  and  only  (/rank  A = n,  thus  (by 
Theorem  3,  Sec.  7.7)  if  and  only  if  det  A T-  0.  Hence  A is  nonsingular  if  rank  A = n, 
and  is  singular  if  rank  A < n. 
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PROOF  Let  A be  a given  n X n matrix  and  consider  the  linear  system 
(2)  Ax  = b. 

If  the  inverse  A-1  exists,  then  multiplication  from  the  left  on  both  sides  and  use  of  (1) 
gives 


A 1Ax  = x = A 1b. 

This  shows  that  (2)  has  a solution  x,  which  is  unique  because,  for  another  solution  u,  we 
have  Au  = b,  so  that  u = A- 1 b = x.  Hence  A must  have  rank  n by  the  Fundamental 
Theorem  in  Sec.  7.5. 

Conversely,  let  rank  A = n.  Then  by  the  same  theorem,  the  system  (2)  has  a unique 
solution  x for  any  b.  Now  the  back  substitution  following  the  Gauss  elimination  (Sec.  7.3) 
shows  that  the  components  xj  of  x are  linear  combinations  of  those  of  b.  Hence  we  can 
write 

(3)  x = Bb 

with  B to  be  determined.  Substitution  into  (2)  gives 

Ax  = A(Bb)  = (AB)b  = Cb  = b (C  = AB) 

for  any  b.  Hence  C = AB  = I,  the  unit  matrix.  Similarly,  if  we  substitute  (2)  into  (3)  we  get 

x = Bb  = B(Ax)  = (BA)x 

for  any  x (and  b = Ax).  Hence  BA  = I.  Together,  B = A-1  exists. 

Determination  of  the  Inverse  by  the 
Gauss-Jordan  Method 

To  actually  determine  the  inverse  A-1  of  a nonsingular  n X n matrix  A,  we  can  use  a 
variant  of  the  Gauss  elimination  (Sec.  7.3),  called  the  Gauss-Jordan  elimination.3  The 
idea  of  the  method  is  as  follows. 

Using  A,  we  form  n linear  systems 


Ax(d  £(i),  * ■ ■ 5 Ax(n)  C(n) 


where  the  vectors  e(i),  • • • , e(n)  are  the  columns  of  the  n X n unit  matrix  I;  thus, 
e(i)  = [ 1 0 ■ • • 0]T,  e(2)  = [0  1 0 • • ■ 0]T,  etc.  These  are  n vector  equations 
in  the  unknown  vectors  xq),  • • • , X(n>  We  combine  them  into  a single  matrix  equation 


3WILHELM  JORDAN  (1842-1899),  German  geodesist  and  mathematician.  He  did  important  geodesic  work 
in  Africa,  where  he  surveyed  oases.  [See  Althoen,  S.C.  and  R.  McLaughlin,  Gauss-Jordan  reduction:  A brief 
history.  American  Mathematical  Monthly,  Vol.  94,  No.  2 (1987),  pp.  130-142.] 

We  do  not  recommend  it  as  a method  for  solving  systems  of  linear  equations,  since  the  number  of  operations 
in  addition  to  those  of  the  Gauss  elimination  is  larger  than  that  for  back  substitution,  which  the  Gauss-Jordan 
elimination  avoids.  See  also  Sec.  20.1. 


SEC.  7.8  Inverse  of  a Matrix.  Gauss-Jordan  Elimination 


303 


EXAMPLE  1 


AX  = I,  with  the  unknown  matrix  X having  the  columns  x(1),  ■ • • , x(rl).  Correspondingly, 
we  combine  the  n augmented  matrices  [A  e^],  • • ■ , [A  ] into  one  wide  n X 2 n 
“augmented  matrix”  A = [A  I].  Now  multiplication  of  AX  = I by  A-1  from  the  left 
gives  X = A-1I  = A-1.  Hence,  to  solve  AX  = I for  X,  we  can  apply  the  Gauss 
elimination  to  A = [A  I],  This  gives  a matrix  of  the  form  [U  H]  with  upper  triangular 
U because  the  Gauss  elimination  triangularizes  systems.  The  Gauss-Jordan  method 
reduces  U by  further  elementary  row  operations  to  diagonal  form,  in  fact  to  the  unit  matrix 
I.  This  is  done  by  eliminating  the  entries  of  U above  the  main  diagonal  and  making  the 
diagonal  entries  all  1 by  multiplication  (see  Example  1).  Of  course,  the  method  operates 
on  the  entire  matrix  [U  H],  transforming  H into  some  matrix  K,  hence  the  entire  [U  H] 
to  [I  K],  This  is  the  “augmented  matrix”  of  IX  = K.  Now  IX  = X = A-1,  as  shown 
before.  By  comparison,  K = A-1,  so  that  we  can  read  A-1  directly  from  [I  K], 

The  following  example  illustrates  the  practical  details  of  the  method. 

Finding  the  inverse  of  a Matrix  by  Gauss-Jordan  Elimination 

Determine  the  inverse  A-1  of 


-1  1 2 


A = 


3 -1 


-13  4 


Solution.  We  apply  the  Gauss  elimination  (Sec.  7.3)  to  the  following  n X 2«  = 3 X 6 matrix,  where  BLUE 
always  refers  to  the  previous  matrix. 


-1 

1 

2 

1 

0 

o' 

3 

-1 

1 

0 

1 

0 

-1 

3 

4 

0 

0 

1 

-1 

1 

2 

1 

0 

(f 

0 

2 

7 

3 

1 

0 

Row  2 + 3 Row  1 

0 

2 

2 

-1 

0 

1 

Row  3 — Row  1 

-1 

1 

2 

1 

0 

0 

0 

2 

7 

3 

1 

0 

0 

0 

-5 

-4 

-1 

1 

Row  3 — Row  2 

This  is  [U  H]  as  produced  by  the  Gauss  elimination.  Now  follow  the  additional  Gauss-Jordan  steps,  reducing 
U to  I,  that  is,  to  diagonal  form  with  entries  1 on  the  main  diagonal. 


i 

-1 

-2 

-1 

0 

0 

0 

1 

3.5 

1.5 

0.5 

0 

0 

0 

1 

0.8 

0.2 

-0.2 

1 

-1 

0 

0.6 

0.4 

-0.4~ 

0 

1 

0 

-1.3 

-0.2 

0.7 

0 

0 

1 

0.8 

0.2 

-0.2 

1 

0 

0 

-0.7 

0.2 

0.3" 

0 

1 

0 

-1.3 

-0.2 

0.7 

0 

0 

1 

0.8 

0.2 

-0.2 

—Row  1 
0.5  Row  2 
-0.2  Row  3 
Row  1 + 2 Row  3 
Row  2 - 3.5  Row  3 


Row  1 + Row  2 
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THEOREM  2 


PROOF 


The  last  three  columns  constitute  A 1.  Check: 


-1  1 2 

-0.7  0.2  0.3 

1 0 0 

3 -1  1 

-1.3  -0.2  0.7 

= 

0 1 0 

1 

1 

GJ 

0.8  0.2  -0.2 

0 0 1 

Hence  AA  1 = I.  Similarly,  A 1A  = I. 

Formulas  for  Inverses 

Since  finding  the  inverse  of  a matrix  is  really  a problem  of  solving  a system  of  linear 
equations,  it  is  not  surprising  that  Cramer’s  rule  (Theorem  4,  Sec.  7.7)  might  come  into 
play.  And  similarly,  as  Cramer’s  rule  was  useful  for  theoretical  study  but  not  for 
computation,  so  too  is  the  explicit  formula  (4)  in  the  following  theorem  useful  for 
theoretical  considerations  but  not  recommended  for  actually  determining  inverse  matrices, 
except  for  the  frequently  occurring  2X2  case  as  given  in  (4*). 


Inverse  of  a Matrix  by  Determinants 

The  inverse  of  a nonsingular  n X n matrix  A = [to]  is  given  by 


(4) 


A"1  = 


det  A 


[Cjk\T  = 


C it  C2i 

Cl  2 C2 2 


CVti 

1 r,'2 


det  A 


C\n  C- 


2 n 


c 


where  Cjk  is  the  cofactor  of  ajk  in  det  A (see  Sec.  7.7).  (CAUTION!  Note  well  that 
in  A-1,  the  cofactor  Cjk  occupies  the  same  place  as  ap-j  (not  ajk)  does  in  A.) 

In  particular,  the  inverse  of 


all 

a12 

is 

A-1  1 

a22 

a12 

.021 

«22. 

det  A 

. ~ «21 

We  denote  the  right  side  of  (4)  by  B and  show  that  BA  = I.  We  first  write 

(5)  BA  = G = [gkl] 

and  then  show  that  G = I.  Now  by  the  definition  of  matrix  multiplication  and  because  of 
the  form  of  B in  (4),  we  obtain  (CAUTION ! Csk,  not  Cp-S) 

^ Csk  i 

(6)  gkl  ^ i deUA  det~A  T ■ ■ ■ T oniCnk)- 

s= 1 
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EXAMPLE  2 


EXAMPLE  3 


PROOF 


Now  (9)  and  (10)  in  Sec.  7.7  show  that  the  sum  ( • ■ ■ ) on  the  right  is  D = det  A when 
l = k,  and  is  zero  when  I ¥=  k.  Hence 


1 


8kk 


det  A = 1 , 


det  A 

gki  = 0 (l=f=  k). 

In  particular,  for  n = 2 we  have  in  (4),  in  the  first  row,  C n = 022,  C21  = ~ai2  and, 
in  the  second  row,  C 12  = —021*  C22  = an-  This  gives  (4*). 

The  special  case  n = 2 occurs  quite  frequently  in  geometric  and  other  applications.  You 
may  perhaps  want  to  memorize  formula  (4*).  Example  2 gives  an  illustration  of  (4*). 

Inverse  of  a 2 X 2 Matrix  by  Determinants 


"3  f 

, A-1  = — 

4 

-1 

0.4 

-0.1 

2 4 

10 

-2 

3 

-0.2 

0.3 

A = 

Further  Illustration  of  Theorem  2 

Using  (4),  find  the  inverse  of 

~-l  1 2 

A = 3-11 

-1  3 4_ 

Solution.  We  obtain  det  A = — 1( — 7)  — 1 ■ 13  + 2-8  = 10,  and  in  (4), 


C11  — 


-1  1 

1 2 

1 2 

1 

II 

<0 

1 

II 

= 2,  C31  = 

3 4 

3 4 

-1  1 

= 3, 


3 

1 

- 

1 2 

-1 

2 

C12  = 

-1 

4 

= -13, 

C22  - 

- 

1 4 

-2, 

C32  - 

3 

1 

3 

-1 

-1 

1 

- 

1 

1 

C13  = 

— 

1 

3 

= 8, 

C23  = 

“ 

-1 

3 

= 2, 

C33  = 

3 

-1 

= 7, 


= -2, 


so  that  by  (4),  in  agreement  with  Example  1 , 


-0.7 

0.2 

0.3 

-1.3 

-0.2 

0.7 

0.8 

0.2 

-0.2 

Diagonal  matrices  A = [a,* J,  alk  = 0 when  j i=-  k,  have  an  inverse  if  and  only  if  all 
ajj  4 0.  Then  A-1  is  diagonal,  too,  with  entries  1 /flu,  ■ ■ • , 1 /ann. 

For  a diagonal  matrix  we  have  in  (4) 

C11  a22  ®nn  1 

= = , etc. 

D allfl22 ' ' ‘ ^nri  flll 
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EXAMPLE  4 


PROOF 


Inverse  of  a Diagonal  Matrix 

Let 


-0.5  0 0 


A = 0 4 0 . 


0 0 1 


Then  we  obtain  the  inverse  A 1 by  inverting  each  individual  diagonal  element  of  A,  that  is,  by  taking  l/(— 0.5),  j , 
and  ] as  the  diagonal  entries  of  A-1,  that  is, 


A"1  = 


-2  0 0 

0 0.25  0 

0 0 1 


Products  can  be  inverted  by  taking  the  inverse  of  each  factor  and  multiplying  these 
inverses  in  reverse  order , 


(7) 


(AC)-1  = C-1A-1. 


Hence  for  more  than  two  factors, 

(8)  (AC  PQ)-1  = Q-1P-1  C-1A-1. 


The  idea  is  to  start  from  (1)  for  AC  instead  of  A,  that  is,  AC(AC)  = I,  and  multiply 
it  on  both  sides  from  the  left,  first  by  A-1,  which  because  of  A-1  A = I gives 

A-1AC(AC)-1  = C(AC)-1 
= A-1I  = A-1, 

and  then  multiplying  this  on  both  sides  from  the  left,  this  time  by  C-1  and  by  using 
C-1C  = I, 


C-1C(AC)-1  = (AC)-1  = C-1A-1. 

This  proves  (7),  and  from  it,  (8)  follows  by  induction. 

We  also  note  that  the  inverse  of  the  inverse  is  the  given  matrix , as  you  may  prove, 

(9)  (A-1)-1  = A. 

Unusual  Properties  of  Matrix  Multiplication. 
Cancellation  Laws 

Section  7.2  contains  warnings  that  some  properties  of  matrix  multiplication  deviate  from 
those  for  numbers,  and  we  are  now  able  to  explain  the  restricted  validity  of  the  so-called 
cancellation  laws  [2]  and  [3]  below,  using  rank  and  inverse,  concepts  that  were  not  yet 
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THEOREM  3 


PROOF 


THEOREM  4 


available  in  Sec.  7.2.  The  deviations  from  the  usual  are  of  great  practical  importance  and 
must  be  carefully  observed.  They  are  as  follows. 

[1]  Matrix  multiplication  is  not  commutative,  that  is,  in  general  we  have 


AB  # BA. 

[2]  AB  = 0 does  not  generally  imply  A = 0 or  B = 0 (or  BA  = 0);  for  example, 


1 

l" 

’-1 

f 

0 

o' 

2 

2 

1 

-1 

0 

0 

[3]  AC  = AD  does  not  generally  imply  C = D (even  when  A =7=  0). 
Complete  answers  to  [2]  and  [3]  are  contained  in  the  following  theorem. 


Cancellation  Laws 

Let  A,  B,  C be  n X n matrices.  Then: 

(a)  If  rank  A = n and  AB  = AC,  then  B = C. 

(b)  If  rank  A = n,  then  AB  = 0 implies  B = 0.  Hence  if  AB  = 0,  but  A L 0 
as  well  as  B L 0,  then  rank  A < n and  rank  B < n. 

(c)  If  A is  singular,  so  are  BA  and  AB. 


(a)  The  inverse  of  A exists  by  Theorem  1.  Multiplication  by  A 1 from  the  left  gives 
A_1AB  = A-1  AC,  hence  B = C. 

(b)  Let  rank  A = n.  Then  A-1  exists,  and  AB  = 0 implies  A_1AB  = B = 0.  Similarly 
when  rank  B = n.  This  implies  the  second  statement  in  (b). 

(ci)  Rank  A < n by  Theorem  1.  Hence  Ax  = 0 has  nontrivial  solutions  by  Theorem  2 
in  Sec.  7.5.  Multiplication  by  B shows  that  these  solutions  are  also  solutions  of  BAx  = 0, 
so  that  rank  (BA)  < n by  Theorem  2 in  Sec.  7.5  and  BA  is  singular  by  Theorem  1. 

(C2)  AT  is  singular  by  Theorem  2(d)  in  Sec.  7.7.  Hence  BTAT  is  singular  by  part  (ci), 
and  is  equal  to  (AB)T  by  (lOd)  in  Sec.  7.2.  Hence  AB  is  singular  by  Theorem  2(d)  in 
Sec.  7.7. 

Determinants  of  Matrix  Products 

The  determinant  of  a matrix  product  AB  or  BA  can  be  written  as  the  product  of  the 
determinants  of  the  factors,  and  it  is  interesting  that  det  AB  = det  BA,  although  AB  # BA 
in  general.  The  corresponding  formula  (10)  is  needed  occasionally  and  can  be  obtained 
by  Gauss-Jordan  elimination  (see  Example  1)  and  from  the  theorem  just  proved. 


Determinant  of  a Product  of  Matrices 

For  any  n X n matrices  A and  B, 

(10)  det  (AB)  = det  (BA)  = det  A det  B. 
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PROOF  If  A or  B is  singular,  so  are  AB  and  BA  by  Theorem  3(c),  and  (10)  reduces  to  0 = 0 by 
Theorem  3 in  Sec.  7.7. 

Now  let  A and  B be  nonsingular.  Then  we  can  reduce  A to  a diagonal  matrix  A = 
by  Gauss-Jordan  steps.  Under  these  operations,  det  A retains  its  value,  by  Theorem  1 in 
Sec.  7.7,  (a)  and  (b)  [not  (c)]  except  perhaps  for  a sign  reversal  in  row  interchanging  when 
pivoting.  But  the  same  operations  reduce  AB  to  AB  with  the  same  effect  on  det  (AB). 

' ’ ' bin 

^22  ' ■ ' b2n 


by/,2  * ' * bnn 

bubin 

«22^2n 


annb 

nn\ 

We  now  take  the  determinant  det  (AB).  On  the  right  we  can  take  out  a factor  tin  from 
the  first  row,  ti22  from  the  second,  • • • , ann  from  the  nth.  But  this  product  ana22  ' ' ' ann 
equals  det  A because  A is  diagonal.  The  remaining  determinant  is  det  B.  This  proves  (10) 
for  det  (AB),  and  the  proof  for  det  (BA)  follows  by  the  same  idea. 

This  completes  our  discussion  of  linear  systems  (Secs.  7. 3-7. 8).  Section  7.9  on  vector 
spaces  and  linear  transformations  is  optional.  Numeric  methods  are  discussed  in  Secs. 
20.1-20.4,  which  are  independent  of  other  sections  on  numerics. 


Hence  it  remains  to  prove  (10)  for  AB;  written  out, 

611 

bz\ 

AB  = 


°nl 


an 

0 

0 

0 

«22 

0 

0 

0 

&nn 

dnbi2 

= 

^22^21 

«22^22 

flnnbnl 

®nnbn2 

PRO  BLEM  SET  7.8 


1-10 


INVERSE 


Find  the  inverse  by  Gauss-Jordan  (or  by  (4*)  if  n = 
Check  by  using  (1). 


1.80 

2.32 

cos  28 

sin  28 

1. 

2. 

-0.25 

0.60 

— sin  28 

cos  28 

~0.3  - 

0.1 

0.5" 

~0 

0 

0.1 

3. 

2 

6 

4 

4. 

0 

0.4 

0 

5 

0 

9 

2.5 

0 

0 

”1  0 

0" 

-4 

0 

0 

5. 

2 1 

0 

6. 

0 

8 

13 

5 4 

1 

0 

3 

5 

2). 


"0 

1 

0" 

1 

2 

3” 

7. 

1 

0 

0 

8. 

4 

5 

6 

0 

0 

1 

7 

8 

9_ 

"0 

8 

0" 

2 

3 

1 

3 

2 

3 

9. 

0 

0 

4 

10. 

2 

3 

2 

3 

1 

3 

2 

0 

0 

1 

3 

2 

3 

2 

3. 

11-18 


SOME  GENERAL  FORMULAS 


11.  Inverse  of  the  square.  Verify  (A2)  1 = 
in  Prob.  1. 


(A-1)2  for  A 


12.  Prove  the  formula  in  Prob.  1 1. 
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13.  Inverse  of  the  transpose.  Verify  (AT)  1 = (A  1)Tfor 
A in  Prob.  1 . 

14.  Prove  the  formula  in  Prob.  13. 

15.  Inverse  of  the  inverse.  Prove  that  (A-1)-1  = A. 

16.  Rotation.  Give  an  application  of  the  matrix  in  Prob.  2 
that  makes  the  form  of  the  inverse  obvious. 

17.  Triangular  matrix.  Is  the  inverse  of  a triangular 
matrix  always  triangular  (as  in  Prob.  5)?  Give  reason. 


18.  Row  interchange.  Same  task  as  in  Prob.  16  for  the 
matrix  in  Prob.  7. 


FORMULA  (4) 

Formula  (4)  is  occasionally  needed  in  theory.  To  understand 
it,  apply  it  and  check  the  result  by  Gauss-Jordan: 

19.  In  Prob.  3 

20.  In  Prob.  6 


19-20 


7.9  Vector  Spaces,  Inner  Product  Spaces, 

Linear  Transformations  Optional 

We  have  captured  the  essence  of  vector  spaces  in  Sec.  7.4.  There  we  dealt  with  special 
vector  spaces  that  arose  quite  naturally  in  the  context  of  matrices  and  linear  systems.  The 
elements  of  these  vector  spaces,  called  vectors,  satisfied  rules  (3)  and  (4)  of  Sec.  7.1 
(which  were  similar  to  those  for  numbers).  These  special  vector  spaces  were  generated 
by  spans,  that  is,  linear  combination  of  finitely  many  vectors.  Furthermore,  each  such 
vector  had  n real  numbers  as  components.  Review  this  material  before  going  on. 

We  can  generalize  this  idea  by  taking  all  vectors  with  n real  numbers  as  components 
and  obtain  the  very  important  real  n-dimensional  vector  space  Rn.  The  vectors  are  known 
as  “real  vectors.”  Thus,  each  vector  in  Rn  is  an  ordered  /(-tuple  of  real  numbers. 

Now  we  can  consider  special  values  for  n.  For  n = 2,  we  obtain  R2.  the  vector  space 
of  all  ordered  pairs,  which  correspond  to  the  vectors  in  the  plane.  For  n = 3,  we  obtain 
R3,  the  vector  space  of  all  ordered  triples,  which  are  the  vectors  in  3-space.  These  vectors 
have  wide  applications  in  mechanics,  geometry,  and  calculus  and  are  basic  to  the  engineer 
and  physicist. 

Similarly,  if  we  take  all  ordered  n-tuples  of  complex  numbers  as  vectors  and  complex 
numbers  as  scalars,  we  obtain  the  complex  vector  space  Cn,  which  we  shall  consider  in 
Sec.  8.5. 

Furthermore,  there  are  other  sets  of  practical  interest  consisting  of  matrices,  functions, 
transformations,  or  others  for  which  addition  and  scalar  multiplication  can  be  defined  in 
an  almost  natural  way  so  that  they  too  form  vector  spaces. 

It  is  perhaps  not  too  great  an  intellectual  jump  to  create,  from  the  concrete  model  Rn, 
the  abstract  concept  of  a real  vector  space  V by  taking  the  basic  properties  (3)  and  (4) 
in  Sec.  7.1  as  axioms.  In  this  way,  the  definition  of  a real  vector  space  arises. 


DEFINITION 


Real  Vector  Space 

A nonempty  set  V of  elements  a,  b,  • • • is  called  a real  vector  space  (or  real  linear 
space),  and  these  elements  are  called  vectors  (regardless  of  their  nature,  which  will 
come  out  from  the  context  or  will  be  left  arbitrary)  if,  in  V,  there  are  defined  two 
algebraic  operations  (called  vector  addition  and  scalar  multiplication)  as  follows. 

I.  Vector  addition  associates  with  every  pair  of  vectors  a and  b of  V a unique 
vector  of  V,  called  the  sum  of  a and  b and  denoted  by  a + b,  such  that  the  following 
axioms  are  satisfied. 
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1.1  Commutativity.  For  any  two  vectors  a and  b of  V, 

a + b = b + a. 

1.2  Associativity.  For  any  three  vectors  a,  b,  c of  V, 


(a  + b)  + c = a + (b  + c)  (written  a + b + c). 

1.3  There  is  a unique  vector  in  V,  called  the  zero  vector  and  denoted  by  0,  such 
that  for  every  a in  V, 


a + 0 = a. 


1.4  For  every  a in  V there  is  a unique  vector  in  V that  is  denoted  by  —a  and  is 
such  that 


a + (-a)  = 0. 


II.  Scalar  multiplication.  The  real  numbers  are  called  scalars.  Scalar 
multiplication  associates  with  every  a in  V and  every  scalar  c a unique  vector  of  V, 
called  the  product  of  c and  a and  denoted  by  ca  (or  ac)  such  that  the  following 
axioms  are  satisfied. 

11.1  Distributivity.  For  every  scalar  c and  vectors  a and  b in  V, 

c(a  + b)  = ca  + cb. 

11.2  Distributivity.  For  all  scalars  c and  k and  every  a in  V, 

(c  + k)  a = ca  + k a. 

11.3  Associativity.  For  all  scalars  c and  k and  every  a in  V, 

c(ki\)  = (ck)a  (written  cka). 

11.4  For  every  a in  V, 

la  = a. 


If,  in  the  above  definition,  we  take  complex  numbers  as  scalars  instead  of  real  numbers, 
we  obtain  the  axiomatic  definition  of  a complex  vector  space. 

Take  a look  at  the  axioms  in  the  above  definition.  Each  axiom  stands  on  its  own:  It 
is  concise,  useful,  and  it  expresses  a simple  property  of  V.  There  are  as  few  axioms  as 
possible  and  together  they  express  all  the  desired  properties  of  V.  Selecting  good  axioms 
is  a process  of  trial  and  error  that  often  extends  over  a long  period  of  time.  But  once 
agreed  upon,  axioms  become  standard  such  as  the  ones  in  the  definition  of  a real  vector 
space. 


SEC.  7.9  Vector  Spaces,  Inner  Product  Spaces,  Linear  Transformations  Optional 


311 


EXAMPLE  1 


EXAMPLE  2 


The  following  concepts  related  to  a vector  space  are  exactly  defined  as  those  given  in 
Sec.  7.4.  Indeed,  a linear  combination  of  vectors  a( \ >,  • • • , U(m)  in  a vector  space  V is  an 
expression 


ciaa)  + • • • + cmam  (ci,  ■■■  ,cm  any  scalars). 

These  vectors  form  a linearly  independent  set  (briefly,  they  are  called  linearly 
independent)  if 

(1)  cia(i)  + • ■ • + cm3(m)  0 

implies  that  c\  = 0,  • • ■ , cm  = 0.  Otherwise,  if  (1)  also  holds  with  scalars  not  all  zero,  the 
vectors  are  called  linearly  dependent. 

Note  that  (1)  with  m =1  is  ca  = 0 and  shows  that  a single  vector  a is  linearly 
independent  if  and  only  if  a A 0. 

V has  dimension  n,  or  is  n -dimensional,  if  it  contains  a linearly  independent  set  of  n 
vectors,  whereas  any  set  of  more  than  n vectors  in  V is  linearly  dependent.  That  set  of 
n linearly  independent  vectors  is  called  a basis  for  V.  Then  every  vector  in  V can  be 
written  as  a linear  combination  of  the  basis  vectors.  Furthermore,  for  a given  basis,  this 
representation  is  unique  (see  Prob.  2). 


Vector  Space  of  Matrices 

The  real  2X2  matrices  form  a four-dimensional  real  vector  space.  A basis  is 


T o 

o T 

’ O 

o 

0 o' 

Bn  = 

0 0 

b12  - 

0 °, 

. B2i  - 

1 0 

> b22  - 

0 1 

because  any  2X2  matrix  A = [< a has  a unique  representation  A = flnBn  + 012B12  + ^21^21  + 022^22- 
Similarly,  the  real  m X n matrices  with  fixed  m and  n form  an  mn-dimensional  vector  space.  What  is  the 
dimension  of  the  vector  space  of  all  3 X 3 skew- symmetric  matrices?  Can  you  find  a basis? 


Vector  Space  of  Polynomials 

The  set  of  all  constant,  linear,  and  quadratic  polynomials  in  x together  is  a vector  space  of  dimension  3 with 
basis  {l,x,x2}  under  the  usual  addition  and  multiplication  by  real  numbers  because  these  two  operations  give 
polynomials  not  exceeding  degree  2.  What  is  the  dimension  of  the  vector  space  of  all  polynomials  of  degree 
not  exceeding  a given  fixed  n?  Can  you  find  a basis? 


If  a vector  space  V contains  a linearly  independent  set  of  n vectors  for  every  n,  no  matter 
how  large,  then  V is  called  infinite  dimensional,  as  opposed  to  a finite  dimensional 
(n-dimensional)  vector  space  just  defined.  An  example  of  an  infinite  dimensional  vector 
space  is  the  space  of  all  continuous  functions  on  some  interval  [a,  b\  of  the  x-axis,  as  we 
mention  without  proof. 


Inner  Product  Spaces 

If  a and  b are  vectors  in  Rn,  regarded  as  column  vectors,  we  can  form  the  product  aTb. 
This  is  a 1 X 1 matrix,  which  we  can  identify  with  its  single  entry,  that  is,  with  a number. 
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This  product  is  called  the  inner  product  or  dot  product  of  a and  b.  Other  notations  for 
it  are  (a,  b)  and  a • b.  Thus 


aTb  = (a,  b)  = a • b = [a\  ■ ■ ■ an\ 


n 

2 alh  = aih  + ■ • • + <*nbn- 

i=  1 


We  now  extend  this  concept  to  general  real  vector  spaces  by  taking  basic  properties  of 
(a,  b)  as  axioms  for  an  “abstract  inner  product”  (a,  b)  as  follows. 


DEFINITION 


Real  Inner  Product  Space 

A real  vector  space  V is  called  a real  inner  product  space  (or  real  pre-Hilbert 4 
space ) if  it  has  the  following  property.  With  every  pair  of  vectors  a and  b in  V there 
is  associated  a real  number,  which  is  denoted  by  (a,  b)  and  is  called  the  inner 
product  of  a and  b,  such  that  the  following  axioms  are  satisfied. 

I.  For  all  scalars  cp  and  q2  and  all  vectors  a,  b,  c in  V, 

i a + 172b,  c)  = <?i(a,  c)  + qofb,  c)  ( Linearity ). 

II.  For  all  vectors  a and  b in  V, 


(a,  b)  = (b,  a) 


(Symmetry). 


III.  For  every  a in  V, 


(a,  a) 


(a,  a)  g 0, 

0 if  and  only  if  a = 0 


(Positive-definiteness). 


Vectors  whose  inner  product  is  zero  are  called  orthogonal. 
The  length  or  norm  of  a vector  in  V is  defined  by 

(2)  ||  a ||  = V(a,  a)  (it  0). 

A vector  of  norm  1 is  called  a unit  vector. 


4DAVID  HILBERT  (1862-1943),  great  German  mathematician,  taught  at  Konigsberg  and  Gottingen  and  was 
the  creator  of  the  famous  Gottingen  mathematical  school.  He  is  known  for  his  basic  work  in  algebra,  the  calculus 
of  variations,  integral  equations,  functional  analysis,  and  mathematical  logic.  His  “Foundations  of  Geometry” 
helped  the  axiomatic  method  to  gain  general  recognition.  His  famous  23  problems  (presented  in  1900  at  the 
International  Congress  of  Mathematicians  in  Paris)  considerably  influenced  the  development  of  modem 
mathematics. 

If  V is  finite  dimensional,  it  is  actually  a so-called  Hilbert  space ; see  [GenRef7],  p.  128,  listed  in  App.  1. 
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EXAMPLE  3 


EXAMPLE  4 


From  these  axioms  and  from  (2)  one  can  derive  the  basic  inequality 

(3)  | (a,  b) | S§||a||||b||  (Cauchy-Schwarz5  inequality). 
From  this  follows 

(4)  ||  a + b ||  Si  ||  a ||  + ||b||  ( Triangle  inequality). 

A simple  direct  calculation  gives 

(5)  ||  a + b||2  + ||  a — b||2  = 2(||a||2  + ||b||2)  ( Parallelogram  equality ). 


n-Dimensional  Euclidean  Space 

Rn  with  the  inner  product 

(6)  (a,  b)  = aTb  = a^b^  + ■ ■ ■ + anbn 

(where  both  a and  b are  column  vectors)  is  called  the  a-dimensional  Euclidean  space  and  is  denoted  by  fi"  or 
again  simply  by  Rn.  Axioms  I — III  hold,  as  direct  calculation  shows.  Equation  (2)  gives  the  '‘Euclidean  norm” 

(7)  ||  a ||  = V(a,  a)  = Vaht  = Vaf  + ■ • • + a%. 

An  Inner  Product  for  Functions.  Function  Space 

The  set  of  all  real- valued  continuous  functions  fix),  g(x),  ■ ■ ■ on  a given  interval  a = x = fi  is  a real  vector 
space  under  the  usual  addition  of  functions  and  multiplication  by  scalars  (real  numbers).  On  this  “function 
space”  we  can  define  an  inner  product  by  the  integral 


(8) 


if,  8)  = | fix)  g(x)  dx. 


Axioms  I — III  can  be  verified  by  direct  calculation.  Equation  (2)  gives  the  norm 


(9)  ll/ll  = VoT)  = 

Our  examples  give  a first  impression  of  the  great  generality  of  the  abstract  concepts  of 
vector  spaces  and  inner  product  spaces.  Further  details  belong  to  more  advanced  courses 
(on  functional  analysis,  meaning  abstract  modern  analysis;  see  [GenRef7]  listed  in  App. 
1)  and  cannot  be  discussed  here.  Instead  we  now  take  up  a related  topic  where  matrices 
play  a central  role. 

Linear  Transformations 

Let  X and  Y be  any  vector  spaces.  To  each  vector  x in  X we  assign  a unique  vector  y in 
Y.  Then  we  say  that  a mapping  (or  transformation  or  operator)  of  X into  Y is  given. 
Such  a mapping  is  denoted  by  a capital  letter,  say  F.  The  vector  y in  Y assigned  to  a vector 
x in  X is  called  the  image  of  x under  F and  is  denoted  by  F (v)  [or  Fx,  without  parentheses] . 


5HERMANN  AMANDUS  SCHWARZ  (1843-1921).  German  mathematician,  known  by  his  work  in  complex 
analysis  (conformal  mapping)  and  differential  geometry.  For  Cauchy  see  Sec.  2.5. 
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F is  called  a linear  mapping  or  linear  transformation  if,  for  all  vectors  v and  x in  X 
and  scalars  c. 


(10) 


F(y  + x)  = F(v)  + F(x) 
F(cx)  = cF(x). 


Linear  Transformation  of  Space  Rn  into  Space  Rm 

From  now  on  we  let  X = Rn  and  Y = Rm.  Then  any  real  m X n matrix  A = [ajk]  gives 
a transformation  of  Rn  into  Rm\ 

(11)  y = Ax. 

Since  A(u  + x)  = Au  + Ax  and  A(cx)  = cAx,  this  transformation  is  linear. 

We  show  that,  conversely,  every  linear  transformation  F of  R"  into  Rm  can  be  given 
in  terms  of  an  m X n matrix  A,  after  a basis  for  R"  and  a basis  for  Rm  have  been  chosen. 
This  can  be  proved  as  follows. 

Let  e(1J,  • • • , e(n)  be  any  basis  for  Rn.  Then  every  x in  R"  has  a unique  representation 

x .viGq)  ■ ■ • + xne(Tl). 

Since  F is  linear,  this  representation  implies  for  the  image  F(x): 


F(x)  = F(x  ieQ)  + • ■ • + xne(n))  = XiFled))  + ■ ■ • + xnF(e(n)). 


Hence  F is  uniquely  determined  by  the  images  of  the  vectors  of  a basis  for  Rn.  We  now 
choose  for  Rn  the  “standard  basis” 


(12) 


1 

0 

0 

0 

1 

0 

0 

> e(2)  - 

0 

> ‘ ‘ ‘ > ®(n) 

0 

0 

0 

1 

where  e<;j)  has  its  y'th  component  equal  to  1 and  all  others  0.  We  show  that  we  can  now 
determine  an  m X n matrix  A = [ajk]  such  that  for  every  x in  Rn  and  image  y = F(x)  in 


y = F(x)  = Ax. 

Indeed,  from  the  image  ycl)  = F(eQ))  of  e(1)  we  get  the  condition 


>" 

a ii 

@ln 

i 

= 

a21 

^2  n 

0 

ym 

®mm 

0 
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EXAMPLE  5 


EXAMPLE  6 


from  which  we  can  determine  the  first  column  of  A,  namely  an  = \'i  , a2\  = >4  , ■ , 

am i = y'jJ.  Similarly,  from  the  image  of  e®)  we  get  the  second  column  of  A,  and  so  on. 
This  completes  the  proof. 


We  say  that  A represents  F,  or  is  a representation  of  F,  with  respect  to  the  bases  for  Rn 
and  Rm.  Quite  generally,  the  purpose  of  a “representation”  is  the  replacement  of  one 
object  of  study  by  another  object  whose  properties  are  more  readily  apparent. 

In  three-dimensional  Euclidean  space  E 3 the  standard  basis  is  usually  written  e(1)  = i, 
C(2)  = j e(3)  = k.  Thus, 


(13) 


T 

~0~ 

”o~ 

0 

. j = 

1 

, k = 

0 

0 

0 

1 

These  are  the  three  unit  vectors  in  the  positive  directions  of  the  axes  of  the  Cartesian 
coordinate  system  in  space,  that  is,  the  usual  coordinate  system  with  the  same  scale  of 
measurement  on  the  three  mutually  perpendicular  coordinate  axes. 


Linear  Transformations 

Interpreted  as  transformations  of  Cartesian  coordinates  in  the  plane,  the  matrices 


0 

1 

T o 

-i  o' 

a 

o' 

1 

0 

o -1. 

0 -1 

0 

1 

represent  a reflection  in  the  line  *2  = *1,  a reflection  in  the  xi-axis,  a reflection  in  the  origin,  and  a stretch 
(when  a > 1,  or  a contraction  when  0 < a < 1)  in  the  xi-direction,  respectively. 


Linear  Transformations 

Our  discussion  preceding  Example  5 is  simpler  than  it  may  look  at  first  sight.  To  see  this,  find  A representing 
the  linear  transformation  that  maps  (xj.,  X2)  onto  ( 2xi  — 5^2,  3xi  + 4x2). 

Solution.  Obviously,  the  transformation  is 


yi  = 2x1  - 5x2 
y2  = 3xi  + 4x2. 


From  this  we  can  directly  see  that  the  matrix  is 


in 

1 

<N 

Check: 

yi 

'2 

-5' 

*1 

2*i  - 5*2 

3 4_ 

72. 

3 

4 

.*2. 

3*1  + 4*2 

If  A in  (11)  is  square,  n X n,  then  (11)  maps  Rn  into  Rn.  If  this  A is  nonsingular,  so  that 
A-1  exists  (see  Sec.  7.8),  then  multiplication  of  (11)  by  A-1  from  the  left  and  use  of 
A-1  A = I gives  the  inverse  transformation 


(14) 


x = A-y 


It  maps  every  y = yo  onto  that  x,  which  by  (1 1)  is  mapped  onto  Vq.  The  inverse  of  a linear 
transformation  is  itself  linear,  because  it  is  given  by  a matrix,  as  (14)  shows. 
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EXAMPLE  7 


Composition  of  Linear  Transformations 

We  want  to  give  you  a flavor  of  how  linear  transformations  in  general  vector  spaces  work. 
You  will  notice,  if  you  read  carefully,  that  definitions  and  verifications  (Example  7)  strictly 
follow  the  given  rules  and  you  can  think  your  way  through  the  material  by  going  in  a 
slow  systematic  fashion. 

The  last  operation  we  want  to  discuss  is  composition  of  linear  transformations.  Let  X, 
Y,  W be  general  vector  spaces.  As  before,  let  F he  a linear  transformation  from  X to  Y. 
Let  G be  a linear  transformation  from  W to  X.  Then  we  denote,  by  //,  the  composition 
of  F and  G,  that  is, 


H = F ° G = FG  = F(G), 

which  means  we  take  transformation  G and  then  apply  transformation  F to  it  (in  that 
order!,  i.e.  you  go  from  left  to  right). 

Now,  to  give  this  a more  concrete  meaning,  if  we  let  w be  a vector  in  W,  then  G(w) 
is  a vector  in  X and  F(G( w))  is  a vector  in  Y.  Thus,  FI  maps  W to  Y,  and  we  can  write 

(15)  H(w)  = (F  ° G ) (w)  = (FG)  (w)  = F(G( w)), 

which  completes  the  definition  of  composition  in  a general  vector  space  setting.  But  is 
composition  really  linear?  To  check  this  we  have  to  verify  that  FI,  as  defined  in  (15),  obeys 
the  two  equations  of  (10). 


The  Composition  of  Linear  Transformations  Is  Linear 

To  show  that  H is  indeed  linear  we  must  show  that  (10)  holds.  We  have,  for  two  vectors  Wi,  W2  in  W, 

H( Wi  + w2)  = (F  o G)(Wi  + w2) 


= A(G(Wi  + w2)) 

= F(G(Wl)  + G(w2)) 

(by  linearity  of  G) 

= F(G(Wl))  + F(G(w2)) 

(by  linearity  of  F) 

= (F°G)(Wi)  + (F  » G)(w2) 

(by  (15)) 

= H(  wi)  + H{  w2) 

(by  definition  of  H). 

Similarly,  fl(cw2)  = (F°  G)(cw2)  = F(G(cw2))  = F(c(G( w2)) 
= cF(G( w2))  = c(F  ° G)(w2)  = cH( w2). 


We  defined  composition  as  a linear  transformation  in  a general  vector  space  setting  and 
showed  that  the  composition  of  linear  transformations  is  indeed  linear. 

Next  we  want  to  relate  composition  of  linear  transformations  to  matrix  multiplication. 

To  do  so  we  let  X = Rn,  Y = Rm,  and  W = Rp.  This  choice  of  particular  vector  spaces 
allows  us  to  represent  the  linear  transformations  as  matrices  and  form  matrix  equations, 
as  was  done  in  (1 1).  Thus  F can  be  represented  by  a general  real  m X n matrix  A = [a:jp] 
and  G by  an  n X p matrix  B = [h-jp-].  Then  we  can  write  for  F,  with  column  vectors  x 
with  n entries,  and  resulting  vector  y,  with  m entries 


(16) 


y = Ax 
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EXAMPLE  8 


and  similarly  for  G,  with  column  vector  w with  p entries, 
(17)  x = Bw. 


Substituting  (17)  into  (16)  gives 

(18)  y = Ax  = A(Bw)  = (AB)w  = ABw  = Cw  where  C = AB. 

This  is  (15)  in  a matrix  setting,  this  is,  we  can  define  the  composition  of  linear  transfor- 
mations in  the  Euclidean  spaces  as  multiplication  by  matrices.  Hence,  the  real  m X p 
matrix  C represents  a linear  transformation  H which  maps  Rp  to  Rn  with  vector  w,  a 
column  vector  with  p entries. 


Remarks.  Our  discussion  is  similar  to  the  one  in  Sec.  7.2,  where  we  motivated  the 
“unnatural”  matrix  multiplication  of  matrices.  Look  back  and  see  that  our  current,  more 
general,  discussion  is  written  out  there  for  the  case  of  dimension  m = 2,  n = 2,  and  p = 2. 
(You  may  want  to  write  out  our  development  by  picking  small  distinct  dimensions,  such 
as  m = 2,  n = 3,  and  p = 4,  and  writing  down  the  matrices  and  vectors.  This  is  a trick 
of  the  trade  of  mathematicians  in  that  we  like  to  develop  and  test  theories  on  smaller 
examples  to  see  that  they  work.) 


Linear  Transformations.  Composition 

In  Example  5 of  Sec.  7.9,  let  A be  the  first  matrix  and  B be  the  fourth  matrix  with  a > 1.  Then,  applying  B to 
a vector  w = [wi  , stretches  the  element  w\  by  a in  the  X\  direction.  Next,  when  we  apply  A to  the 
“stretched”  vector,  we  reflect  the  vector  along  the  line  xi  = X2,  resulting  in  a vector  y = [w?2  awi]  . But  this 
represents,  precisely,  a geometric  description  for  the  composition  H of  two  linear  transformations  F and  G 
represented  by  matrices  A and  B.  We  now  show  that,  for  this  example,  our  result  can  be  obtained  by 
straightforward  matrix  multiplication,  that  is. 


and  as  in  (18)  calculate 


o T 

a ()' 

0 f 

1 0 

0 1 

a 0 

’o 

r 

Wi 

w2 

a 

0 

w2 

awi 

which  is  the  same  as  before.  This  shows  that  indeed  AB  = C,  and  we  see  the  composition  of  linear 
transformations  can  be  represented  by  a linear  transformation.  It  also  shows  that  the  order  of  matrix  multiplication 
is  important  (!).  You  may  want  to  try  applying  A first  and  then  B,  resulting  in  BA.  What  do  you  see?  Does  it 
make  geometric  sense?  Is  it  the  same  result  as  AB? 


We  have  learned  several  abstract  concepts  such  as  vector  space,  inner  product  space, 
and  linear  transformation.  The  introduction  of  such  concepts  allows  engineers  and 
scientists  to  communicate  in  a concise  and  common  language.  For  example,  the  concept 
of  a vector  space  encapsulated  a lot  of  ideas  in  a very  concise  manner.  For  the  student, 
learning  such  concepts  provides  a foundation  for  more  advanced  studies  in  engineering. 

This  concludes  Chapter  7.  The  central  theme  was  the  Gaussian  elimination  of  Sec.  7.3 
from  which  most  of  the  other  concepts  and  theory  flowed.  The  next  chapter  again  has  a 
central  theme,  that  is,  eigenvalue  problems,  an  area  very  rich  in  applications  such  as  in 
engineering,  modern  physics,  and  other  areas. 
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1.  Basis.  Find  three  bases  of  R1 2. 

2.  Uniqueness.  Show  that  the  representation  v = CiaQ) 
+ • • • + cna(n)  of  any  given  vector  in  an  n-dimensional 
vector  space  V in  terms  of  a given  basis  a(i),  ■ ■ • , a(n) 
for  V is  unique.  Hint.  Take  two  representations  and 
consider  the  difference. 


3-10 


VECTOR  SPACE 


(More  problems  in  Problem  Set  9.4.)  Is  the  given  set,  taken 
with  the  usual  addition  and  scalar  multiplication,  a vector 
space?  Give  reason.  If  your  answer  is  yes,  find  the  dimen- 
sion and  a basis. 


3.  All  vectors  in  R3 4  satisfying  —v±  + 2v2  + 3p3  = 0, 
—4V\  + v2  + V3  = 0. 


4.  All  skew-symmetric  3X3  matrices. 

5.  All  polynomials  in  * of  degree  4 or  less  with 
nonnegative  coefficients. 

6.  All  functions  y (*)  = a cos  2 x + b sin  2x  with  arbitrary 
constants  a and  b. 


7.  All  functions  y (*)  = (ax  + b)e  x with  any  constant  a 
and  b. 


8.  All  n X n matrices  A with  fixed  n and  det  A = 0. 

9.  All  2X2  matrices  [a^]  with  + a2 2 = 0. 

10.  All  3X2  matrices  [ajk]  with  first  column  any  multiple 

of  [3 


0 — 5]t. 


11-14 


LINEAR  TRANSFORMATIONS 


Find  the  inverse  transformation.  Show  the  details. 

11.  yx  = 0.5*  1 — 0.5*2  12.  Vi  = 3*i  + 2*2 


y2  = 1.5*1  “ 2.5*2 


y2  = 4*!  + *2 


13.  yi  = 5*i  + 3*2  — 3*3 
y2  = 3*i  + 2*2  - 2*3 
y3  = 2*1  - *2  + 2*3 

14.  yi  = 0.2*i  — 0.1*2 


y2  = “ 0.2*2  + 0.1*3 

y3  = 0.1*i  + 0.1*3 


15-20 

EUCLIDEAN  NORM 

Find  the  Euclidean  norm  of  the  vectors: 

15.  [3 

1 -4]T  16.  [h  h 

1 

2 

4]T 

17.  [1 

0 0 1-1  0-1 

i]T 

18.  [-4 

8 -1]T  19.  [1  | 

1 

3 

0]T 

20.  [1 

1 1 liT 

2 2 2J 

21-25 


INNER  PRODUCT.  ORTHOGONALITY 


21.  Orthogonality.  For  what  value(s)  of  k are  the  vectors 
[2  g —4  0]T  and  [5  k 0 |]T  orthogonal? 


22.  Orthogonality.  Find  all  vectors  in  R3  orthogonal  to 
[2  0 1].  Do  they  form  a vector  space? 


23.  Triangle  inequality.  Verify  (4)  for  the  vectors  in 
Probs.  15  and  18. 


24.  Cauchy-Schwarz  inequality.  Verify  (3)  for  the 
vectors  in  Probs.  16  and  19. 


25.  Parallelogram  equality.  Verify  (5)  for  the  first  two 
column  vectors  of  the  coefficient  matrix  in  Prob.  13. 


T I O N S AND  PROBLEMS 


1.  What  properties  of  matrix  multiplication  differ  from 
those  of  the  multiplication  of  numbers? 

2.  Let  A be  a 100  X 100  matrix  and  B a 100  X 50  matrix. 
Are  the  following  expressions  defined  or  not?  A + B, 
A2,  B2,  AB,  BA,  AAt,  BtA,  BtB,  BBt,  BtAB  Give 
reasons. 

3.  Are  there  any  linear  systems  without  solutions?  With 
one  solution?  With  more  than  one  solution?  Give 
simple  examples. 

4.  Let  C be  10  X 10  matrix  and  a a column  vector  with 
10  components.  Are  the  following  expressions  defined 
or  not?  Ca,  CTa,  CaT,  aC,  aTC,  (CaT)T. 


5.  Motivate  the  definition  of  matrix  multiplication. 

6.  Explain  the  use  of  matrices  in  linear  transformations. 

7.  How  can  you  give  the  rank  of  a matrix  in  terms  of  row 
vectors?  Of  column  vectors?  Of  determinants? 

8.  What  is  the  role  of  rank  in  connection  with  solving 
linear  systems? 

9.  What  is  the  idea  of  Gauss  elimination  and  back 
substitution? 

10.  What  is  the  inverse  of  a matrix?  When  does  it  exist? 
How  would  you  determine  it? 
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11-20  MATRIX  AND  VECTOR  CALCULATIONS 

Showing  the  details,  calculate  the  following  expressions  or 
give  reason  why  they  are  not  defined,  when 


3 1 

-3 

0 

4 1 

A = 

1 4 

2 

, B = 

-4 

O 

1 

to 

-3  2 

5 

-1 

1 

o 

<N 

2 

7 

u = 

0 

, v = 

-3 

-5 

3 

11.  AB,  BA 
13.  Au,  utA 
15.  utAu,  vtBv 
17.  det  A,  det  A2,  (det  A)2,  det  B 


18.  (A2)-1,  (A-1)2 

20.  (A  + At)(B  - Bt) 


12.  A1,  B1 
14.  uTv,  uv 
16.  A-1,  B~ 
)2,  det  1 
19.  AB  - BA 


21-28 


LINEAR  SYSTEMS 

Showing  the  details,  find  all  solutions  or  indicate  that  no 
solution  exists. 

21.  4y  + z = 0 

12*  - 5y  - 3z  = 34 
— 6x  + 4z  = 8 

22.  5x  - 3y  + z = 7 
2x  + 3y  — z = 0 
8x  + 9y  — 3z  = 2 

23.  9x  + 3y  — 6z  = 60 
2x  — 4y  + 8z  = 4 

24.  — 6x  + 39y  - 9z  = -12 

2x  - 13y  + 3z  = 4 

25.  0.3x  — 0.7y  + 1.3z  = 3.24 

0.9y  - 0.8z  = -2.53 
0.7z=  1.19 

26.  2x  + 3y  - 7z  = 3 
— 4x  — 6y  + 14z  = 7 


27.  x + 2y  = 6 

3x  + 5y  = 20 

— 4x  + y = —42 

28.  — 8x  + 2z  = 1 

6v  + 4z  = 3 

12x  + 2y  =2 

RANK 


29-32 


Determine  the  ranks  of  the  coefficient  matrix  and  the 
augmented  matrix  and  state  how  many  solutions  the  linear 
system  will  have. 

29.  In  Prob.  23 

30.  In  Prob.  24 

31.  In  Prob.  27 

32.  In  Prob.  26 


33-35 


NETWORKS 


Find  the  currents. 
33.  20  n 


34. 


220  V 


10  V 


130  V 
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summary  of  chapter  .7  = 

Linear  Algebra:  Matrices,  Vectors,  Determinants. 
Linear  Systems 


An  m X n matrix  A = [ajk]  is  a rectangular  array  of  numbers  or  functions 
(“entries,”  “elements”)  arranged  in  m horizontal  rows  and  n vertical  columns.  If 
m = n,  the  matrix  is  called  square.  A 1 X n matrix  is  called  a row  vector  and  an 
m X 1 matrix  a column  vector  (Sec.  7.1). 

The  sum  A + B of  matrices  of  the  same  size  (i.e.,  both  m X n)  is  obtained  by 
adding  corresponding  entries.  The  product  of  A by  a scalar  c is  obtained  by 
multiplying  each  cij^  by  c (Sec.  7.1). 

The  product  C = AB  of  an  m X n matrix  A by  an  r X p matrix  B = [ bp-]  is 
defined  only  when  r = n,  and  is  the  in  X p matrix  C = [cjk]  with  entries 


(1) 


cjk  Gjlblk  "F  aj2^2k  + ' ’ ' + Qjmfink 


(row  j of  A times 
column  k of  B). 


This  multiplication  is  motivated  by  the  composition  of  linear  transformations 
(Secs.  7.2,  7.9).  It  is  associative,  but  is  not  commutative:  if  AB  is  defined,  BA  may 
not  be  defined,  but  even  if  BA  is  defined,  AB  A BA  in  general.  Also  AB  = 0 may 
not  imply  A = 0 or  B = 0 or  BA  = 0 (Secs.  7.2,  7.8).  Illustrations: 
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-1 

-1 

3' 

’3' 

3 6' 

[1  2] 

4 

= [11], 

4 

[1  2]  = 

4 8 

The  transpose  AT  of  a matrix  A = [ap,-]  is  AT  = rows  become  columns 
and  conversely  (Sec.  7.2).  Here,  A need  not  be  square.  If  it  is  and  A = AT,  then  A 
is  called  symmetric;  if  A = — AT,  it  is  called  skew-symmetric.  For  a product, 
(AB)t  = BtAt  (Sec.  7.2). 

A main  application  of  matrices  concerns  linear  systems  of  equations 

(2)  Ax  = b (Sec.  7.3) 

( m equations  in  n unknowns  x±,  ■ ■ ■ , xn;  A and  b given).  The  most  important  method 
of  solution  is  the  Gauss  elimination  (Sec.  7.3),  which  reduces  the  system  to 
“triangular”  form  by  elementary  row  operations,  which  leave  the  set  of  solutions 
unchanged.  (Numeric  aspects  and  variants,  such  as  Doolittle’s  and  Cholesky’s 
methods,  are  discussed  in  Secs.  20.1  and  20.2.) 
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Cramer’s  rule  (Secs.  7.6,  7.7)  represents  the  unknowns  in  a system  (2)  of  n 
equations  in  n unknowns  as  quotients  of  determinants;  for  numeric  work  it  is 
impractical.  Determinants  (Sec.  7.7)  have  decreased  in  importance,  but  will  retain 
their  place  in  eigenvalue  problems,  elementary  geometry,  etc. 

The  inverse  A-1  of  a square  matrix  satisfies  AA-1  = A-1  A = I.  It  exists  if  and 
only  if  det  A ¥=  0.  It  can  be  computed  by  the  Gauss-Jordan  elimination  (Sec.  7.8). 

The  rank  r of  a matrix  A is  the  maximum  number  of  linearly  independent  rows 
or  columns  of  A or,  equivalently,  the  number  of  rows  of  the  largest  square  submatrix 
of  A with  nonzero  determinant  (Secs.  7.4,  7.7). 

The  system  (2)  has  solutions  if  and  only  if  rank  A = rank  [A  b],  where  [A  b] 
is  the  augmented  matrix  (Fundamental  Theorem,  Sec.  7.5). 

The  homogeneous  system 

(3)  Ax  = 0 

has  solutions  x =£  0 (“nontrivial  solutions”)  if  and  only  if  rank  A < n,  in  the  case 
m = n equivalently  if  and  only  if  det  A = 0 (Secs.  7.6,  7.7). 

Vector  spaces,  inner  product  spaces,  and  linear  transformations  are  discussed  in 
Sec.  7.9.  See  also  Sec.  7.4. 


CHAPTER  8 


Linear  Algebra: 

Matrix  Eigenvalue  Problems 

A matrix  eigenvalue  problem  considers  the  vector  equation 
(1)  Ax  = Ax. 

Here  A is  a given  square  matrix,  A an  unknown  scalar,  and  x an  unknown  vector.  In  a 
matrix  eigenvalue  problem,  the  task  is  to  determine  A’s  and  x’s  that  satisfy  (1).  Since 
x = 0 is  always  a solution  for  any  A and  thus  not  interesting,  we  only  admit  solutions 
with  x A 0. 

The  solutions  to  (1)  are  given  the  following  names:  The  A’s  that  satisfy  (1)  are  called 
eigenvalues  of  A and  the  corresponding  nonzero  x’s  that  also  satisfy  (1)  are  called 

eigenvectors  of  A. 

From  this  rather  innocent  looking  vector  equation  flows  an  amazing  amount  of  relevant 
theory  and  an  incredible  richness  of  applications.  Indeed,  eigenvalue  problems  come  up 
all  the  time  in  engineering,  physics,  geometry,  numerics,  theoretical  mathematics,  biology, 
environmental  science,  urban  planning,  economics,  psychology,  and  other  areas.  Thus,  in 
your  career  you  are  likely  to  encounter  eigenvalue  problems. 

We  start  with  a basic  and  thorough  introduction  to  eigenvalue  problems  in  Sec.  8. 1 and 
explain  (1)  with  several  simple  matrices.  This  is  followed  by  a section  devoted  entirely 
to  applications  ranging  from  mass-spring  systems  of  physics  to  population  control  models 
of  environmental  science.  We  show  you  these  diverse  examples  to  train  your  skills  in 
modeling  and  solving  eigenvalue  problems.  Eigenvalue  problems  for  real  symmetric, 
skew-symmetric,  and  orthogonal  matrices  are  discussed  in  Sec.  8.3  and  their  complex 
counterparts  (which  are  important  in  modern  physics)  in  Sec.  8.5.  In  Sec.  8.4  we  show 
how  by  diagonalizing  a matrix,  we  obtain  its  eigenvalues. 

COMMENT.  Numerics  for  eigenvalues  (Secs.  20.6-20.9)  can  be  studied  immediately 
after  this  chapter. 

Prerequisite:  Chap.  7. 

Sections  that  may  be  omitted  in  a shorter  course:  8.4,  8.5. 

References  and  Answers  to  Problems:  App.  1 Part  B,  App.  2. 
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The  following  chart  identifies  where  different  types  of  eigenvalue  problems  appear  in  the 
book. 


Topic 

Where  to  find  it 

Matrix  Eigenvalue  Problem  (algebraic  eigenvalue  problem) 

Chap.  8 

Eigenvalue  Problems  in  Numerics 

Secs.  20.6-20.9 

Eigenvalue  Problem  for  ODEs  (Sturm-Liouville  problems) 

Secs.  11.5,  11.6 

Eigenvalue  Problems  for  Systems  of  ODEs 

Chap.  4 

Eigenvalue  Problems  for  PDEs 

Secs.  12.3-12.11 

8.1  The  Matrix  Eigenvalue  Problem.  Determining 
Eigenvalues  and  Eigenvectors 

Consider  multiplying  nonzero  vectors  by  a given  square  matrix,  such  as 


1 

ON 

LO 

’5' 

’33' 

6 3" 

y 

’30" 

4 7 

1 

27 

* 

4 7 

4 

40_ 

We  want  to  see  what  influence  the  multiplication  of  the  given  matrix  has  on  the  vectors. 
In  the  first  case,  we  get  a totally  new  vector  with  a different  direction  and  different  length 
when  compared  to  the  original  vector.  This  is  what  usually  happens  and  is  of  no  interest 
here.  In  the  second  case  something  interesting  happens.  The  multiplication  produces  a 
vector  [30  40]T  = 10  [3  4]T,  which  means  the  new  vector  has  the  same  direction  as 

the  original  vector.  The  scale  constant,  which  we  denote  by  A is  10.  The  problem  of 
systematically  finding  such  A’s  and  nonzero  vectors  for  a given  square  matrix  will  be  the 
theme  of  this  chapter.  It  is  called  the  matrix  eigenvalue  problem  or,  more  commonly,  the 
eigenvalue  problem. 

We  formalize  our  observation.  Let  A = [a.jf  be  a given  nonzero  square  matrix  of 
dimension  n X n.  Consider  the  following  vector  equation: 

(1)  Ax  = Ax. 

The  problem  of  finding  nonzero  x’s  and  A’s  that  satisfy  equation  (1 ) is  called  an  eigenvalue 
problem. 

Remark.  So  A is  a given  square  (!)  matrix,  x is  an  unknown  vector,  and  A is  an 
unknown  scalar.  Our  task  is  to  find  A’s  and  nonzero  x’s  that  satisfy  (1).  Geometrically, 
we  are  looking  for  vectors,  x,  for  which  the  multiplication  by  A has  the  same  effect  as 
the  multiplication  by  a scalar  A;  in  other  words,  Ax  should  be  proportional  to  x.  Thus, 
the  multiplication  has  the  effect  of  producing,  from  the  original  vector  x,  a new  vector 
Ax  that  has  the  same  or  opposite  (minus  sign)  direction  as  the  original  vector.  (This  was 
all  demonstrated  in  our  intuitive  opening  example.  Can  you  see  that  the  second  equation  in 
that  example  satisfies  (1)  with  A = 10  and  x = [3  4]T,  and  A the  given  2X2  matrix? 
Write  it  out.)  Now  why  do  we  require  x to  be  nonzero?  The  reason  is  that  x = 0 is 
always  a solution  of  (1)  for  any  value  of  A,  because  AO  = 0.  This  is  of  no  interest. 
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We  introduce  more  terminology.  A value  of  A,  for  which  (1)  has  a solution  x A 0,  is 
called  an  eigenvalue  or  characteristic  value  of  the  matrix  A.  Another  term  for  A is  a latent 
root.  (“Eigen”  is  German  and  means  “proper”  or  “characteristic.”).  The  corresponding 
solutions  x A 0 of  (1)  are  called  the  eigenvectors  or  characteristic  vectors  of  A 
corresponding  to  that  eigenvalue  A.  The  set  of  all  the  eigenvalues  of  A is  called  the 
spectrum  of  A.  We  shall  see  that  the  spectrum  consists  of  at  least  one  eigenvalue  and  at 
most  of  n numerically  different  eigenvalues.  The  largest  of  the  absolute  values  of  the 
eigenvalues  of  A is  called  the  spectral  radius  of  A,  a name  to  be  motivated  later. 


How  to  Find  Eigenvalues  and  Eigenvectors 

Now,  with  the  new  terminology  for  (1),  we  can  just  say  that  the  problem  of  determining 
the  eigenvalues  and  eigenvectors  of  a matrix  is  called  an  eigenvalue  problem.  (However, 
more  precisely,  we  are  considering  an  algebraic  eigenvalue  problem,  as  opposed  to  an 
eigenvalue  problem  involving  an  ODE  or  PDE,  as  considered  in  Secs.  11.5  and  12.3,  or 
an  integral  equation.) 

Eigenvalues  have  a very  large  number  of  applications  in  diverse  fields  such  as  in 
engineering,  geometry,  physics,  mathematics,  biology,  environmental  science,  economics, 
psychology,  and  other  areas.  You  will  encounter  applications  for  elastic  membranes, 
Markov  processes,  population  models,  and  others  in  this  chapter. 

Since,  from  the  viewpoint  of  engineering  applications,  eigenvalue  problems  are  the  most 
important  problems  in  connection  with  matrices,  the  student  should  carefully  follow  our 
discussion. 

Example  1 demonstrates  how  to  systematically  solve  a simple  eigenvalue  problem. 


Determination  of  Eigenvalues  and  Eigenvectors 

We  illustrate  all  the  steps  in  terms  of  the  matrix 


A = 


-5 

2 


2 

-2 


Solution,  (a)  Eigenvalues.  These  must  be  determined  first.  Equation  (1)  is 


-5  2 

*i 

*1 

—5xi  + 2^2  = Axi 

Ax  = 

<N  ' 
1 

CN 

.*2. 

= A 

*2. 

; in  components, 

2xi  — 2x2  = Ax‘2 

Transferring  the  terms  on  the  right  to  the  left,  we  get 

(—5  — A)*!  + 2x2  = 0 


(2*) 


2*!  + (—2  — A)x2  = 0. 


This  can  be  written  in  matrix  notation 


(3*) 


(A  — AI)x  = 0 


because  (1)  is  Ax  — Ax  = Ax  — AIx  = (A  — AI)x  = 0,  which  gives  (3*).  We  see  that  this  is  a homogeneous 
linear  system.  By  Cramer’s  theorem  in  Sec.  7.7  it  has  a nontrivial  solution  x A 0 (an  eigenvector  of  A we  are 
looking  for)  if  and  only  if  its  coefficient  determinant  is  zero,  that  is. 


(4*)  D( A)  = det(A  - AI)  = 


-5  - A 
2 


2 

-2  - A 


= (-5  - A)(— 2 — A)  — 4 = A2  + 7A  + 6 = 0. 
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We  call  Z)(A)  the  characteristic  determinant  or,  if  expanded,  the  characteristic  polynomial,  and  D(A)  = 0 
the  characteristic  equation  of  A.  The  solutions  of  this  quadratic  equation  are  Aj.  = — 1 and  A2  = —6.  These 
are  the  eigenvalues  of  A. 

(bi)  Eigenvector  of  A corresponding  to  Ai.  This  vector  is  obtained  from  (2*)  with  A = Ai  = — 1,  that  is, 


— 4a'i  + 2*2  — 0 
2a  1 — a 2 = 0. 

A solution  is  *2  = 2a  1,  as  we  see  from  either  of  the  two  equations,  so  that  we  need  only  one  of  them.  This 
determines  an  eigenvector  corresponding  to  Ai  = —1  up  to  a scalar  multiple.  If  we  choose  Ai  = 1,  we  obtain 
the  eigenvector 


1 

, Check: 

Axi  = 

-5 

2" 

1 

= 

-1 

2 

2 

-2 

2 

-2 

= (-l)xi  = A1X1. 


(b2)  Eigenvector  of  A corresponding  to  A2.  For  A = A2  = —6,  equation  (2*)  becomes 

Ai  + 2A2  — 0 

2ai  + 4a2  = 0. 


A solution  is  A2  = — ai/2  with  arbitrary  x±.  If  we  choose  Ai  = 2,  we  get  A2  — — 1.  Thus  an  eigenvector  of  A 
corresponding  to  A2  = —6  is 


2 

, Check: 

Ax2  = 

'-5 

2" 

2 

= 

-12 

-1 

2 

-2 

-1 

6 

= (— 6)x2  = A2x2. 


For  the  matrix  in  the  intuitive  opening  example  at  the  start  of  Sec.  8.1,  the  characteristic  equation  is 
A2  — 13A  + 30  = (A  — 10)(A  — 3)  = 0.  The  eigenvalues  are  {10,  3}.  Corresponding  eigenvectors  are 
[3  4]t  and[—  1 1]T  , respectively.  The  reader  may  want  to  verify  this. 

This  example  illustrates  the  general  case  as  follows.  Equation  (1)  written  in  components  is 

011*1  + • • • + ci\nxn  ~ A*i 
^21xl  T • • ■ T"  Cl2nxn  A* 2 


T annxn  A Xn. 


& nlx  1 T 

Transferring  the  terms  on  the  right  side  to  the  left  side,  we  have 


(2) 


(an  - A)xi 

+ &12X2  + ’ ' 

alnxn 

= 0 

+ («22  _ h)x2  + ‘ ' 

T &2  nxn 

= 0 

^nlxl 

+ an2X2  + ■ 1 

+ ( ann  ~ A )xn 

= 0. 

In  matrix  notation, 


0) 


(A  - AI)x  = 0. 
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By  Cramer’s  theorem  in  Sec.  7.7,  this  homogeneous  linear  system  of  equations  has  a 
nontrivial  solution  if  and  only  if  the  corresponding  determinant  of  the  coefficients  is  zero: 


(4) 


D(A) 


det(A  - AI) 


ci\\  A 

a12 

aln 

a21 

®22  — A 

a2  n 

^nl 

an2 

&nn 

A — AI  is  called  the  characteristic  matrix  and  D ( A)  the  characteristic  determinant  of 
A.  Equation  (4)  is  called  the  characteristic  equation  of  A.  By  developing  D( A)  we  obtain 
a polynomial  of  nth  degree  in  A.  This  is  called  the  characteristic  polynomial  of  A. 

This  proves  the  following  important  theorem. 


Eigenvalues 

The  eigenvalues  of  a square  matrix  A are  the  roots  of  the  characteristic  equation 
(4)  of  A. 

Hence  an  n X n matrix  has  at  least  one  eigenvalue  and  at  most  n numerically 
different  eigenvalues. 


For  larger  n,  the  actual  computation  of  eigenvalues  will,  in  general,  require  the  use 
of  Newton’s  method  (Sec.  19.2)  or  another  numeric  approximation  method  in  Secs. 
20.7-20.9. 

The  eigenvalues  must  be  determined  first.  Once  these  are  known,  corresponding 
eigen  vectors  are  obtained  from  the  system  (2),  for  instance,  by  the  Gauss  elimination, 
where  A is  the  eigenvalue  for  which  an  eigenvector  is  wanted.  This  is  what  we  did  in 
Example  1 and  shall  do  again  in  the  examples  below.  (To  prevent  misunderstandings: 
numeric  approximation  methods,  such  as  in  Sec.  20.8,  may  determine  eigen  vectors  first.) 

Eigenvectors  have  the  following  properties. 


Eigenvectors,  Eigenspace 

If  w and  x are  eigenvectors  of  a matrix  A corresponding  to  the  same  eigenvalue  A, 
so  are  w + x (provided  x T — vv)  and  kxfor  any  k T 0. 

Hence  the  eigenvectors  corresponding  to  one  and  the  same  eigenvalue  A of  A, 
together  with  0,  form  a vector  space  (cf.  Sec.  7.4),  called  the  eigenspace  of  A 
corresponding  to  that  A. 


Aw  = Aw  and  Ax  = Ax  imply  A(w  + x)  = Aw  + Ax  = Aw  + Ax  = A(w  + x)  and 
A(&w)  = k{ Aw)  = k( Aw)  = A(£w);  hence  A(£w  + fx)  = A(Aw  + €x). 

In  particular,  an  eigenvector  x is  determined  only  up  to  a constant  factor.  Hence  we 
can  normalize  x,  that  is,  multiply  it  by  a scalar  to  get  a unit  vector  (see  Sec.  7.9).  For 
instance,  Xi  = [1  2]T  in  Example  1 has  the  length  ||xi||  = k/l2  + 22  = V5;  hence 

[1/V5  2/V5]T  is  a normalized  eigenvector  (a  unit  eigenvector). 
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Examples  2 and  3 will  illustrate  that  an  n X n matrix  may  have  n linearly  independent 
eigenvectors,  or  it  may  have  fewer  than  n.  In  Example  4 we  shall  see  that  a real  matrix 
may  have  complex  eigenvalues  and  eigenvectors. 


Multiple  Eigenvalues 

Find  the  eigenvalues  and  eigenvectors  of 


-2 


A = 2 


2 -3 

1 -6  . 


-1  -2  0 


Solution.  For  our  matrix,  the  characteristic  determinant  gives  the  characteristic  equation 

-A3  - A2  + 21A  + 45  = 0. 

The  roots  (eigenvalues  of  A)  are  Ai  = 5,  A2  = A3  = —3.  (If  you  have  trouble  finding  roots,  you  may  want  to 
use  a root  finding  algorithm  such  as  Newton’s  method  (Sec.  19.2).  Your  CAS  or  scientific  calculator  can  find 
roots.  However,  to  really  learn  and  remember  this  material,  you  have  to  do  some  exercises  with  paper  and  pencil.) 
To  find  eigenvectors,  we  apply  the  Gauss  elimination  (Sec.  7.3)  to  the  system  (A  — AI)x  = 0,  first  with  A = 5 
and  then  with  A = — 3.  For  A = 5 the  characteristic  matrix  is 


-7 

2 

-3' 

-7 

2 

-3 

A - AI  = A - 51  = 

2 

-4 

-6 

It  row-reduces  to 

0 

24 

7 

05  li^ 
1 

-1 

-2 

—5 

0 

0 

0 

Hence  it  has  rank  2.  Choosing  X3  = —1  we  have  *2  — 2 from  — ^X2  — = 0 and  then  x\  = 1 from 

— 7*1  + 2^2  — 3^3  = 0.  Hence  an  eigenvector  of  A corresponding  to  A = 5 is  Xj_  = [1  2 — 1]. 

For  A = — 3 the  characteristic  matrix 


1 

2 

-3_ 

1 

2 

-3’ 

A - AI  = A + 31  = 

2 

4 

-6 

row-reduces  to 

0 

0 

0 

-1 

-2 

3 

0 

0 

0 

Hence  it  has  rank  1.  From  xi  + 2x2  — 3x3  = 0 we  have  xi  = —2x2  + 3x3.  Choosing  X2  — 1,  X3  = 0 and 
X2  — 0,  X3  = 1,  we  obtain  two  linearly  independent  eigenvectors  of  A corresponding  to  A = —3  [as  they  must 
exist  by  (5),  Sec.  7.5,  with  rank  = 1 and  n = 3], 


-2 


x2  = 


1 


and 


x,3  = 


The  order  MA  of  an  eigenvalue  A as  a root  of  the  characteristic  polynomial  is  called  the 
algebraic  multiplicity  of  A.  The  number  mA  of  linearly  independent  eigenvectors 
corresponding  to  A is  called  the  geometric  multiplicity  of  A.  Thus  raA  is  the  dimension 
of  the  eigenspace  corresponding  to  this  A. 
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Since  the  characteristic  polynomial  has  degree  n,  the  sum  of  all  the  algebraic 
multiplicities  must  equal  n.  In  Example  2 for  A = —3  we  have  mk  = M k = 2.  In  general, 
mA  Ma,  as  can  be  shown.  The  difference  Aa  = MA  — m A is  called  the  defect  of  A. 
Thus  A_3  = 0 in  Example  2,  but  positive  defects  Aa  can  easily  occur: 

Algebraic  Multiplicity,  Geometric  Multiplicity.  Positive  Defect 

The  characteristic  equation  of  the  matrix 


0 

1 

-A  1 

is 

det  (A  — AI)  = 

0 

0 

-< 

O 

Hence  A = 0 is  an  eigenvalue  of  algebraic  multiplicity  Mq  = 2.  But  its  geometric  multiplicity  is  only  m o — 1, 
since  eigenvectors  result  from  — (ki  + *2  = 0,  hence  X2  — 0,  in  the  form  [xi  0].  Hence  for  A = 0 the  defect 
is  Aq  = 1. 

Similarly,  the  characteristic  equation  of  the  matrix 


3 2 

3 - A 2 

A = 

is 

det  (A  — AI)  = 

0 3. 

0 3 - A 

Hence  A = 3 is  an  eigenvalue  of  algebraic  multiplicity  M3  = 2,  but  its  geometric  multiplicity  is  only  m3  = 1, 
since  eigenvectors  result  from  Oxi  + 2^2  — 0 in  the  form  [x±  0]T. 

Real  Matrices  with  Complex  Eigenvalues  and  Eigenvectors 

Since  real  polynomials  may  have  complex  roots  (which  then  occur  in  conjugate  pairs),  a real  matrix  may  have 
complex  eigenvalues  and  eigenvectors.  For  instance,  the  characteristic  equation  of  the  skew- symmetric  matrix 


0 

T 

-A  1 

is 

det  (A  — AI)  = 

-1 

0 

-1  -A 

It  gives  the  eigenvalues  Ai  = i (=  V—  1),  A2  = — i.  Eigenvectors  are  obtained  from  —ix  1 + *2  = 0 and 
ix  1 + *2  = 0,  respectively,  and  we  can  choose  X\  = 1 to  get 


In  the  next  section  we  shall  need  the  following  simple  theorem. 


Eigenvalues  of  the  Transpose 

The  transpose  AT  of  a square  matrix  A has  the  same  eigenvalues  as  A. 


Transposition  does  not  change  the  value  of  the  characteristic  determinant,  as  follows  from 
Theorem  2d  in  Sec.  7.7. 

Having  gained  a first  impression  of  matrix  eigenvalue  problems,  we  shall  illustrate  their 
importance  with  some  typical  applications  in  Sec.  8.2. 
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1-16 1 EIGENVALUES,  EIGENVECTORS 

Find  the  eigenvalues.  Find  the  corresponding  eigenvectors. 
Use  the  given  A or  factor  in  Probs.  11  and  15. 


1. 


3. 


5. 


3.0  0 

0 -0.6 

5 -2 

9 -6 


0 

-3 


2. 


4. 


15. 


16. 


1 2 

6. 

0 3 

17- 

20 

1 

0 

12 

0 

0 

-1 

0 

12 

0 

0 

-1 

-4 

0 

0 

-4 

-1 

3 

0 

4 

2 

0 

1 

-2 

4 

2 

4 

-1 

-2 

0 

2 

-2 

3 

(A  + If 


LINEAR  TRANSFORMATIONS 
AND  EIGENVALUES 


7. 


9. 


0 

1 

8. 

a 

b 

0 

0 

-b 

a 

0.8 

-0.6' 

cos  8 

— sin  8 

10. 

0.6 

0.8 

sin  8 

cos  8 

11. 


2 -2 
5 0 

0 7 


A = 3 


Find  the  matrix  A in  the  linear  transformation  y = Ax, 
where  x = |xj  x2]T  (x  = Ui  x2  *3]T)  are  Cartesian 
coordinates.  Find  the  eigenvalues  and  eigenvectors  and 
explain  their  geometric  meaning. 

17.  Counterclockwise  rotation  through  the  angle  7t/2  about 
the  origin  in  R2. 

18.  Reflection  about  the  x^-axis  in  R2. 

19.  Orthogonal  projection  (perpendicular  projection)  of  R2 
onto  the  x2-axis. 

20.  Orthogonal  projection  of  R 3 onto  the  plane  x2  = X\. 


21-25 


GENERAL  PROBLEMS 


12. 


3 

0 

0 


5 

4 

0 


3 

6 

1 


14. 


2 

0 

1 


0 -1 

h o 

0 4 


21.  Nonzero  defect.  Find  further  2X2  and  3X3 
matrices  with  positive  defect.  See  Example  3. 

22.  Multiple  eigenvalues.  Find  further  2X2  and  3X3 
matrices  with  multiple  eigenvalues.  See  Example  2. 

23.  Complex  eigenvalues.  Show  that  the  eigenvalues  of  a 
real  matrix  are  real  or  complex  conjugate  in  pairs. 

24.  Inverse  matrix.  Show  that  A-1  exists  if  and  only  if 
the  eigenvalues  Ai,  ■ ■ • , Xn  are  all  nonzero,  and  then 
A-1  has  the  eigenvalues  1/Ai,  ■ ■ ■ , 1/An. 

25.  Transpose.  Illustrate  Theorem  3 with  examples  of  your 
own. 


8.j  Some  Applications  of  Eigenvalue  Problems 

We  have  selected  some  typical  examples  from  the  wide  range  of  applications  of  matrix 
eigenvalue  problems.  The  last  example,  that  is.  Example  4,  shows  an  application  involving 
vibrating  springs  and  ODEs.  It  falls  into  the  domain  of  Chapter  4,  which  covers  matrix 
eigenvalue  problems  related  to  ODE’s  modeling  mechanical  systems  and  electrical 
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networks.  Example  4 is  included  to  keep  our  discussion  independent  of  Chapter  4. 
(However,  the  reader  not  interested  in  ODEs  may  want  to  skip  Example  4 without  loss 
of  continuity.) 

Stretching  of  an  Elastic  Membrane 

An  elastic  membrane  in  the  x1x2-plane  with  boundary  circle  a?  + x|  = 1 (Fig.  160)  is  stretched  so  that  a point 
P:  (x  ] . A' 2)  goes  over  into  the  point  Q:  (yj,  y2)  given  by 


y< 

'5  3" 

*i 

yi  = 5a  ! + 3x  2 

(i)  y = 

= Ax  = 

; in  components. 

.yz. 

3 5 

*2. 

y2  = 3aa  + 5x  2 

Find  the  principal  directions,  that  is,  the  directions  of  the  position  vector  x of  P for  which  the  direction  of  the 
position  vector  y of  Q is  the  same  or  exactly  opposite.  What  shape  does  the  boundary  circle  take  under  this 
deformation? 

Solution.  We  are  looking  for  vectors  x such  that  y = Ax.  Since  y = Ax,  this  gives  Ax  = Ax,  the  equation 
of  an  eigenvalue  problem.  In  components.  Ax  = Ax  is 


(2) 


5ai  + 3x2  = Aai 

or 

3xj  + 5a2  = Aa2 


(5  — A)xj  + 3.v2  = 0 

3x  x + (5  — A)a2  = 0. 


The  characteristic  equation  is 


(3) 


5 - A 
3 


3 

5 - A 


= (5  - A)2  - 9 = 0. 


Its  solutions  are  Ai  = 8 and  A2  = 2.  These  are  the  eigenvalues  of  our  problem.  For  A = Ai  = 8,  our  system  (2) 
becomes 


— 3*i  + 3*2  = 0, 


Solution  *2  — *1,  *1  arbitrary, 


3*i  — 3x2  = 0. 


for  instance,  jc  1 = = 1. 


For  A2  = 2,  our  system  (2)  becomes 


3*1  + 3*2  — 0,  Solution  x^  — — x\,  x\  arbitrary, 

3*1  + 3*2  — 0.  for  instance,  x 1 = \,x%  — — 1. 

We  thus  obtain  as  eigenvectors  of  A,  for  instance,  [1  1]T  corresponding  to  Ai  and  [1  — 1]T  corresponding  to 

A2  (or  a nonzero  scalar  multiple  of  these).  These  vectors  make  45°  and  135°  angles  with  the  positive  j^-direction. 
They  give  the  principal  directions,  the  answer  to  our  problem.  The  eigenvalues  show  that  in  the  principal 
directions  the  membrane  is  stretched  by  factors  8 and  2,  respectively;  see  Fig.  160. 

Accordingly,  if  we  choose  the  principal  directions  as  directions  of  a new  Cartesian  wiM2"coordinate  system, 
say,  with  the  positive  wi-semi-axis  in  the  first  quadrant  and  the  positive  «2"semi"axis  in  the  second  quadrant  of 
the  xix 2-system,  and  if  we  set  u 1 = rcos  </>,  U2  = a*  sin  </>,  then  a boundary  point  of  the  unstretched  circular 
membrane  has  coordinates  cos  </>,  sin  </>.  Hence,  after  the  stretch  we  have 


Z\  = 8 cos  (j>,  Z2  ~ 2 sin  </>. 

Since  cos2  <f>  + sin2  </>=!,  this  shows  that  the  deformed  boundary  is  an  ellipse  (Fig.  160) 


(4) 


^2  2 

Zl  z 2 


= 1. 
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EXAMPLE  2 


EXAMPLE  3 


Fig.  160.  Undeformed  and  deformed  membrane  in  Example  1 

Eigenvalue  Problems  Arising  from  Markov  Processes 

Markov  processes  as  considered  in  Example  13  of  Sec.  7.2  lead  to  eigenvalue  problems  if  we  ask  for  the  limit 
state  of  the  process  in  which  the  state  vector  x is  reproduced  under  the  multiplication  by  the  stochastic  matrix 
A governing  the  process,  that  is.  Ax  = x.  Hence  A should  have  the  eigenvalue  1 , and  x should  be  a corresponding 
eigenvector.  This  is  of  practical  interest  because  it  shows  the  long-term  tendency  of  the  development  modeled 
by  the  process. 

In  that  example. 


0.7  0.1  0 

0.7  0.2  0.1 

1 

1 

0.2  0.9  0.2 

For  the  transpose, 

0.1  0.9  0 

1 

= 

1 

1 

© 

O 

O 

OO 

1 

0 0.2  0.8 

1 

1 

Hence  AT  has  the  eigenvalue  1,  and  the  same  is  true  for  A by  Theorem  3 in  Sec.  8.1.  An  eigenvector  x of  A 
for  A = 1 is  obtained  from 


—0.3 

0.1 

0 

3 

10 

1 

10 

0 

A - I = 

0.2 

-0.1 

0.2 

, row-reduced  to 

0 

1 

30 

1 

5 

0.1 

0 

-0.2 

0 

0 

0 

Taking  = 1,  we  get  X2  — 6 from  —*2/30  + *3/5  = 0 and  then  x±  = 2 from  — 3*i/10  + X2/IO  = 0-  This 
gives  x = [2  6 1]T.  It  means  that  in  the  long  run,  the  ratio  Commercial  industrial: Residential  will  approach 

2:6:1,  provided  that  the  probabilities  given  by  A remain  (about)  the  same.  (We  switched  to  ordinary  fractions 
to  avoid  rounding  errors.) 

Eigenvalue  Problems  Arising  from  Population  Models.  Leslie  Model 

The  Leslie  model  describes  age-specified  population  growth,  as  follows.  Let  the  oldest  age  attained  by  the 
females  in  some  animal  population  be  9 years.  Divide  the  population  into  three  age  classes  of  3 years  each.  Let 
the  “ Leslie  matrix ” be 


(5) 


0 2.3  0.4 


L = [ljk\  - 0.6  0 0 


0 0.3  0 


where  lik  is  the  average  number  of  daughters  bom  to  a single  female  during  the  time  she  is  in  age  class  k,  and 
lj  j—i(J  = 2,  3)  is  the  fraction  of  females  in  age  class  j — 1 that  will  survive  and  pass  into  class  j.  (a)  What  is  the 
number  of  females  in  each  class  after  3,  6,  9 years  if  each  class  initially  consists  of  400  females?  (b)  For  what  initial 
distribution  will  the  number  of  females  in  each  class  change  by  the  same  proportion?  What  is  this  rate  of  change? 
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Solution. 


(a)  Initially,  x^0)  = [400  400 


XC3)  — Lx(o)  — 


0 

0.6 

0 


400],  After  3 years, 


2.3 

0.4~ 

~400_ 

1 080_ 

0 

0 

400 

= 

240 

0.3 

0 

400 

120 

Similarly,  after  6 years  the  number  of  females  in  each  class  is  given  by  x^>  = (Lx(3))T  = [600  648  72],  and 

after  9 years  we  have  x^>  = (Lx(6))T  = [1519.2  360  194.4], 

(b)  Proportional  change  means  that  we  are  looking  for  a distribution  vector  x such  that  Lx  = Ax,  where  A is 
the  rate  of  change  (growth  if  A > 1,  decrease  if  A < 1).  The  characteristic  equation  is  (develop  the  characteristic 
determinant  by  the  first  column) 


det  (L  - AI)  = -A3  - 0.6(— 2.3A  - 0.3  • 0.4)  = -A3  + 1.38A  + 0.072  = 0. 


A positive  root  is  found  to  be  (for  instance,  by  Newton’s  method,  Sec.  19.2)  A = 1 .2.  A corresponding  eigenvector 
x can  be  determined  from  the  characteristic  matrix 


-1.2 

2.3 

0.4 

1 

A - 1.21  = 

0.6 

-1.2 

0 

, say,  x = 

0.5 

0 

0.3 

-1.2 

_0125_ 

where  x 3 = 0.125  is  chosen,  X2  = 0-5  then  follows  from  0.3^2  — 1-2x3  = 0,  and  xi  = 1 from 
— 1.2xi  + 2.3x2  + 0.4x3  = 0.  To  get  an  initial  population  of  1200  as  before,  we  multiply  x by 
1200/(1  + 0.5  + 0.125)  = 738.  Answer:  Proportional  growth  of  the  numbers  of  females  in  the  three  classes 
will  occur  if  the  initial  values  are  738,  369,  92  in  classes  1,  2,  3,  respectively.  The  growth  rate  will  be  1.2  per 
3 years. 


Vibrating  System  of  Two  Masses  on  Two  Springs  (Fig.  161) 

Mass-spring  systems  involving  several  masses  and  springs  can  be  treated  as  eigenvalue  problems.  For  instance, 
the  mechanical  system  in  Fig.  161  is  governed  by  the  system  of  ODEs 


(6) 


y i = -3yi  - 2(yi  - y2)  = -5yi  + 2y2 


y'i  = -2(y2  - yi)  = 1y\  - 2y2 


where  and  y 2 are  the  displacements  of  the  masses  from  rest,  as  shown  in  the  figure,  and  primes  denote 

derivatives  with  respect  to  time  t.  In  vector  form,  this  becomes 


n ' 

yi 

-5  2" 

yi 

(7) 

y"  = 

y’i . 

= Ay  = 

2 -2 

yi. 

System  in 
static 

equilibrium 


System  in 
motion 


Fig.  161.  Masses  on  springs  in  Example  4 
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We  try  a vector  solution  of  the  form 

(8)  y = xewt. 

This  is  suggested  by  a mechanical  system  of  a single  mass  on  a spring  (Sec.  2.4),  whose  motion  is  given  by 
exponential  functions  (and  sines  and  cosines).  Substitution  into  (7)  gives 

<o2xewt  = Axe^. 

Dividing  by  i’"‘l  and  writing  co2  = A,  we  see  that  our  mechanical  system  leads  to  the  eigenvalue  problem 

(9)  Ax  = Ax  where  A = w2. 

From  Example  1 in  Sec.  8.1  we  see  that  A has  the  eigenvalues  Ai  = — 1 and  A2  = —6.  Consequently, 
(t>  = ±V=T  = ±i  and  V— 6 = ±/V 6,  respectively.  Corresponding  eigenvectors  are 


Y 

2 

(10) 

Xl  = 

2 

and 

X2  = 

-1 

From  (8)  we  thus  obtain  the  four  complex  solutions  [see  (10),  Sec.  2.2] 

X\e±l  = Xi(cos  t ± i sin  t ), 
x2e±i^t  = X2 (cos  V6  t ± i sin  V6 t). 

By  addition  and  subtraction  (see  Sec.  2.2)  we  get  the  four  real  solutions 

Xi  cos  t,  Xi  sin  t,  x2  cos  V6  t,  x2  sin  V6  t. 

A general  solution  is  obtained  by  taking  a linear  combination  of  these, 

y = Xi  ( ai  cos  t + sin  t)  + X2  ( a2  cos  V6  t + b2  sin  V6  t ) 

with  arbitrary  constants  a\,  b\,  a2,  b2  (to  which  values  can  be  assigned  by  prescribing  initial  displacement  and 
initial  velocity  of  each  of  the  two  masses).  By  (10),  the  components  of  y are 

yi  = ci\  cos  t + bi  sin  t + 2a2  cos  VfT t + 2b2  sin  V6  t 
y2  = 2ai  cos  t + 2Z?i  sin  t — a2  cos  V6  t — sin  V6  t. 

These  functions  describe  harmonic  oscillations  of  the  two  masses.  Physically,  this  had  to  be  expected  because 
we  have  neglected  damping. 


P R QBr  E M S E T"fi  7 
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ELASTIC  DEFORMATIONS 


Given  A in  a deformation  y 
directions  and  corresponding 
contraction.  Show  the  details. 


3.0 

1.5 

1.5 

3.0_ 

7 

V6 

V6 

2 

= Ax, 

find  the  principal 

factors 

of  extension  or 

"2.0 

0.4" 

0.4 

2.0 

5 2 

2 13 

1.25  0.75 
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MARKOV  PROCESSES 


Find  the  limit  state  of  the  Markov  process  modeled  by  the 
given  matrix.  Show  the  details. 


0.5 


0.8  0.5 


"0.4 

0.3 

0.3" 

"0.6 

0.1 

0.2~ 

8. 

0.3 

0.6 

0.1 

9. 

0.4 

0.1 

0.4 

0.3 

0.1 

0.6 

0 

0.8 

0.4 

5. 


1 


6. 


0.75 


1.25 
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Find  the  growth  rate  in  the  Leslie  model  (see  Example  3) 
with  the  matrix  as  given.  Show  the  details. 


10. 


12. 


0 

0.4 

0 

0 

0.5 

0 

0 


9.0 

0 

0.4 

3.0 

0 

0.5 

0 


5.0 

0 

0 

2.0 
0 

0 

0.1 


11. 


0 

0.90 

0 


2.0 

0 

0 

0 


3.45 

0 

0.45 


0.60 

0 

0 
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LEONTIEF  MODELS1 


13.  Leontief  input-output  model.  Suppose  that  three 
industries  are  interrelated  so  that  their  outputs  are  used 
as  inputs  by  themselves,  according  to  the  3X3 

consumption  matrix 


0.1 

0.5 

0 

0.8 

0 

0.4 

0.1 

0.5 

0.6 

where  is  the  fraction  of  the  output  of  industry  k 
consumed  (purchased)  by  industry  j.  Let  pj  be  the  price 
charged  by  industry  j for  its  total  output.  A problem  is 
to  find  prices  so  that  for  each  industry,  total 
expenditures  equal  total  income.  Show  that  this  leads 
to  Ap  = p,  where  p = [pj  p2  P3]T,  and  find  a 
solution  p with  nonnegative  p\,  p2,  P3- 

14.  Show  that  a consumption  matrix  as  considered  in  Prob. 
13  must  have  column  sums  1 and  always  has  the 
eigenvalue  1. 

15.  Open  Leontief  input-output  model.  If  not  the  whole 
output  but  only  a portion  of  it  is  consumed  by  the 


industries  themselves,  then  instead  of  Ax  = x (as  in  Prob. 
13),  we  have  x — Ax  = y,  where  x = [x\  x2  *3]T 
is  produced,  Ax  is  consumed  by  the  industries,  and,  thus, 
y is  the  net  production  available  for  other  consumers. 
Find  for  what  production  x a given  demand  vector 
y = [0.1  0.3  0.1  ]T  can  be  achieved  if  the  consump- 

tion matrix  is 


0.1 

0.4 

0.2 

0.5 

0 

0.1 

0.1 

0.4 

0.4 
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GENERAL  PROPERTIES  OF  EIGENVALUE 
PROBLEMS 


Let  A = be  an  n X n matrix  with  (not  necessarily 

distinct)  eigenvalues  A,,  • ■ ■ , \n.  Show. 

16.  Trace.  The  sum  of  the  main  diagonal  entries,  called 
the  trace  of  A,  equals  the  sum  of  the  eigenvalues  of  A. 

17.  “Spectral  shift.”  A — kl  has  the  eigenvalues 
Ai  — k,  ■ ■ ■ , \n  — k and  the  same  eigenvectors  as  A. 

18.  Scalar  multiples,  powers.  kA  has  the  eigenvalues 
k\i,  ■ ■ ■ , kXn.  A m(in  = 1,  2,  ■ • • ) has  the  eigenvalues 
A™,  ■ ■ ■ , A™.  The  eigenvectors  are  those  of  A. 

19.  Spectral  mapping  theorem.  The  “polynomial 
matrix” 


p( A)  — kmAn  + A'm_]Am  ^ • + k-\ A + AqI 


has  the  eigenvalues 

p(\j)  = km  A™  + km_!\jl  1 + • ■ • + kiXj  + k0 

where  j = 1,  ■ ■ • , n,  and  the  same  eigenvectors  as  A. 
20.  Perron’s  theorem.  A Leslie  matrix  L with  positive 
/i2,  / 13, 1 2i,  / 32  has  a positive  eigenvalue.  (This  is  a 
special  case  of  the  Perron-Frobenius  theorem  in  Sec. 
20.7,  which  is  difficult  to  prove  in  its  general  form.) 


8.:  Symmetric,  Skew-Symmetric, 
and  Orthogonal  Matrices 

We  consider  three  classes  of  real  square  matrices  that,  because  of  their  remarkable 
properties,  occur  quite  frequently  in  applications.  The  first  two  matrices  have  already  been 
mentioned  in  Sec.  7.2.  The  goal  of  Sec.  8.3  is  to  show  their  remarkable  properties. 


1WASSILY  LEONTIEF  (1906-1999).  American  economist  at  New  York  University.  For  his  input-output 
analysis  he  was  awarded  the  Nobel  Prize  in  1973. 
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DEFINITIONS 


EXAMPLE  1 


EXAMPLE  2 


THEOREM  1 


Symmetric,  Skew-Symmetric,  and  Orthogonal  Matrices 

A real  square  matrix  A = [a;fe]  is  called 

symmetric  if  transposition  leaves  it  unchanged, 

(1)  At  = A,  thus  afcj  = 

skew-symmetric  if  transposition  gives  the  negative  of  A, 

(2)  A1"  = —A,  thus  akj  = — 

orthogonal  if  transposition  gives  the  inverse  of  A, 

(3)  At  = A-1. 


Symmetric,  Skew-Symmetric,  and  Orthogonal  Matrices 

The  matrices 


~-3  1 5~ 

0 9 -12" 

2 1 2 

3 3 3 

1 0 -2 

» 

-9  0 20 

» 

2 2 1 

3 3 3 

5-2  4 

12  -20  0 

12  2 
3 3 3 

are  symmetric,  skew- symmetric,  and  orthogonal,  respectively,  as  you  should  verify.  Every  skew-symmetric 
matrix  has  all  main  diagonal  entries  zero.  (Can  you  prove  this?) 


Any  real  square  matrix  A may  be  written  as  the  sum  of  a symmetric  matrix  R and  a skew- 
symmetric  matrix  S,  where 

(4)  R = |(A  + At)  and  S = |(A  - AT). 

Illustration  of  Formula  (4) 


9 

5 

2 

~9.0 

3.5 

3.5" 

0 

1.5 

-1.5 

2 

3 

-8 

= R + S = 

3.5 

3.0 

-2.0 

+ 

-1.5 

0 

-6.0 

5 

4 

3 

3.5 

-2.0 

3.0 

1.5 

6.0 

0 

Eigenvalues  of  Symmetric  and  Skew-Symmetric  Matrices 

(a)  The  eigenvalues  of  a symmetric  matrix  are  real. 

(b)  The  eigenvalues  of  a skew-symmetric  matrix  are  pure  imaginary  or  zero. 


This  basic  theorem  (and  an  extension  of  it)  will  be  proved  in  Sec.  8.5. 
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THEOREM  2 


PROOF 


Eigenvalues  of  Symmetric  and  Skew-Symmetric  Matrices 

The  matrices  in  (1)  and  (7)  of  Sec.  8.2  are  symmetric  and  have  real  eigenvalues.  The  skew-symmetric  matrix 
in  Example  1 has  the  eigenvalues  0,  — 25 i,  and  25  r.  (Verify  this.)  The  following  matrix  has  the  real  eigenvalues 
1 and  5 but  is  not  symmetric.  Does  this  contradict  Theorem  1? 

'3  4" 

■ 

1 3 


Orthogonal  Transformations  and  Orthogonal  Matrices 

Orthogonal  transformations  are  transformations 


(5) 


y = Ax  where  A is  an  orthogonal  matrix. 


With  each  vector  x in  Rn  such  a transformation  assigns  a vector  y in  Rn.  For  instance, 
the  plane  rotation  through  an  angle  9 


(6) 


cos  9 

— sin  9 

Xl 

sin  9 

cos  9 

_*2_ 

is  an  orthogonal  transformation.  It  can  be  shown  that  any  orthogonal  transformation  in 
the  plane  or  in  three-dimensional  space  is  a rotation  (possibly  combined  with  a reflection 
in  a straight  line  or  a plane,  respectively). 

The  main  reason  for  the  importance  of  orthogonal  matrices  is  as  follows. 


Invariance  of  Inner  Product 

An  orthogonal  transformation  preserves  the  value  of  the  inner  product  of  vectors 
a and  b in  Rn,  defined  by 

(7)  a • b = aTb  = [cii  • • • an] 


That  is,  for  any  a and  b in  Rn,  orthogonal  n X n matrix  A,  and  u = Aa,  v = Ab 
we  have  u • v = a • b. 

Hence  the  transformation  also  preserves  the  length  or  norm  of  any  vector  a in 
Rn  given  by 


(8) 


= Va  • a = 


Let  A be  orthogonal.  Let  u = Aa  and  v = Ab.  We  must  show  that  u • v = a • b.  Now 
(Aa)T  = aTAT  by  (lOd)  in  Sec.  7.2  and  ATA  = A-1A  = I by  (3).  Hence 

(9)  u • v = uTv  = (Aa)TAb  = aTATAb  = aTIb  = aTb  = a • b. 

From  this  the  invariance  of  II  all  follows  if  we  set  b = a. 
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THEOREM  3 


PROOF 


THEOREM  4 


PROOF 


EXAMPLE  4 


THEOREM  5 


Orthogonal  matrices  have  further  interesting  properties  as  follows. 


Orthonormality  of  Column  and  Row  Vectors 

A real  square  matrix  is  orthogonal  if  and  only  if  its  column  vectors  ai,  • ■ • , an  ( and 
also  its  row  vectors ) form  an  orthonormal  system,  that  is, 

("0  if  j ¥=  k 

(10)  a,  • afc  = a/afc  = \ 

if  j = k. 


(a)  Let  A be  orthogonal.  Then  A XA  = ATA  = I.  In  terms  of  column  vectors  ai,  ■ ■ • , a„, 


a! 

alai 

ala2 

aian 

(11) 

I = A-1  A = AtA  = 

3n_ 

[ar 

’ a n\ 

_anal 

aIa2 

anan_ 

The  last  equality  implies  (10),  by  the  definition  of  the  n X n unit  matrix  I.  From  (3)  it 
follows  that  the  inverse  of  an  orthogonal  matrix  is  orthogonal  (see  CAS  Experiment  12). 
Now  the  column  vectors  of  A-1(=AT)  are  the  row  vectors  of  A.  Hence  the  row  vectors 
of  A also  form  an  orthonormal  system. 

(b)  Conversely,  if  the  column  vectors  of  A satisfy  (10),  the  off-diagonal  entries  in  (11) 
must  be  0 and  the  diagonal  entries  1.  Hence  ATA  = I,  as  (11)  shows.  Similarly,  AAT  = I. 
This  implies  AT  = A-1  because  also  A-1A  = AA-1  = I and  the  inverse  is  unique.  Hence 
A is  orthogonal.  Similarly  when  the  row  vectors  of  A form  an  orthonormal  system,  by 
what  has  been  said  at  the  end  of  part  (a). 


Determinant  of  an  Orthogonal  Matrix 

The  determinant  of  an  orthogonal  matrix  has  the  value  +1  or  — 1. 


From  det  AB  = det  A det  B (Sec.  7.8,  Theorem  4)  and  det  AT  = det  A (Sec.  7.7, 
Theorem  2d),  we  get  for  an  orthogonal  matrix 

1 = det  I = det(AA_1)  = det(AAT)  = det  A det  AT  = (det  A)2. 

Illustration  of  Theorems  3 and  4 

The  last  matrix  in  Example  1 and  the  matrix  in  (6)  illustrate  Theorems  3 and  4 because  their  determinants  are 
— 1 and  +1,  as  you  should  verify. 


Eigenvalues  of  an  Orthogonal  Matrix 

The  eigenvalues  of  an  orthogonal  matrix  A are  real  or  complex  conjugates  in  pairs 
and  have  absolute  value  1. 


338 


CHAP.  8 Linear  Algebra:  Matrix  Eigenvalue  Problems 


PROOF  The  first  part  of  the  statement  holds  for  any  real  matrix  A because  its  characteristic 
polynomial  has  real  coefficients,  so  that  its  zeros  (the  eigenvalues  of  A)  must  be  as 
indicated.  The  claim  that  | A | = 1 will  be  proved  in  Sec.  8.5. 

Eigenvalues  of  an  Orthogonal  Matrix 

The  orthogonal  matrix  in  Example  1 has  the  characteristic  equation 

-A3  + §A2  + |A  - 1 = 0. 

Now  one  of  the  eigenvalues  must  be  real  (why?),  hence  +1  or  —1.  Trying,  we  find  —1.  Division  by  A + 1 
gives  —(A2  — 5A/3  + 1)  = 0 and  the  two  eigenvalues  (5  + zVTT)/6  and  (5  — z'V TT)/6,  which  have  absolute 
value  1.  Verify  all  of  this. 


Looking  back  at  this  section,  you  will  find  that  the  numerous  basic  results  it  contains  have 
relatively  short,  straightforward  proofs.  This  is  typical  of  large  portions  of  matrix 
eigenvalue  theory. 


1-10 


SPECTRUM 


Are  the  following  matrices  symmetric,  skew-symmetric,  or 
orthogonal?  Find  the  spectrum  of  each,  thereby  illustrating 
Theorems  1 and  5.  Show  your  work  in  detail. 


0.8 

0.6 

-0.6 

0.8 

b 

a 


3. 


5. 


2 8 
-8  2 
6 0 0 
0 2-2 
0-2  5 


4. 


cos  8 
sin  8 


— sin  6 
cos  6 


6. 


k k 

a k 

k a 


(b)  Rotation.  Show  that  (6)  is  an  orthogonal  trans- 
formation. Verify  that  it  satisfies  Theorem  3.  Find  the 
inverse  transformation. 

(c)  Powers.  Write  a program  for  computing  powers 
A m {m  — 1,2,  ■■■)  of  a 2X2  matrix  A and  their 
spectra.  Apply  it  to  the  matrix  in  Prob.  1 (call  it  A).  To 
what  rotation  does  A correspond?  Do  the  eigenvalues 
of  Am  have  a limit  as  m — » °°? 

(d)  Compute  the  eigenvalues  of  (0.9A)m,  where  A is 
the  matrix  in  Prob.  1 . Plot  them  as  points.  What  is  their 
limit?  Along  what  kind  of  curve  do  these  points 
approach  the  limit? 

(e)  Find  A such  that  y = Ax  is  a counterclockwise 
rotation  through  30°  in  the  plane. 


0 

9 

-12 

1 

0 

0 

7. 

-9 

0 

20 

8. 

0 

cos  6 

- 

sin  8 

12 

-20 

0_ 

0 

sin  6 

cos  8 

0 

0 

1 

4 

9 

8 

9 

1 

9 

9. 

0 

1 

0 

10. 

7 

9 

4 

9 

4 

9 

-1 

0 

0 

4 

9 

1 

9 

8 

9_ 

11.  WRITING  PROJECT.  Section  Summary.  Sum- 
marize the  main  concepts  and  facts  in  this  section, 
giving  illustrative  examples  of  your  own. 

12.  CAS  EXPERIMENT.  Orthogonal  Matrices. 

(a)  Products.  Inverse.  Prove  that  the  product  of  two 
orthogonal  matrices  is  orthogonal,  and  so  is  the  inverse 
of  an  orthogonal  matrix.  What  does  this  mean  in  terms 
of  rotations? 


13-20 


GENERAL  PROPERTIES 


13.  Verification.  Verify  the  statements  in  Example  1. 

14.  Verify  the  statements  in  Examples  3 and  4. 

15.  Sum.  Are  the  eigenvalues  of  A + B sums  of  the 
eigenvalues  of  A and  of  B? 

16.  Orthogonality.  Prove  that  eigenvectors  of  a symmetric 
matrix  corresponding  to  different  eigenvalues  are 
orthogonal.  Give  examples. 

17.  Skew-symmetric  matrix.  Show  that  the  inverse  of  a 
skew-symmetric  matrix  is  skew-symmetric. 

18.  Do  there  exist  nonsingular  skew-symmetric  n X n 
matrices  with  odd  nl 


19.  Orthogonal  matrix.  Do  there  exist  skew-symmetric 
orthogonal  3X3  matrices? 

20.  Symmetric  matrix.  Do  there  exist  nondiagonal 
symmetric  3X3  matrices  that  are  orthogonal? 
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8^  Eigenbases.  Diagonalization. 

Quadratic  Forms 

So  far  we  have  emphasized  properties  of  eigenvalues.  We  now  turn  to  general  properties 
of  eigenvectors.  Eigenvectors  of  an  n X n matrix  A may  (or  may  not!)  form  a basis  for 
Rn.  If  we  are  interested  in  a transformation  y = Ax,  such  an  “eigenbasis”  (basis  of 
eigenvectors) — if  it  exists — is  of  great  advantage  because  then  we  can  represent  any  x in 
Rn  uniquely  as  a linear  combination  of  the  eigenvectors  xi,  ■ ■ • , xn,  say, 

X = CiX!  + C2X2  + ' ' • + CnXn. 

And,  denoting  the  corresponding  (not  necessarily  distinct)  eigenvalues  of  the  matrix  A by 
Ai,  • • • , An,  we  have  A Xj  = \jXj,  so  that  we  simply  obtain 

y = Ax  = A(ciXi  + • ■ ■ + cnxn) 

(1)  = ciAx!  + • • • + cnAxn 

cjA^Xi  4*  • ■ • 4~  cn\nxn. 


This  shows  that  we  have  decomposed  the  complicated  action  of  A on  an  arbitrary  vector 
x into  a sum  of  simple  actions  (multiplication  by  scalars)  on  the  eigenvectors  of  A.  This 
is  the  point  of  an  eigenbasis. 

Now  if  the  n eigenvalues  are  all  different,  we  do  obtain  a basis: 


THEOREM  1 


Basis  of  Eigenvectors 

If  an  n X n matrix  A has  n distinct  eigenvalues,  then  A has  a basis  of  eigenvectors 
xi,  • • • , xn  for  Rn. 


PROOF  All  we  have  to  show  is  that  Xi,  • • • , xn  are  linearly  independent.  Suppose  they  are  not.  Let 
r be  the  largest  integer  such  that  {xi,  • • • , xr}  is  a linearly  independent  set.  Then  r < n 
and  the  set  {xi,  • • • , xr,  xr+i}  is  linearly  dependent.  Thus  there  are  scalars  ci,  • ■ • , cr+ 1, 
not  all  zero,  such  that 

(2)  CiX!  + • • • + cr+1xr+1  = 0 

(see  Sec.  7.4).  Multiplying  both  sides  by  A and  using  A xj  = A .jxj,  we  obtain 

(3)  A(ciXi  + • • • + cr+ixr+i)  = ciA]Xi  + • • • + cr+iAr+iXr+i  = AO  = 0. 

To  get  rid  of  the  last  term,  we  subtract  Ar+i  times  (2)  from  this,  obtaining 

Ci(Ai  Ay-f-]_)xj_  * * ■ 4~  cr(Ar  Ar+i)xr  0. 

Here  ci(Ai  — Ar+1)  = 0,  • • • , cr( Ar  — Ar+i)  = 0 since  { x ] , • • • ,xr}  is  linearly  independent. 
Hence  c\=  • • • = cr  = 0,  since  all  the  eigenvalues  are  distinct.  But  with  this,  (2)  reduces  to 
cy+ixr+i  = 0,  hence  cy+i  = 0,  since  xr+i  # 0 (an  eigenvector!).  This  contradicts  the  fact 
that  not  all  scalars  in  (2)  are  zero.  Hence  the  conclusion  of  the  theorem  must  hold. 
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THEOREM  2 


EXAMPLE  2 


DEFINITION 


THEOREM  3 


Eigenbasis.  Nondistinct  Eigenvalues.  Nonexistence 


5 3' 

V 

T 

The  matrix  A = 

3 5 

has  a basis  of  eigenvectors 

t_ 

> 

-l 

corresponding  to  the  eigenvalues  A,  = 8, 


A2  = 2.  (See  Example  1 in  Sec.  8.2.) 

Even  if  not  all  n eigenvalues  are  different,  a matrix  A may  still  provide  an  eigenbasis  for  Rn.  See  Example  2 
in  Sec.  8.1.  where  n = 3. 

On  the  other  hand,  A may  not  have  enough  linearly  independent  eigenvectors  to  make  up  a basis.  For 
instance,  A in  Example  3 of  Sec.  8.1  is 


o T 

k 

A = 

° 0 

and  has  only  one  eigenvector 

0 

(k  A 0,  arbitrary). 


Actually,  eigenbases  exist  under  much  more  general  conditions  than  those  in  Theorem  1 . 
An  important  case  is  the  following. 


Symmetric  Matrices 

A symmetric  matrix  has  an  orthonormal  basis  of  eigenvectors  for  Rn  . 


For  a proof  (which  is  involved)  see  Ref.  [B3],  vol.  1,  pp.  270-272. 

Orthonormal  Basis  of  Eigenvectors 

The  first  matrix  in  Example  1 is  symmetric,  and  an  orthonormal  basis  of  eigenvectors  is  [1/V2  1/V2]T, 

[1/V2  -1/V2]t. 

Similarity  of  Matrices.  Diagonalization 

Eigenbases  also  play  a role  in  reducing  a matrix  A to  a diagonal  matrix  whose  entries  are 
the  eigenvalues  of  A.  This  is  done  by  a “similarity  transformation,”  which  is  defined  as 
follows  (and  will  have  various  applications  in  numerics  in  Chap.  20). 


Similar  Matrices.  Similarity  Transformation 

An  n X n matrix  A is  called  similar  to  an  n X n matrix  A if 

(4)  A = P_1AP 

for  some  (nonsingular!)  n X n matrix  P.  This  transformation,  which  gives  A from 
A,  is  called  a similarity  transformation. 


The  key  property  of  this  transformation  is  that  it  preserves  the  eigenvalues  of  A: 


Eigenvalues  and  Eigenvectors  of  Similar  Matrices 

If  A is  similar  to  A,  then  A has  the  same  eigenvalues  as  A. 

Furthermore,  ifx  is  an  eigenvector  of  A,  then  y = P-1x  is  an  eigen  vector  of  A 
corresponding  to  the  same  eigenvalue. 
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PROOF 


EXAMPLE  3 


THEOREM  4 


From  Ax  = Ax  (A  an  eigenvalue,  x =£  0)  we  get  P xAx  = AP  *x.  Now  I = PP  1.  By 
this  identity  trick  the  equation  P_1Ax  = AP_1x  gives 

P_1Ax  = P - 1 AIx  = P_1APP_1x  = (P_1AP)P_1x  = A(P_1x)  = AP_1x. 

Hence  A is  an  eigenvalue  of  A and  P_1x  a corresponding  eigenvector.  Indeed,  P_1x  # 0 
because  P_1x  = 0 would  give  x = lx  = PP_1x  = P0  = 0,  contradicting  x A 0. 


Eigenvalues  and  Vectors  of  Similar  Matrices 


6 

-3' 

1 

3' 

Let, 

A 

= 

and 

p = 

4 

-1 

1 

4 

-3" 

r 

6 

-3" 

1 

3" 

3 

Then 

A = 

-l 

1 

4 

-1 

1 

4 

— 

0 

Here  P 1 was  obtained  from  (4*)  in  Sec.  7.8  with  det  P = 1.  We  see  that  A has  the  eigenvalues  A]  = 3,  A2  = 2. 
The  characteristic  equation  of  A is  (6  — A)(— 1 — A)  + 12  = A2  — 5A  + 6 = 0.  It  has  the  roots  (the  eigenvalues 
of  A)  A j = 3,  A2  = 2,  confirming  the  first  part  of  Theorem  3. 

We  confirm  the  second  part.  From  the  first  component  of  (A  — AI)x  = 0 we  have  (6  — A)jci  — 3x2  = 0.  For 
A = 3 this  gives  3x1  — 3x2  = 0,  say,  xx  = [1  1]T.  For  A = 2 it  gives  4x1  — 3x2  = 0,  say,  x2  = [3  4]T.  In 

Theorem  3 we  thus  have 


4 

-3" 

Y 

_ 

1 

y2  = P xx2  = 

4 

-3" 

"3' 

_ 

0 

-1 

1 

1 

0 

-1 

1 

4 

1 

Indeed,  these  are  eigenvectors  of  the  diagonal  matrix  A. 

Perhaps  we  see  that  \|  and  x2  are  the  columns  of  P.  This  suggests  the  general  method  of  transforming  a 
matrix  A to  diagonal  form  D by  using  P = X,  the  matrix  with  eigenvectors  as  columns. 


By  a suitable  similarity  transformation  we  can  now  transform  a matrix  A to  a diagonal 
matrix  D whose  diagonal  entries  are  the  eigenvalues  of  A: 


Diagonalization  of  a Matrix 

If  an  n X n matrix  A has  a basis  of  eigenvectors,  then 

(5)  D = X_1AX 

is  diagonal,  with  the  eigenvalues  of  A.  as  the  entries  on  the  main  diagonal.  Here  X 
is  the  matrix  with  these  eigenvectors  as  column  vectors.  Also, 

(5*)  Dm  = X-1AmX  (m  = 2,  3,  • ■ ■ ). 
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Let  xj,  • ■ ■ , xn  be  a basis  of  eigenvectors  of  A for  K".  Let  the  corresponding  eigenvalues 
of  A be  Ai,  ■ ■ • , A„,  respectively,  so  that  Axj  = AjXj,  • • ■ , Axn  = \nxn.  Then 
X = [xj  ■ ■ ■ xn]  has  rank  n,  by  Theorem  3 in  Sec.  7.4.  Hence  X-1  exists  by  Theorem  1 
in  Sec.  7.8.  We  claim  that 


(6)  Ax  = A[x!  • • • xn]  = [Ax!  • ■ • Axn]  = [A^  • • • \nxn]  = XD 


where  D is  the  diagonal  matrix  as  in  (5).  The  fourth  equality  in  (6)  follows  by  direct 
calculation.  (Try  it  for  n = 2 and  then  for  general  n .)  The  third  equality  uses  Axj,  = Aj,x/,. 
The  second  equality  results  if  we  note  that  the  first  column  of  AX  is  A times  the  first 
column  of  X,  which  is  xi,  and  so  on.  For  instance,  when  n = 2 and  we  write 
xi  = [in  *2i],  x2  = [x12  x22],  we  have 


AX  = A[xi  x2] 


Oil  O12 

1 

* 

h-4 

* 

to 

1 

fl21  o22 

*21  *22 

aiixu  + a12x21 

021*11  + a22x21 

Column  1 


011*12  + 012*22 

fl21*12  + 022*22 

Column  2 


[AXj  Ax2]. 


If  we  multiply  (6)  by  X-1  from  the  left,  we  obtain  (5).  Since  (5)  is  a similarity 
transformation.  Theorem  3 implies  that  D has  the  same  eigenvalues  as  A.  Equation  (5*) 
follows  if  we  note  that 

D2  = DD  = (X_1AX)(X_1AX)  = X-1A(XX-1)AX  = X_1AAX  = X_1A2X,  etc.  ■ 


Diagonalization 

Diagonalize 


7.3 

0.2 

-3.7 

— 11.5 

1.0 

5.5 

17.7 

1.8 

-9.3 

Solution.  The  characteristic  determinant  gives  the  characteristic  equation  — A3  — A2  + 12A  = 0.  The  roots 
(eigenvalues  of  A)  are  Ai  = 3,  A2  = —4,  A3  = 0.  By  the  Gauss  elimination  applied  to  (A  — AI)x  = 0 with 
A = Ai,  A2,  A3  we  find  eigenvectors  and  then  X-1  by  the  Gauss-Jordan  elimination  (Sec.  7.8,  Example  1).  The 
results  are 


-1 

1 

2 

~-l  1 2" 

-0.7  0.2  0.3 

3 

-1 

, 

1 

X = 

3 -1  1 

, X-1  = 

-1.3  -0.2  0.7 

-1 

3 

4 

1 

"'t 

co 

T 

1 

1 

O 

OO 

O 

to 

1 

0 

i!±_ 

Calculating  AX  and  multiplying  by  X 1 from  the  left,  we  thus  obtain 


-0.7  0.2  0.3" 

~-3  -4  0" 

"3  0 0" 

-1.3  -0.2  0.7 

9 4 0 

= 

0 -4  0 

1 

(N 

O 

1 

<N 

O 

OO 

O 

1 

-3  -12  0_ 

_0  0 0_ 

D = X-1AX  = 
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EXAMPLE  5 


Quadratic  Forms.  Transformation  to  Principal  Axes 

By  definition,  a quadratic  form  Q in  the  components  xy,  ■ ■ • , xn  of  a vector  x is  a sum 
of  n2  terms,  namely, 

n n 

Q.  X Ax  djkXjXk. 

3 = 1 fc  = l 

^llAl  “1“  fl^2-Tl-T2  "f”  * ’ ' Q,\yiX \X yi 
(7)  + fl2 1*2*1  + 022*1  + •••  + U2nX2Xn 

+ 

“f  Uri\XnX  r 4"  Un2XriX2  4"  ' ' ' 4“  QnnXn. 


A = [to]  is  called  the  coefficient  matrix  of  the  form.  We  may  assume  that  A is 
symmetric,  because  we  can  take  off-diagonal  terms  together  in  pairs  and  write  the  result 
as  a sum  of  two  equal  terms;  see  the  following  example. 

Quadratic  Form.  Symmetric  Coefficient  Matrix 

Let 


xtAx  = [*i  x2] 


3 4 
6 2 


= 3x?  + 4xix2  + 6x2x1  + 2x1  = 3xf  + 10xiX2  + 2x1. 


Here  4 + 6=  10  = 5 + 5.  From  the  corresponding  symmetric  matrix  C = | o^!:  ] , where  Cjk  = h(ajk  + ak:j)> 
thus  Cn  = 3,  c12  = c2 1 = 5,  c22  = 2,  we  get  the  same  result;  indeed. 


xTCx  = [a  i x2] 


3 5 
5 2 


= 3a|  + 5x^2  + 5a2A!  + 2x\  = 3xf  + 10x^2  + 2v|. 


Quadratic  forms  occur  in  physics  and  geometry,  for  instance,  in  connection  with  conic 
sections  (ellipses  x2 /a2  4-  x2/b2  = 1,  etc.)  and  quadratic  surfaces  (cones,  etc.).  Their 
transformation  to  principal  axes  is  an  important  practical  task  related  to  the  diagonalization 
of  matrices,  as  follows. 

By  Theorem  2,  the  symmetric  coefficient  matrix  A of  (7)  has  an  orthonormal  basis  of 
eigenvectors.  Hence  if  we  take  these  as  column  vectors,  we  obtain  a matrix  X that  is 
orthogonal,  so  that  X-1  = XT.  From  (5)  we  thus  have  A = XDX-1  = XDXT.  Substitution 
into  (7)  gives 


(8) 


Q = xtXDXtx. 


If  we  set  XTx  = y,  then,  since  XT  = X 1,  we  have  X 'x  = v and  thus  obtain 


(9)  x = Xy. 

Furthermore,  in  (8)  we  have  xTX  = (XTx)T  = yT  and  XTx  = y,  so  that  Q becomes  simply 


(10)  Q - yTDy  - Ajyf  4-  A^yl  4-  • • • 4-  \ny\. 
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This  proves  the  following  basic  theorem. 


Principal  Axes  Theorem 

The  substitution  (9)  transforms  a quadratic  form 

n n 

Q Ax  QjfcXjXfc  (Okj  ttjjf) 

3 = 1 fc=l 

to  the  principal  axes  form  or  canonical  form  (10),  where  Ai,  • • • , \n  are  the  ( not 
necessarily  distinct ) eigenvalues  of  the  ( symmetric !)  matrix  A,  and  X is  an 
orthogonal  matrix  with  corresponding  eigenvectors  xi,  ■ ■ ■ , xn,  respectively,  as 
column  vectors. 


Transformation  to  Principal  Axes.  Conic  Sections 

Find  out  what  type  of  conic  section  the  following  quadratic  form  represents  and  transform  it  to  principal  axes: 

Q = 17*1  - 30x1*2  + 17*1  = 128. 


Solution.  We  have  Q = xTAx,  where 


17 

-15' 

*1 

A = 

> x = 

— !5 

17 

*2. 

This  gives  the  characteristic  equation  (17  — A)2  — 152  = 0.  It  has  the  roots  Ai  = 2,  A2  = 32.  Hence  (TO) 
becomes 


Q = 2yi  + 32 yi 

We  see  that  Q = 128  represents  the  ellipse  2yi  + 32y|  “ 128,  that  is, 


If  we  want  to  know  the  direction  of  the  principal  axes  in  the  x ix 2-coordinates,  we  have  to  determine  normalized 
eigenvectors  from  (A  — AI)x  = 0 with  A = Ai  = 2 and  A = A2  = 32  and  then  use  (9).  We  get 


hence 


1/V2 

1/V2 


and 


-1/V2 

1/V2J’ 


1/V2 

-1/V2" 

^1 

1/V2 

1/V2 

y2m 

Xj  = yi/V2  - y2/V2 
x 2 = yi/V2  + y2/V2. 


This  is  a 45°  rotation.  Our  results  agree  with  those  in  Sec.  8.2,  Example  1,  except  for  the  notations.  See  also 
Fig.  160  in  that  example. 
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1-5  SIMILAR  MATRICES  HAVE  EQUAL 
EIGENVALUES 

Verify  this  for  A and  A = P-1AP.  If  y is  an  eigenvector 
of  P,  show  that  x = Py  are  eigenvectors  of  A.  Show  the 
details  of  your  work. 


1.  A = 


2.  A = 


3.  A = 


4.  A = 


"3 

4" 

P = 

4 

-3 

. 

"l 

o' 

P = 

2 

-1 

. 

"8 

-4" 

P = 

2 

2 

. 

"0 

0 

2 

0 

3 

2 

, I 

1 

0 

1 

-4  2 

3 -1 

7 -5 

10  -7 

0.28 
-0.96 
2 
0 
3 


0.96 

0.28 

0 

1 

0 


Ai  = 3 


~-5 

0 

15" 

"0 

1 

0" 

3 

4 

-9 

, P = 

1 

0 

0 

-5 

0 

15 

0 

0 

1 

5.  A = 


6.  PROJECT.  Similarity  of  Matrices.  Similarity  is 
basic,  for  instance,  in  designing  numeric  methods. 

(a)  Trace.  By  definition,  the  trace  of  an  n X n matrix 
A = [fljk]  is  the  sum  of  the  diagonal  entries. 


trace  A = on  + a2 2 + 


+ CL.-, 


Show  that  the  trace  equals  the  sum  of  the  eigenvalues, 
each  counted  as  often  as  its  algebraic  multiplicity 
indicates.  Illustrate  this  with  the  matrices  A in  Probs. 
1,  3,  and  5. 

(b)  Trace  of  product.  Let  B = [bjk\  be  n X n.  Show 
that  similar  matrices  have  equal  traces,  by  first  proving 

n n 

trace  AB  = ^ ^ auhi  ~ trace  BA. 

i=l  1=1 

(c)  Find  a relationship  between  A in  (4)  and 
A = PAP-1. 

(d)  Diagonalization.  What  can  you  do  in  (5)  if  you 
want  to  change  the  order  of  the  eigenvalues  in  D,  for 
instance,  interchange  d\\  = Aj  and  d2 2 = A2? 

7.  No  basis.  Find  further  2X2  and  3X3  matrices 
without  eigenbasis. 


8.  Orthonormal  basis.  Illustrate  Theorem  2 with  further 
examples. 


9-16 


DIAGONALIZATION  OF  MATRICES 


Find  an  eigenbasis  (a  basis  of  eigenvectors)  and  diagonalize. 
Show  the  details. 


9. 


11. 


13. 


14. 


15. 


16. 


1 2 

2 4 

-19  7 

-42  16 

’ 4 0 

12  -2 
21  -6 


10. 


12. 


1 0 
2 -1 
-4.3 

1.3 


7.7 

9.3 


-5 

-6 

-9 

-8 

-12 

-12 

6 

12 

16 


Ai  = -2 


Ai  = 10 


17-23 


0 -4 


PRINCIPAL  AXES.  CONIC  SECTIONS 


What  kind  of  conic  section  (or  pair  of  straight  lines)  is  given 
by  the  quadratic  form?  Transform  it  to  principal  axes. 
Express  xT  = [xi  X2]  in  terms  of  the  new  coordinate 
vector  yT  = [yi  >'2],  as  in  Example  6. 

17.  7x?  + 6x]X2  + 7x1  = 200 

18.  3x?  + 8xi.x2  - 3x1  = 10 

19.  3x?  + 22x1X2  + 3x|  = 0 

20.  9xf  + 6x1X2  ‘ x|  — 10 

21.  xf  — 1 2x1X2  + x|  = 70 

22.  4xf  + 12xix2  + 13x1  = 16 

23.  - 1 lx?  + 84xix2  + 24x1  = 156 
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24.  Definiteness.  A quadratic  form  Q (x)  = xTAx  and  its 
(symmetric!)  matrix  A are  called  (a)  positive  definite 
if  Q(x)  > 0 for  all  x A 0,  (b)  negative  definite  if 
<2(x)  < 0 for  all  x A 0,  (c)  indefinite  if  Q(x)  takes 
both  positive  and  negative  values.  (See  Fig.  162.) 
[2(x)  and  A are  called  positive  semidefinite  ( negative 
semidefinite)  if  Q (x)  £ 0 (Q  (x)  S 0)  for  all  x.]  Show 
that  a necessary  and  sufficient  condition  for  (a),  (b), 
and  (c)  is  that  the  eigenvalues  of  A are  (a)  all  positive, 
(b)  all  negative,  and  (c)  both  positive  and  negative. 
Hint.  Use  Theorem  5. 

25.  Definiteness.  A necessary  and  sufficient  condition  for 
positive  definiteness  of  a quadratic  form  Q (x)  = xTAx 
with  symmetric  matrix  A is  that  all  the  principal  minors 
are  positive  (see  Ref.  [B3],  vol.  1,  p.  306),  that  is, 


tin  it  12 


an 

> o. 

Ol2 

> 0, 

022 

till 

a 12 

a13 

ai2 

022 

023 

> o, 

■ ■ ■ , det  A > 0. 

tl!3 

023 

033 

Show  that  the  form  in  Prob.  22  is  positive  definite, 
whereas  that  in  Prob.  23  is  indefinite. 


Q(x) 


(c)  Indefinite  form 

Fig.  162.  Quadratic  forms  in  two  variables  (Problem  24) 


8.5  Complex  Matrices  and  Forms.  Optional 

The  three  classes  of  matrices  in  Sec.  8.3  have  complex  counterparts  which  are  of  practical 
interest  in  certain  applications,  for  instance,  in  quantum  mechanics.  This  is  mainly  because 
of  their  spectra  as  shown  in  Theorem  1 in  this  section.  The  second  topic  is  about  extending 
quadratic  forms  of  Sec.  8.4  to  complex  numbers.  (The  reader  who  wants  to  brush  up  on 
complex  numbers  may  want  to  consult  Sec.  13.1.) 


Notations 

A = [ojk]  is  obtained  from  A = [ctjk]  by  replacing  each  entry  ajk  = a + if3 
(a,  |S  real)  with  its  complex  conjugate  = a — i/3.  Also,  A = [a^]  is  the  transpose 
of  A,  hence  the  conjugate  transpose  of  A. 


EXAMPLE  1 Notations 


3 + 4i  1 - i 

3 -4  ( 1 + i 

3 - 4 i 6 

, then  A = 

and  A T = 

6 2 - 5i 

6 2 + 5( 

1 + ( 2 + 5i 

If  A = 
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DEFINITION 


EXAMPLE  2 


Hermitian,  Skew-Hermitian,  and  Unitary  Matrices 

A square  matrix  A = [ajy]  is  called 


Hermitian 

skew-Hermitian 

unitary 


if  AT  = A, 
if  At  = -A, 
if  At  = A-1. 


that  is, 
that  is, 


«fej  @jk 
akj  — ~ajk 


The  first  two  classes  are  named  after  Hermite  (see  footnote  13  in  Problem  Set  5.8). 

From  the  definitions  we  see  the  following.  If  A is  Hermitian,  the  entries  on  the  main 
diagonal  must  satisfy  djj  = ajf  that  is,  they  are  real.  Similarly,  if  A is  skew-Hermitian, 
then  aTj  = — ajj.  If  we  set  a™  = a + i(3.  this  becomes  a — i(i  = — (a  + i/3).  Hence  a = 0, 
so  that  a,jj  must  be  pure  imaginary  or  0. 

Hermitian,  Skew-Hermitian,  and  Unitary  Matrices 


4 

1 - 3 i 

3/ 

2 + i 

hi 

z V3 

A = 

1 + 3/ 

1 

B = 

— 2 + i 

— i 

c = 

jvT 

h 

are  Hermitian,  skew-Hermitian,  and  unitary  matrices,  respectively,  as  you  may  verify  by  using  the  definitions. 

If  a Hermitian  matrix  is  real,  then  AT  = AT  = A.  Hence  a real  Hermitian  matrix  is  a 
symmetric  matrix  (Sec.  8.3). 

Similarly,  if  a skew-Hermitian  matrix  is  real,  then  A = AT  = —A.  Hence  a real  skew- 
Hermitian  matrix  is  a skew-symmetric  matrix. 

Finally,  if  a unitary  matrix  is  real,  then  A = AT  = A-1.  Hence  a real  unitary  matrix 
is  an  orthogonal  matrix. 

This  shows  that  Hermitian,  skew-Hermitian,  and  unitary  matrices  generalize  symmetric, 
skew-symmetric,  and  orthogonal  matrices,  respectively. 

Eigenvalues 

It  is  quite  remarkable  that  the  matrices  under  consideration  have  spectra  (sets  of  eigenvalues; 
see  Sec.  8.1)  that  can  be  characterized  in  a general  way  as  follows  (see  Fig.  163). 


Fig.  163.  Location  of  the  eigenvalues  of  Hermitian,  skew-Hermitian, 
and  unitary  matrices  in  the  complex  A-plane 
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THEOREM  1 


EXAMPLE  3 


PROOF 


Eigenvalues 

(a)  The  eigenvalues  of  a Hermitian  matrix  ( and  thus  of  a symmetric  matrix) 
are  real. 

(b)  The  eigenvalues  of  a skew -Hermitian  matrix  ( and  thus  of  a skew-symmetric 
matrix)  are  pure  imaginary  or  zero. 

(c)  The  eigenvalues  of  a unitary  matrix  ( and  thus  of  an  orthogonal  matrix)  have 
absolute  value  1. 


Illustration  of  Theorem  1 

For  the  matrices  in  Example  2 we  find  by  direct  calculation 


Matrix 

Characteristic  Equation 

Eigenvalues 

A 

Hermitian 

A2  - 11A  + 18  = 0 

9,  2 

B 

Skew-Hermitian 

A2  - 2iA  + 8 = 0 

4 i,  -2  i 

C 

Unitary 

> 

to 

1 

1 

II 

O 

\ V3  + g i,  — g a/3  + g i 

and  |±|  V3  + |(|2  = \ + j = 1. 

We  prove  Theorem  1 . Let  A be  an  eigenvalue  and  x an  eigenvector  of  A.  Multiply  Ax  = Ax 
from  the  left  by  xT,  thus  xTAx  = AxTx,  and  divide  by  xTx  = XiXi  + ■ ■ • + xnxn  = 
Uil2  + ■ • • + \xn\2,  which  is  real  and  not  0 because  x 0.  This  gives 


(1) 


A = 


xtAx 
xTx  ' 


(a)  If  A is  Hermitian,  AT  = A or  AT  = A and  we  show  that  then  the  numerator  in  (1) 
is  real,  which  makes  A real.  xTAx  is  a scalar;  hence  taking  the  transpose  has  no  effect.  Thus 

(2)  xtAx  = (xtAx)t  = xtAtx  = xtAx  = (xTAx). 

Hence,  xTAx  equals  its  complex  conjugate,  so  that  it  must  be  real,  (a  + ib  = a — ib 
implies  b = 0.) 

(b)  If  A is  skew-Hermitian,  AT  = —A  and  instead  of  (2)  we  obtain 

(3)  xtAx  = — (xTAx) 

so  that  xtAx  equals  minus  its  complex  conjugate  and  is  pure  imaginary  or  0. 
(i a + ib  = —(a  — ib)  implies  a = 0.) 

(c)  Let  A be  unitary.  We  take  Ax  = Ax  and  its  conjugate  transpose 

(Ax)t  = (Ax)t  = Axt 
and  multiply  the  two  left  sides  and  the  two  right  sides, 

(Ax)tAx  = AAxtx  = |A|2xtx. 
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THEOREM  2 


PROOF 


DEFINITION 


THEOREM  3 


But  A is  unitary,  AT  = A 1,  so  that  on  the  left  we  obtain 

(Ax  )tAx  = xtATAx  = xtA_iAx  = xTIx  = xTx. 

Together,  xTx  = |A|2xtx.  We  now  divide  by  xTx  (#0)  to  get  | A | 2 = 1.  Hence  | A | = 1. 
This  proves  Theorem  1 as  well  as  Theorems  1 and  5 in  Sec.  8.3. 

Key  properties  of  orthogonal  matrices  (invariance  of  the  inner  product,  orthonormality  of 
rows  and  columns;  see  Sec.  8.3)  generalize  to  unitary  matrices  in  a remarkable  way. 

To  see  this,  instead  of  Rn  we  now  use  the  complex  vector  space  Cn  of  all  complex 
vectors  with  n complex  numbers  as  components,  and  complex  numbers  as  scalars.  For 
such  complex  vectors  the  inner  product  is  defined  by  (note  the  overbar  for  the  complex 
conjugate) 

(4)  a • b = aTb. 

The  length  or  norm  of  such  a complex  vector  is  a real  number  defined  by 

(5)  || a ||  = Va  • a = Va/a  = \Zd\ai  + • ■ ■ + dnan  = \/|ai|2  + • • • + |an|2. 


Invariance  of  Inner  Product 

A unitary  transformation,  that  is,  y = Ax  with  a unitary  matrix  A,  preserves  the 
value  of  the  inner  product  (4),  hence  also  the  norm  (5). 


The  proof  is  the  same  as  that  of  Theorem  2 in  Sec.  8.3,  which  the  theorem  generalizes. 
In  the  analog  of  (9),  Sec.  8.3,  we  now  have  bars, 

u • v = uTv  = (Aa)TAb  = aTATAb  = aTIb  = aTb  = a • b. 

The  complex  analog  of  an  orthonormal  system  of  real  vectors  (see  Sec.  8.3)  is  defined  as 
follows. 


Unitary  System 

A unitary  system  is  a set  of  complex  vectors  satisfying  the  relationships 

fO  if  j ¥=  k 

(6)  a j • afc  = a/afc  = \ 

ll  if  j = k. 


Theorem  3 in  Sec.  8.3  extends  to  complex  as  follows. 


Unitary  Systems  of  Column  and  Row  Vectors 

A complex  square  matrix  is  unitary  if  and  only  if  its  column  vectors  ( and  also  its 
row  vectors)  form  a unitary  system. 
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PROOF 


THEOREM  4 


PROOF 


EXAMPLE  4 


THEOREM  5 


EXAMPLE  5 


The  proof  is  the  same  as  that  of  Theorem  3 in  Sec.  8.3,  except  for  the  bars  required  in 
A = A-1  and  in  (4)  and  (6)  of  the  present  section. 


Determinant  of  a Unitary  Matrix 

Let  A be  a unitary  matrix.  Then  its  determinant  has  absolute  value  one,  that  is, 

| det  A | = 1 . 

Similarly,  as  in  Sec.  8.3,  we  obtain 

1 = det  (A A-1)  = det  (AAT)  = det  A det  A T = det  A det  A 
= det  A det  A = | det  A | 2. 

Hence  | det  A | = 1 (where  det  A may  now  be  complex). 

Unitary  Matrix  Illustrating  Theorems  1c  and  2-4 

For  the  vectors  aT  = [2  — /]  andbT  = [1  + / 4/]  wegetaT  = [2  /]TandaTb  = 2(1  + /)  — 4 = — 2 + 2 i 


0.8/ 

0.6  ' 

i 

-0.8  + 3.2/' 

A = 

0.6 

0.8/ 

also 

Aa  = 

2 

and 

Ab  = 

-2.6  + 0.6/ 

as  one  can  readily  verify.  This  gives  (Aa)TAb  = —2  + 2 i,  illustrating  Theorem  2.  The  matrix  is  unitary.  Its 
columns  form  a unitary  system, 

alai  = — 0.8/  ■ 0.8/  + 0.62  = 1,  a]a2  = -0.8/  • 0.6  + 0.6  • 0.8/  = 0, 
a2a2  = 0.62  + (-0.8/)0.8/  = 1 

and  so  do  its  rows.  Also,  det  A = — 1.  The  eigenvalues  are  0.6  + 0.8/and— 0.6  + 0.8/,  with  eigenvectors  [1  1]T 

and  [1  —1],  respectively. 

Theorem  2 in  Sec.  8.4  on  the  existence  of  an  eigenbasis  extends  to  complex  matrices  as 
follows. 


Basis  of  Eigenvectors 

A Hermitian,  skew-Hermitian,  or  unitary  matrix  has  a basis  of  eigenvectors  for  Cn 
that  is  a unitary  system. 

For  a proof  see  Ref.  [B3],  vol.  1,  pp.  270-272  and  p.  244  (Definition  2). 

Unitary  Eigenbases 

The  matrices  A,  B,  C in  Example  2 have  the  following  unitary  systems  of  eigenvectors,  as  you  should  verify. 


— ' — [1  - 3/ 
V35 

5]t  (A  = 9), 

VI4[1 

- 3/  — 2]t  (A  = 2) 

— — f 1 - 2* 
V30 

-5]t  (A  = -2/), 

— [5 
V30 

1 + 2/]t  (A  = 4/) 

1]T 

V2 

(A  = i(i  + V3)), 

4[1 

-1]T  (A  = |(Z  — V3)). 

■ 
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Hermitian  and  Skew-Hermitian  Forms 

The  concept  of  a quadratic  form  (Sec.  8.4)  can  be  extended  to  complex.  We  call  the 
numerator  xTAx  in  (1)  a form  in  the  components  ;ci,  ■ • ■ , xn  of  x,  which  may  now  be 
complex.  This  form  is  again  a sum  of  n2  terms 

n n 

x Ax  l{ii-  XjX 

j = l k = l 

— r?nx  iXi  + ■■•  + a\nx\xn 

(7)  + a2iV*i  + • • • + a2nx2xn 

+ 

T ci'Yi\X'yiX\  + ■ • • + annxnxn. 

A is  called  its  coefficient  matrix.  The  form  is  called  a Hermitian  or  skew-Hermitian 

form  if  A is  Hermitian  or  skew-Hermitian,  respectively.  The  value  of  a Hermitian  form 
is  real,  and  that  of  a skew-Hermitian  form  is  pure  imaginary  or  zero.  This  can  be  seen 
directly  from  (2)  and  (3)  and  accounts  for  the  importance  of  these  forms  in  physics.  Note 
that  (2)  and  (3)  are  valid  for  any  vectors  because,  in  the  proof  of  (2)  and  (3),  we  did  not 
use  that  x is  an  eigenvector  but  only  that  xTx  is  real  and  not  0. 


Hermitian  Form 

For  A in  Example  2 and,  say,  x = [1  • / 5r’]T  we  get 

4(1  + i)  + (1  - 30  ■ 5/ 
(1  + 3i')(l  + 0 + 7 ■ 5 i 


4 

1 - 3 i 

1 + i 

1 - (' 

-5/] 

1 + 3 i 

7 

5 i 

= [1  - 

i — 5i] 

223.  ■ 


Clearly,  if  A and  x in  (4)  are  real,  then  (7)  reduces  to  a quadratic  form,  as  discussed  in 
the  last  section. 


PROBLEM  SET8.5 


EIGENVALUES  AND  VECTORS 

Is  the  given  matrix  Hermitian?  Skew-Hermitian?  Unitary? 
Find  its  eigenvalues  and  eigenvectors. 


6 

i 

i 

1 + i 

1. 

—i 

6 

2. 

1 ( i 

0 

3. 

"i 

2 

VI 

4. 

0 i 

jVl 

2 

i 0 

i 

0 

o" 

0 

2 + 2 i 

0 

5. 

0 

0 

i 

6. 

2 - 2 i 

0 

2 + 2 i 

0 

i 

0 

0 

2 - 2 i 

0 

7.  Pauli  spin  matrices.  Find  the  eigenvalues  and  eigen- 
vectors of  the  so-called  Pauli  spin  matrices  and  show 
that  SxSy  = iSz,  SySx  = -iSz,  Sx  = = Sf  = I, 

where 


0 

1 

0 

—i 

sx  = 

1 

0 

. sy  = 

i 

0 

1 

o" 

Sz  = 

0 

-1 

8.  Eigenvectors.  Find  eigenvectors  of  A,  B,  C in 
Examples  2 and  3. 
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COMPLEX  FORMS 

Is  the  matrix  A Hermitian  or  skew-Hermitian?  Find  xTAx. 
Show  the  details. 


4 3 - 2 i 

-4  i 

VO 

> 

II 

3 + 2 i -4 

, X = 

2 + 2 i 

i — 2 + 3 i 

21 

10.  A = 

, x = 

2 + 3 i 0 

8 

i 

i 

2 + i 

1 

11.  A = 

-1 

0 

3 i 

, x = 

i 

— 2 + i 

3 i 

i 

—i 

1 

i 

4 

1 

12.  A = 

— i 

3 

0 

, X = 

i 

4 

0 

2 

—i 

GENERAL  PROBLEMS 

13.  Product.  Show  that  (ABC)T  = — C_1BA  for  any 
n X n Hermitian  A,  skew-Hermitian  B,  and  unitary  C. 


14.  Product.  Show  (BA)T  = — AB  for  A and  B in 
Example  2.  For  any  n X n Hermitian  A and 
skew-Hermitian  B. 

15.  Decomposition.  Show  that  any  square  matrix  may  be 
written  as  the  sum  of  a Hermitian  and  a skew-Hermitian 
matrix.  Give  examples. 

16.  Unitary  matrices.  Prove  that  the  product  of  two 
unitary  n X n matrices  and  the  inverse  of  a unitary 
matrix  are  unitary.  Give  examples. 

17.  Powers  of  unitary  matrices  in  applications  may 
sometimes  be  very  simple.  Show  that  C12  = I in 
Example  2.  Find  further  examples. 

18.  Normal  matrix.  This  important  concept  denotes  a 
matrix  that  commutes  with  its  conjugate  transpose, 
AAt  = AtA.  Prove  that  Hermitian,  skew-Hermitian, 
and  unitary  matrices  are  normal.  Give  corresponding 
examples  of  your  own. 

19.  Normality  criterion.  Prove  that  A is  normal  if  and 
only  if  the  Hermitian  and  skew-Hermitian  matrices  in 
Prob.  18  commute. 

20.  Find  a simple  matrix  that  is  not  normal.  Find  a normal 
matrix  that  is  not  Hermitian,  skew-Hermitian,  or 
unitary. 


CffAPTER8=REVIEW  T I O N S AND  PROBLEMS 


1.  In  solving  an  eigenvalue  problem,  what  is  given  and 
what  is  sought? 

2.  Give  a few  typical  applications  of  eigenvalue  problems. 

3.  Do  there  exist  square  matrices  without  eigenvalues? 

4.  Can  a real  matrix  have  complex  eigenvalues?  Can  a 
complex  matrix  have  real  eigenvalues? 

5.  Does  a 5 X 5 matrix  always  have  a real  eigenvalue? 

6.  What  is  algebraic  multiplicity  of  an  eigenvalue?  Defect? 

7.  What  is  an  eigenbasis?  When  does  it  exist?  Why  is  it 
important? 

8.  When  can  we  expect  orthogonal  eigenvectors? 

9.  State  the  definitions  and  main  properties  of  the  three 
classes  of  real  matrices  and  of  complex  matrices  that 
we  have  discussed. 


10.  What  is  diagonalization?  Transformation  to  principal  axes? 


11-15 


SPECTRUM 


Find  the  eigenvalues.  Find  the  eigenvectors. 


2.5 

0.5' 

-7 

4" 

ii 

0.5 

2.5 

12. 

-12 

7 

8 -1 
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SIMILARITY 


Verify  that  A and  A = p 1AP  have  the  same 


spectrum. 
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Summary  of  Chapter  8 
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DIAGONALIZATION 


Find  an  eigenbasis  and  diagonalize. 
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CONIC  SECTIONS.  PRINCIPAL  AXES 


Transform  to  canonical  form  (to  principal  axes).  Express 
[*i  X2]J  in  terms  of  the  new  variables  [yi  >>2]T- 

22.  9x\  — 6x1*2  + 17*1  = 36 

23.  4xf  + 24*!* 2:  — 14*1  = 20 

24.  5xf  + 24x^2  “ 5x|  = 0 

25.  3.7*1  + 3.2* i*2  + 1.3*1  = 4.5 


SUMMARY  OF  CHAPTER  8 

Linear  Algebra:  Matrix  Eigenvalue  Problems 


The  practical  importance  of  matrix  eigenvalue  problems  can  hardly  be  overrated. 
The  problems  are  defined  by  the  vector  equation 

(1)  Ax  = Ax. 

A is  a given  square  matrix.  All  matrices  in  this  chapter  are  square.  A is  a scalar.  To 
solve  the  problem  (1)  means  to  determine  values  of  A,  called  eigenvalues  (or 
characteristic  values)  of  A,  such  that  (1)  has  a nontrivial  solution  x (that  is,  x A 0), 
called  an  eigenvector  of  A corresponding  to  that  A.  An  n X n matrix  has  at  least 
one  and  at  most  n numerically  different  eigenvalues.  These  are  the  solutions  of  the 
characteristic  equation  (Sec.  8.1) 


(2)  D{  A)  = det(A-AI) 


an  A 

fl12 

Cl\n 

^21 

022  — A 

a2n 

^nl 

an2 

Clnn 

/3(A)  is  called  the  characteristic  determinant  of  A.  By  expanding  it  we  get  the 
characteristic  polynomial  of  A,  which  is  of  degree  n in  A.  Some  typical  applications 
are  shown  in  Sec.  8.2. 

Section  8.3  is  devoted  to  eigenvalue  problems  for  symmetric  (AT  = A),  skew- 
symmetric  (At  = —A),  and  orthogonal  matrices  (AT  = A-1).  Section  8.4 
concerns  the  diagonalization  of  matrices  and  the  transformation  of  quadratic  forms 
to  principal  axes  and  its  relation  to  eigenvalues. 

Section  8.5  extends  Sec.  8.3  to  the  complex  analogs  of  those  real  matrices,  called 
Hermitian  (AT  = A),  skew-Hermitian  (AT  = —A),  and  unitary  matrices 
(A T = A-1).  All  the  eigenvalues  of  a Hermitian  matrix  (and  a symmetric  one)  are 
real.  For  a skew-Hermitian  (and  a skew-symmetric)  matrix  they  are  pure  imaginary 
or  zero.  For  a unitary  (and  an  orthogonal)  matrix  they  have  absolute  value  1. 


CHAPTER  9 


Vector  Differential  Calculus. 
Grad,  Div,  Curl 


Engineering,  physics,  and  computer  sciences,  in  general,  but  particularly  solid  mechanics, 
aerodynamics,  aeronautics,  fluid  flow,  heat  flow,  electrostatics,  quantum  physics,  laser 
technology,  robotics  as  well  as  other  areas  have  applications  that  require  an  understanding 
of  vector  calculus.  This  field  encompasses  vector  differential  calculus  and  vector  integral 
calculus.  Indeed,  the  engineer,  physicist,  and  mathematician  need  a good  grounding  in 
these  areas  as  provided  by  the  carefully  chosen  material  of  Chaps.  9 and  10. 

Forces,  velocities,  and  various  other  quantities  may  be  thought  of  as  vectors.  Vectors 
appear  frequently  in  the  applications  above  and  also  in  the  biological  and  social  sciences, 
so  it  is  natural  that  problems  are  modeled  in  3-space.  This  is  the  space  of  three  dimensions 
with  the  usual  measurement  of  distance,  as  given  by  the  Pythagorean  theorem.  Within  that 
realm,  2-space  (the  plane)  is  a special  case.  Working  in  3-space  requires  that  we  extend 
the  common  differential  calculus  to  vector  differential  calculus,  that  is,  the  calculus  that 
deals  with  vector  functions  and  vector  fields  and  is  explained  in  this  chapter. 

Chapter  9 is  arranged  in  three  groups  of  sections.  Sections  9. 1-9.3  extend  the  basic 
algebraic  operations  of  vectors  into  3-space.  These  operations  include  the  inner  product 
and  the  cross  product.  Sections  9.4  and  9.5  form  the  heart  of  vector  differential  calculus. 
Finally,  Secs.  9. 7-9. 9 discuss  three  physically  important  concepts  related  to  scalar  and 
vector  fields:  gradient  (Sec.  9.7),  divergence  (Sec.  9.8),  and  curl  (Sec.  9.9).  They  are 
expressed  in  Cartesian  coordinates  in  this  chapter  and,  if  desired,  expressed  in  curvilinear 
coordinates  in  a short  section  in  App.  A3 .4. 

We  shall  keep  this  chapter  independent  of  Chaps.  1 and  8.  Our  present  approach  is  in 
harmony  with  Chap.  7,  with  the  restriction  to  two  and  three  dimensions  providing  for  a 
richer  theory  with  basic  physical,  engineering,  and  geometric  applications. 

Prerequisite:  Elementary  use  of  second-  and  third-order  determinants  in  Sec.  9.3. 

Sections  that  may  be  omitted  in  a shorter  course:  9.5,  9.6. 

References  and  Answers  to  Problems:  App.  1 Part  B,  App.  2. 

9.1  Vectors  in  2-Space  and  3-Space 

In  engineering,  physics,  mathematics,  and  other  areas  we  encounter  two  kinds  of  quantities. 
They  are  scalars  and  vectors. 

A scalar  is  a quantity  that  is  determined  by  its  magnitude.  It  takes  on  a numerical  value, 
i.e.,  a number.  Examples  of  scalars  are  time,  temperature,  length,  distance,  speed,  density, 
energy,  and  voltage. 
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In  contrast,  a vector  is  a quantity  that  has  both  magnitude  and  direction.  We  can  say 
that  a vector  is  an  arrow  or  a directed  line  segment.  For  example,  a velocity  vector  has 
length  or  magnitude,  which  is  speed,  and  direction,  which  indicates  the  direction  of  motion. 
Typical  examples  of  vectors  are  displacement,  velocity,  and  force,  see  Fig.  164  as  an 
illustration. 

More  formally,  we  have  the  following.  We  denote  vectors  by  lowercase  boldface  letters 
a,  b,  v,  etc.  In  handwriting  you  may  use  arrows,  for  instance,  a (in  place  of  a),  b,  etc. 

A vector  (arrow)  has  a tail,  called  its  initial  point,  and  a tip,  called  its  terminal  point. 
This  is  motivated  in  the  translation  (displacement  without  rotation)  of  the  triangle  in 
Fig.  165,  where  the  initial  point  P of  the  vector  a is  the  original  position  of  a point,  and 
the  terminal  point  Q is  the  terminal  position  of  that  point,  its  position  after  the  translation. 
The  length  of  the  arrow  equals  the  distance  between  P and  Q.  This  is  called  the  length 
(or  magnitude)  of  the  vector  a and  is  denoted  by  |a|.  Another  name  for  length  is  norm 
(or  Euclidean  norm). 

A vector  of  length  1 is  called  a unit  vector. 


Velocity 


Fig.  164.  Force  and  velocity  Fig.  165.  Translation 

Of  course,  we  would  like  to  calculate  with  vectors.  For  instance,  we  want  to  find  the 
resultant  of  forces  or  compare  parallel  forces  of  different  magnitude.  This  motivates  our 
next  ideas:  to  define  components  of  a vector,  and  then  the  two  basic  algebraic  operations 
of  vector  addition  and  scalar  multiplication. 

For  this  we  must  first  define  equality  of  vectors  in  a way  that  is  practical  in  connection 
with  forces  and  other  applications. 


DEFINITION 


Equality  of  Vectors 

Two  vectors  a and  b are  equal,  written  a = b,  if  they  have  the  same  length  and  the 
same  direction  [as  explained  in  Fig.  166;  in  particular,  note  (B)].  Hence  a vector 
can  be  arbitrarily  translated;  that  is,  its  initial  point  can  be  chosen  arbitrarily. 


Equal  vectors, 
a = b 

(A) 


■ b 

Vectors  having 
the  same  length 
but  different 
direction 

(B) 


Vectors  having 
the  same  direction 
but  different 
length 

(C) 


Fig.  166.  (A)  Equal  vectors.  (B)-(D)  Different  vectors 


Vectors  having 
different  length 
and  different 
direction 

(D) 
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EXAMPLE  1 


Components  of  a Vector 

We  choose  an  xyz  Cartesian  coordinate  system1  in  space  (Fig.  167),  that  is,  a usual 
rectangular  coordinate  system  with  the  same  scale  of  measurement  on  the  three  mutually 
perpendicular  coordinate  axes.  Let  a be  a given  vector  with  initial  point  P:  (x  i , yi,  Zi)  and 
terminal  point  Q:  (x2,  >'2,  Z2)-  Then  the  three  coordinate  differences 

(1)  «i  = x2  ~ xu  a2  = y2  ~ >’1,  a3  = z2~  z 1 

are  called  the  components  of  the  vector  a with  respect  to  that  coordinate  system,  and  we 
write  simply  a = [a1;  a2,  a3].  See  Fig.  168. 

The  length  |a|  of  a can  now  readily  be  expressed  in  terms  of  components  because  from 

(1)  and  the  Pythagorean  theorem  we  have 

(2)  | a | = \/flf  + fl|  + a§. 


Components  and  Length  of  a Vector 

The  vector  a with  initial  point  P:  (4,  0,  2)  and  terminal  point  Q\  (6,  —1,  2)  has  the  components 
ai  = 6 - 4 = 2,  a2  = -1  - 0 = -1,  a3  = 2-2  = 0. 

Hence  a = [2,  —1,  0].  (Can  you  sketch  a,  as  in  Fig.  168?)  Equation  (2)  gives  the  length 

|a|  = V22  + (-1)2  + 02  = V5. 

If  we  choose  (—1,5,  8)  as  the  initial  point  of  a,  the  corresponding  terminal  point  is  (1,  4,  8). 

If  we  choose  the  origin  (0,  0,  0)  as  the  initial  point  of  a,  the  corresponding  terminal  point  is  (2,  —1,  0);  its 
coordinates  equal  the  components  of  a.  This  suggests  that  we  can  determine  each  point  in  space  by  a vector, 
called  the  position  vector  of  the  point,  as  follows. 


A Cartesian  coordinate  system  being  given,  the  position  vector  r of  a point  A:  (x,  y,  z ) 
is  the  vector  with  the  origin  (0,  0,  0)  as  the  initial  point  and  A as  the  terminal  point  (see 
Fig.  169).  Thus  in  components,  r = [x,y,z].  This  can  be  seen  directly  from  (1)  with 
*1  = yi  = zi  = 0. 


Fig.  167.  Cartesian 
coordinate  system 


Fig.  168.  Components 
of  a vector 


Fig.  169.  Position  vector  r 
of  a point  A:  (x,  y,  z) 


1Named  after  the  French  philosopher  and  mathematician  RENATUS  CARTESIUS,  latinized  for  RENE 

DESCARTES  (1596-1650),  who  invented  analytic  geometry.  His  basic  work  Geometrie  appeared  in  1637,  as 
an  appendix  to  his  Discours  de  la  methode. 
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Furthermore,  if  we  translate  a vector  a,  with  initial  point  P and  terminal  point  Q.  then 
corresponding  coordinates  of  P and  Q change  by  the  same  amount,  so  that  the  differences 
in  (1)  remain  unchanged.  This  proves 


THEOREM  1 


Vectors  as  Ordered  Triples  of  Real  Numbers 

A fixed  Cartesian  coordinate  system  being  given,  each  vector  is  uniquely  determined 
by  its  ordered  triple  of  corresponding  components.  Conversely,  to  each  ordered 
triple  of  real  numbers  (a\,  a 2,  a3)  there  corresponds  precisely  one  vector 
a = [«i,  a2,  «s].  with  (0,  0,  0)  corresponding  to  the  zero  vector  0,  which  has  length 
0 and  no  direction. 

Hence  a vector  equation  a = b is  equivalent  to  the  three  equations  a\  = b\, 
a2  = l}2-  a'A  = h?,  for  the  components. 


We  now  see  that  from  our  “geometric”  definition  of  a vector  as  an  arrow  we  have  arrived 
at  an  “algebraic”  characterization  of  a vector  by  Theorem  1.  We  could  have  started  from 
the  latter  and  reversed  our  process.  This  shows  that  the  two  approaches  are  equivalent. 


Vector  Addition,  Scalar  Multiplication 

Calculations  with  vectors  are  very  useful  and  are  almost  as  simple  as  the  arithmetic  for 
real  numbers.  Vector  arithmetic  follows  almost  naturally  from  applications.  We  first  define 
how  to  add  vectors  and  later  on  how  to  multiply  a vector  by  a number. 


DEFINITION 


Fig.  170.  Vector 
addition 


Addition  of  Vectors 

The  sum  a + b of  two  vectors  a = [«j,  a2,  a3]  and  b = [/q,  b2,  b3]  is  obtained  by 
adding  the  corresponding  components, 

(3)  a + b = [«!  + b±,  a2  + b2,  a3  + b3]. 

Geometrically,  place  the  vectors  as  in  Fig.  170  (the  initial  point  of  b at  the  terminal 
point  of  a);  then  a + b is  the  vector  drawn  from  the  initial  point  of  a to  the  terminal 
point  of  b. 


For  forces,  this  addition  is  the  parallelogram  law  by  which  we  obtain  the  resultant  of  two 
forces  in  mechanics.  See  Fig.  171. 

Figure  172  shows  (for  the  plane)  that  the  “algebraic”  way  and  the  “geometric  way”  of 
vector  addition  give  the  same  vector. 


Fig.  171.  Resultant  of  two  forces  (parallelogram  law) 
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Basic  Properties  of  Vector  Addition.  Familiar  laws  for  real  numbers  give  immediately 


(a) 

a + b = b + a 

( Commutativity ) 

(b) 
(4) 

(c) 

(u  + v)  + w = u + (v  + w) 

( Associativity ) 

a+0=0+a=a 

(d) 

a + (-a)  = 0. 

Properties  (a)  and  (b)  are  verified  geometrically  in  Figs.  173  and  174.  Furthermore,  —a 
denotes  the  vector  having  the  length  |a|  and  the  direction  opposite  to  that  of  a. 

Fig.  172.  Vector  addition  Fig.  173.  Cummutativity 

of  vector  addition 


Fig.  174.  Associativity 
of  vector  addition 


In  (4b)  we  may  simply  write  u + v + w,  and  similarly  for  sums  of  more  than  three 
vectors.  Instead  of  a + a we  also  write  2a,  and  so  on.  This  (and  the  notation  —a  used 
just  before)  motivates  defining  the  second  algebraic  operation  for  vectors  as  follows. 


DEFINITION 

/ //  / 

a 2a  -a  --a 

2 

Fig.  175.  Scalar 
multiplication 
[multiplication  of 
vectors  by  scalars 
(numbers)] 


Scalar  Multiplication  (Multiplication  by  a Number) 

The  product  ca  of  any  vector  a = [a1;  a2,  <33]  and  any  scalar  c (real  number  c)  is 
the  vector  obtained  by  multiplying  each  component  of  a by  c, 

(5)  ca  = [ca  1,  ca2,  ca2]. 

Geometrically,  if  a ¥=  0,  then  ca  with  c > 0 has  the  direction  of  a and  with  c < 0 
the  direction  opposite  to  a.  In  any  case,  the  length  of  ca  is  |ca|  = |c|  |a|,  and  ca  = 0 
if  a = 0 or  c = 0 (or  both).  (See  Fig.  175.) 


Basic  Properties  of  Scalar  Multiplication.  From  the  definitions  we  obtain  directly 


(a) 

c(a  + b) 

= ca  + cb 

(b) 

(c  + k) a 

= ca  + ka 

(c) 

c(ka) 

= (ck)a 

(d) 

la 

= a. 

(6) 


(written  cka) 
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You  may  prove  that  (4)  and  (6)  imply  for  any  vector  a 


(7) 


(a)  Oa  = 0 

(b)  (-l)a=-a. 


Instead  of  b + (—a)  we  simply  write  b — a (Fig.  176). 

EXAMPLE  2 Vector  Addition.  Multiplication  by  Scalars 

With  respect  to  a given  coordinate  system,  let 

a = [4,0,1]  and  b = [2,  —5,  §]. 
Then -a  = [-4,0, -1],  7a  = [28,  0,  7],  a + b = [6,  -5,  |],  and 


2(a  - b)  = 2 [2,  5,  §]  = [4,  10,  f ] = 2a  - 2b. 


Unit  Vectors  i,  j,  k.  Besides  a = [a1;  a2,  <23]  another  popular  way  of  writing  vectors  is 

(8)  a = aii  + a2  j + fl^k. 

In  this  representation,  i,  j,  k are  the  unit  vectors  in  the  positive  directions  of  the  axes  of 
a Cartesian  coordinate  system  (Fig.  177).  Hence,  in  components, 

(9)  i = [1,0,0],  j = [0,1,0],  k = [0,0,1] 

and  the  right  side  of  (8)  is  a sum  of  three  vectors  parallel  to  the  three  axes. 

ijk  Notation  for  Vectors 

In  Example  2 we  have  a = 4i  + k,  b = 2i  — 5j  + 5 k,  and  so  on. 


All  the  vectors  a = [«1;  a2,  c/pj  = «]i  + a2 j + 03k  (with  real  numbers  as  components) 
form  the  real  vector  space  R3  with  the  two  algebraic  operations  of  vector  addition  and 
scalar  multiplication  as  just  defined.  R 3 has  dimension  3.  The  triple  of  vectors  i,  j,  k 
is  called  a standard  basis  of  R3.  Given  a Cartesian  coordinate  system,  the  representation 
(8)  of  a given  vector  is  unique. 

Vector  space  R3  is  a model  of  a general  vector  space,  as  discussed  in  Sec.  7.9,  but  is 
not  needed  in  this  chapter. 


Fig.  176. 


Difference  of  vectors 


Fig.  177.  The  unit  vectors  i,  j,  k 
and  the  representation  (8) 
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COMPONENTS  AND  LENGTH 


Find  the  components  of  the  vector  v with  initial  point  P 
and  terminal  point  Q.  Find  |v|.  Sketch  |v|.  Find  the  unit 
vector  u in  the  direction  of  v. 


1.  P:  (1,  1,  0),  Q:  (6,  2,  0) 

2.  P:  (1,1,1),  Q:( 2,2,0) 

3.  P:  (-3.0,  4,0,  -0.5),  Q:  (5.5,  0,  1.2) 

4.  P:  (1,  4,  2),  Q:(- 1, -4, -2) 

5.  P:  (0,  0,  0),  Q : (2,  1,  -2) 


6-10 


Find  the  terminal  point  Q of  the  vector  v with 


components  as  given  and  initial  point  P.  Find  |v|. 


6.  4,  0,  0;  P:  (0,  2,  13) 

7.  I,  3,-|;  P:&,~ 3,|) 

8.  13.1,0.8, -2.0;  P:  (0,0,0) 

9.  6,  1, -4;  P:  (—6,  —1,  —4) 

10.  0,  -3,  3;  P:  (0,  3,  -3) 


11-18  ADDITION,  SCALAR  MULTIPLICATION 
Let  a = [3,  2,  0]  = 3i  + 2j;  b = [-4,  6,  0]  = 4i  + 6j, 
c = [5,  -1,  8]  = 5i  - j + 8k,  d = [0,  0,  4]  = 4k. 
Find: 

11.  2a,  |a,  -a 

12.  (a  + b)  + c,  a + (b  + c) 

13.  b + c,  c + b 


14.  3c  - 6d,  3(c  - 2d) 

15.  7(c  - b),  7c  - 7b 

16.  ga  — 3c,  9 (g a — gc) 

17.  (7  - 3)  a,  7a  - 3a 

18.  4a  + 3b,  -4a  - 3b 

19.  What  laws  do  Probs.  12-16  illustrate? 

20.  Prove  Eqs.  (4)  and  (6). 
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FORCES,  VELOCITIES 


26.  Equilibrium.  Find  v such  that  p,  q,  u in  Prob.  21  and 
v are  in  equilibrium. 


27.  Find  p such  that  u,  v,  w in  Prob.  23  and  p are  in 
equilibrium. 


28.  Unit  vector.  Find  the  unit  vector  in  the  direction  of 
the  resultant  in  Prob.  24. 


29.  Restricted  resultant.  Find  all  v such  that  the  resultant 
of  v,  p,  q,  u with  p,  q,  u as  in  Prob.  21  is  parallel  to 
the  xv-plane. 

30.  Find  v such  that  the  resultant  of  p,  q,  u,  v with  p, 
q,  u as  in  Prob.  24  has  no  components  in  x-  and 
v-directions. 

31.  For  what  k is  the  resultant  of  [2,  0,  —7],  [1,  2,  —3],  and 
[0,  3,  k\  parallel  to  the  jcy-plane? 

32.  If  |p|  =6  and  |q|  =4,  what  can  you  say  about  the 
magnitude  and  direction  of  the  resultant?  Can  you  think 
of  an  application  to  robotics? 

33.  Same  question  as  in  Prob.  32  if  |p|  = 9,  |q|  = 6, 
|u|  = 3. 

34.  Relative  velocity.  If  airplanes  A and  B are  moving 
southwest  with  speed  |vaI  = 550  mph,  and  north- 
west with  speed  |vB|  = 450  mph,  respectively,  what 
is  the  relative  velocity  v = vB  — vA  of  B with  respect 
to  A? 


35.  Same  question  as  in  Prob.  34  for  two  ships  moving 
northeast  with  speed  |vaI  =22  knots  and  west  with 
speed  |vB|  = 19  knots. 

36.  Reflection.  If  a ray  of  light  is  reflected  once  in  each 
of  two  mutually  perpendicular  mirrors,  what  can  you 
say  about  the  reflected  ray? 

37.  Force  polygon.  Truss.  Find  the  forces  in  the  system 
of  two  rods  (truss)  in  the  figure,  where  |p|  = 1000  nt. 
Hint.  Forces  in  equilibrium  form  a polygon,  the  force 
polygon. 


21-25 


FORCES,  RESULTANT 


Find  the  resultant  in  terms  of  components  and  its 
magnitude. 


21.  p = [2,  3,  0],  q = [0,  6,  1],  u = [2,  0,  -4] 

22.  p = [1, -2,  3],  q = [3,  21, -16], 
u = [-4,  -19,  13] 


23.  u = [8, -1,0],  v = [g,  0,  §],  w = [-£,l,¥] 

24.  p = [-1,2, -3],  q = [1,1,1],  u = [1,  -2,2] 

25.  u = [3,  1,  -6],  v=  [0,2,5],  w = [3, -1, -13] 


Truss  Force  polygon 


Problem  37 
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38.  TEAM  PROJECT.  Geometric  Applications.  To 

increase  your  skill  in  dealing  with  vectors,  use  vectors 
to  prove  the  following  (see  the  figures). 

(a)  The  diagonals  of  a parallelogram  bisect  each  other. 

(b)  The  line  through  the  midpoints  of  adjacent  sides 
of  a parallelogram  bisects  one  of  the  diagonals  in  the 
ratio  1:3. 

(c)  Obtain  (b)  from  (a). 

(d)  The  three  medians  of  a triangle  (the  segments 
from  a vertex  to  the  midpoint  of  the  opposite  side) 
meet  at  a single  point,  which  divides  the  medians  in 
the  ratio  2:1. 

(e)  The  quadrilateral  whose  vertices  are  the  mid- 
points of  the  sides  of  an  arbitrary  quadrilateral  is  a 
parallelogram. 

(f)  The  four  space  diagonals  of  a parallelepiped  meet 
and  bisect  each  other. 

(g)  The  sum  of  the  vectors  drawn  from  the  center  of 
a regular  polygon  to  its  vertices  is  the  zero  vector. 


Team  Project  38(d) 


A “ 

Team  Project  38(e) 


9.2  Inner  Product  (Dot  Product) 

Orthogonality 

The  inner  product  or  dot  product  can  be  motivated  by  calculating  work  done  by  a constant 
force,  determining  components  of  forces,  or  other  applications.  It  involves  the  length  of 
vectors  and  the  angle  between  them.  The  inner  product  is  a kind  of  multiplication  of  two 
vectors,  defined  in  such  a way  that  the  outcome  is  a scalar.  Indeed,  another  term  for  inner 
product  is  scalar  product,  a term  we  shall  not  use  here.  The  definition  of  the  inner  product 
is  as  follows. 


DEFINITION 


Inner  Product  (Dot  Product)  of  Vectors 

The  inner  product  or  dot  product  a • b (read  “a  dot  b”)  of  two  vectors  a and  b is 
the  product  of  their  lengths  times  the  cosine  of  their  angle  (see  Fig.  178), 


(1) 


a • b = | a|  | b | cos  y 

a • b = 0 


if  a + 0,  b # 0 
if  a = 0 or  b = 0. 


The  angle  y,  0 y Si  7 r,  between  a and  b is  measured  when  the  initial  points  of 
the  vectors  coincide,  as  in  Fig.  178.  In  components,  a = [fly,  a 2,  03],  b = [bi,  h2,  />;], 
and 


(2) 


a • b = aibi  + fl2/j2  + 03^3- 
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THEOREM  1 


EXAMPLE  1 


The  second  line  in  (1)  is  needed  because  y is  undefined  when  a = 0 or  b = 0.  The 
derivation  of  (2)  from  (1)  is  shown  below. 


b b b 

a»b  > 0 a«b  = 0 a«b  < 0 


(orthogonality) 

Fig.  178.  Angle  between  vectors  and  value  of  inner  product 

Orthogonality.  Since  the  cosine  in  (1)  may  be  positive,  0,  or  negative,  so  may  be  the 
inner  product  (Fig.  178).  The  case  that  the  inner  product  is  zero  is  of  particular  practical 
interest  and  suggests  the  following  concept. 

A vector  a is  called  orthogonal  to  a vector  b if  a • b = 0.  Then  b is  also  orthogonal 
to  a,  and  we  call  a and  b orthogonal  vectors.  Clearly,  this  happens  for  nonzero  vectors 
if  and  only  if  cos  y = 0;  thus  y = 7t/2  (90°).  This  proves  the  important 


Orthogonality  Criterion 

The  inner  product  of  two  nonzero  vectors  is  0 if  and  only  if  these  vectors  are 
perpendicular. 


Length  and  Angle.  Equation  (1)  with  b = a gives  a • a = |a|  . Hence 
(3)  |a|  = Va  • a. 

From  (3)  and  (1)  we  obtain  for  the  angle  y between  two  nonzero  vectors 


(4) 


cos  y = 


a • b 


a • b 


alibi  Va  • aVb  • b 


Inner  Product.  Angle  Between  Vectors 

Find  the  inner  product  and  the  lengths  of  a = [1,  2,  0]  and  b = [3,  — 2,  1]  as  well  as  the  angle  between  these 
vectors. 

Solution.  a*b  = 1*3  + 2*  (— 2)  + 0 ■ 1 = —1,  |a|  = Va  • a = V5,  |b|  = Vb  • b = Vl4,  and  (4) 
gives  the  angle 


y = arccos 


a • b 

MW 


= arccos  (-0.1 1952)  = 1.69061  = 96.865°. 
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From  the  definition  we  see  that  the  inner  product  has  the  following  properties.  For  any 
vectors  a,  b,  c and  scalars  q\,  q2. 


(a) 

(gia  + q2b)  • c = q ia  • c + gib  • c 

( Linearity ) 

(5) 

(b) 

a • b = b • a 

( Symmetry ) 

(c) 

a • a g 0 | 

a • a = 0 if  and  only  if  a = 0 J 

(Posi  tive-defin  iteness). 

Hence  dot  multiplication  is  commutative  as  shown  by  (5b).  Furthermore,  it  is  distributive 
with  respect  to  vector  addition.  This  follows  from  (5a)  with  c/j  = 1 and  q2  = 1 : 

(5a*) 

(a  + b)*c  = a*c  + b*c 

(Distributivity). 

Furthermore,  from  (1)  and  |cos  y\  Si  1 we  see  that 
(6)  | a • b | Si  |a||b|  (Cauchy-Schwarz  inequality). 


Using  this  and  (3),  you  may  prove  (see  Prob.  16) 

(7)  a + b|  g |a|  + |b|  (Triangle  inequality). 


Geometrically,  (7)  with  < says  that  one  side  of  a triangle  must  be  shorter  than  the  other 
two  sides  together;  this  motivates  the  name  of  (7). 

A simple  direct  calculation  with  inner  products  shows  that 

(8)  ja  + b|2  + |a  — b!  = 2(|a|2  + |b|2)  ( Parallelogram  equality). 

Equations  (6)-(8)  play  a basic  role  in  so-called  Hilbert  spaces,  which  are  abstract  inner 
product  spaces.  Hilbert  spaces  form  the  basis  of  quantum  mechanics,  for  details  see 
[GenRef7]  listed  in  App.  1. 

Derivation  of  (2)  from  (1).  We  write  a = aii  + a2j  + «3k  and  b = b{\  + hz,\  + />4k, 
as  in  (8)  of  Sec.  9.1.  If  we  substitute  this  into  a • b and  use  (5a*),  we  first  have  a sum  of 
3X3  = 9 products 


a • b = fli^ii  • i + a\b2i  • j + • ■ ■ + a^b^is.  • k. 

Now  i,  j,  k are  unit  vectors,  so  that  i*i=j*j  = k«k  = lby  (3).  Since  the  coordinate 
axes  are  perpendicular,  so  are  i,  j,  k,  and  Theorem  1 implies  that  the  other  six  of  those 
nine  products  are  0,  namely,  i,j=j,i=j,k  = k*j  = k,i  = i,k  = 0.  But  this 
reduces  our  sum  for  a • b to  (2). 
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EXAMPLE  2 


EXAMPLE  3 


Applications  of  Inner  Products 

Typical  applications  of  inner  products  are  shown  in  the  following  examples  and  in 
Problem  Set  9.2. 

Work  Done  by  a Force  Expressed  as  an  Inner  Product 

This  is  a major  application.  It  concerns  a body  on  which  a constant  force  p acts.  (For  a variable  force,  see 
Sec.  10.1.)  Let  the  body  be  given  a displacement  d.  Then  the  work  done  by  p in  the  displacement  is  defined  as 

(9)  W = |p|  | d | cos  a = p • d, 

that  is,  magnitude  | p | of  the  force  times  length  | d | of  the  displacement  times  the  cosine  of  the  angle  a between 
p and  d (Fig.  179).  If  a < 90°,  as  in  Fig.  179,  then  W > 0.  If  p and  d are  orthogonal,  then  the  work  is  zero 
(why?).  If  a > 90°,  then  W < 0,  which  means  that  in  the  displacement  one  has  to  do  work  against  the  force. 
For  example,  think  of  swimming  across  a river  at  some  angle  a against  the  current. 


d 

Fig.  179.  Work  done  by  a force 


Component  of  a Force  in  a Given  Direction 

What  force  in  the  rope  in  Fig.  180  will  hold  a car  of  5000  lb  in  equilibrium  if  the  ramp  makes  an  angle  of  25° 
with  the  horizontal? 

Solution.  Introducing  coordinates  as  shown,  the  weight  is  a = [0,  —5000]  because  this  force  points 
downward,  in  the  negative  y-direction.  We  have  to  represent  a as  a sum  (resultant)  of  two  forces,  a = c + p, 
where  c is  the  force  the  car  exerts  on  the  ramp,  which  is  of  no  interest  to  us,  and  p is  parallel  to  the  rope.  A 
vector  in  the  direction  of  the  rope  is  (see  Fig.  180) 

b = [— 1,  tan  25°]  = [-1,0.46631],  thus  |b|  = 1.10338, 

The  direction  of  the  unit  vector  u is  opposite  to  the  direction  of  the  rope  so  that 


Since  |u| 


1 

u — b = [0.90631,  -0.42262]. 

I b | 


and  cos  y > 0,  we  see  that  we  can  write  our  result  as 


a-b  5000-0.46631 

p|=(|a|cosy)|u|=a.u  = ~=  ^ 


= 2113  [lb]. 


We  can  also  note  that  y = 90°  — 25°  = 65°  is  the  angle  between  a and  p so  that 
|p|  = | a | cos  y = 5000  cos  65°  = 2113  [lb]. 


Answer:  About  2100  lb. 
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EXAMPLE  4 


Example  3 is  typical  of  applications  that  deal  with  the  component  or  projection  of  a 
vector  a in  the  direction  of  a vector  b (=£  0).  If  we  denote  by  p the  length  of  the  orthogonal 
projection  of  a on  a straight  line  / parallel  to  b as  shown  in  Fig.  181,  then 

(10)  p = | a | cos  y. 

Here  p is  taken  with  the  plus  sign  if  pb  has  the  direction  of  b and  with  the  minus  sign  if 
pb  has  the  direction  opposite  to  b. 


Fig.  181.  Component  of  a vector  a in  the  direction  of  a vector  b 


Multiplying  (10)  by  |b|/|b|  = 1,  we  have  a • b in  the  numerator  and  thus 


(ID 


P = 


a • b 

TbT 


(b  + 0). 


If  b is  a unit  vector,  as  it  is  often  used  for  fixing  a direction,  then  (11)  simply  gives 

(12)  p = a-b  (|b|  = 1). 

Figure  182  shows  the  projection  p of  a in  the  direction  of  b (as  in  Fig.  181)  and  the 
projection  q = |b|  cos  y of  b in  the  direction  of  a. 


a 


P 

Fig.  182.  Projections  p of  a on  b and  q of  b on  a 


Orthonormal  Basis 

By  definition,  an  orthonormal  basis  for  3-space  is  a basis  {a,  b,  c}  consisting  of  orthogonal  unit  vectors.  It  has 
the  great  advantage  that  the  determination  of  the  coefficients  in  representations  v = /j_a  + /2b  + /3C  of  a given 
vector  v is  very  simple.  We  claim  that  /1  = a • v,  I2  = b • v,  /3  — c • v.  Indeed,  this  follows  simply  by  taking 
the  inner  products  of  the  representation  with  a,  b,  c,  respectively,  and  using  the  orthonormality  of  the  basis, 
a • v = /]_a  • a + /2a  • b + /3a  • c = /1,  etc. 

For  example,  the  unit  vectors  i,  j,  k in  (8),  Sec.  9.1,  associated  with  a Cartesian  coordinate  system  form  an 
orthonormal  basis,  called  the  standard  basis  with  respect  to  the  given  coordinate  system. 
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EXAMPLE  5 


EXAMPLE  6 


Orthogonal  Straight  Lines  in  the  Plane 

Find  the  straight  line  Li  through  the  point  P:  (1,  3)  in  the  xy-plane  and  perpendicular  to  the  straight  line 
L<l  \x  — 2y  + 2 = 0;  see  Fig.  183. 

Solution.  The  idea  is  to  write  a general  straight  line  Li : ci±x  + a^y  = casa*r  = c with  a = [ai,  =£  0 
and  r = [x,  y],  according  to  (2).  Now  the  line  L*  through  the  origin  and  parallel  to  Li  is  a • r = 0.  Hence,  by 
Theorem  1,  the  vector  a is  perpendicular  to  r.  Hence  it  is  perpendicular  to  L\  and  also  to  Li  because  Li  and 
L*  are  parallel,  a is  called  a normal  vector  of  Li  (and  of  L*). 

Now  a normal  vector  of  the  given  line  x — 2y  + 2 = 0 is  b = [1,-2].  Thus  Li  is  perpendicular  to  L2 
if  b • a = fli  — 2^2  — 0,  for  instance,  if  a = [2,  1].  Hence  L\  is  given  by  2x  + y = c.  It  passes  through 
P:  (1, 3)  when  2 - 1 + 3 = c = 5.  Answer:  y = —2x  + 5.  Show  that  the  point  of  intersection  is 
(. x,y ) = (1.6,  1.8). 


Normal  Vector  to  a Plane 

Find  a unit  vector  perpendicular  to  the  plane  4x  + 2y  + 4z  = —7. 

Solution.  Using  (2),  we  may  write  any  plane  in  space  as 

(13)  a • r = a±x  + #2 y + a3 z — c 

where  a = \a\,  a2,  a 3]  ^ 0 and  r = [x,  y,  z\.  The  unit  vector  in  the  direction  of  a is  (Fig.  184) 

1 

11  = — a. 

|a| 


Dividing  by  |a|,  we  obtain  from  (13) 

c 

(14)  n • r = p where  p = — . 

|a| 

From  (12)  we  see  that  p is  the  projection  of  r in  the  direction  of  n.  This  projection  has  the  same  constant  value 
c/|a|  for  the  position  vector  r of  any  point  in  the  plane.  Clearly  this  holds  if  and  only  if  n is  perpendicular  to 
the  plane,  n is  called  a unit  normal  vector  of  the  plane  (the  other  being  — n). 

Furthermore,  from  this  and  the  definition  of  projection,  it  follows  that  \p\  is  the  distance  of  the  plane  from 
the  origin.  Representation  (14)  is  called  Hesse’s2  normal  form  of  a plane.  In  our  case,  a = [4,  2,  4],  c = —7, 
| a | = 6,  n = ga  = [3,3,5],  and  the  plane  has  the  distance  g from  the  origin. 


2LUDWIG  OTTO  HESSE  (1811-1874),  German  mathematician  who  contributed  to  the  theory  of  curves  and 
surfaces. 
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1-10 


Let  a = [1,  -3,5] 
Find: 


INNER  PRODUCT 

b = [4,  0,  8], 


c = [-2,9,  1], 


1.  a • b,  b • a,  b • c 

2.  (—3a  + 5c)  • b,  15(a  - c)  • b 

3.  |a|,  1 2b | , | -c| 

4.  |a  + b | , |a|  + |b| 

5.  |b  + c|,  |b|  + |c| 

6.  |a  + c|2  + |a  - c|2  - 2(|a|2  + |c|2) 

7.  |a  • c|,  |a| |c| 

8.  5a  • 13b,  65a  • b 

9.  15a  • b + 15a  • c,  15a  • (b  + c) 

10.  a • (b  - c),  (a  - b)  • c 


11-16 


GENERAL  PROBLEMS 


11.  What  laws  do  Probs.  1 and  4-7  illustrate? 

12.  What  does  u • v = u • w imply  if  u = 0?  If  u # 0? 

13.  Prove  the  Cauchy-Schwarz  inequality. 

14.  Verify  the  Cauchy-Schwarz  and  triangle  inequalities 
for  the  above  a and  b. 


15.  Prove  the  parallelogram  equality.  Explain  its  name. 

16.  Triangle  inequality.  Prove  Eq.  (7).  Hint.  Use  Eq.  (3) 
for  | a + b | and  Eq.  (6)  to  prove  the  square  of  Eq.  (7), 
then  take  roots. 


17-20 


WORK 


Find  the  work  done  by  a force  p acting  on  a body  if  the 
body  is  displaced  along  the  straight  segment  AB  from  A to 
B.  Sketch  AB  and  p.  Show  the  details. 

17.  p = [2,  5,  0],  A:  (1,3,  3),  B:  ( 3,5,5) 

18.  p = [-1,  -2,4],  A:  (0,0,0),  B:  (6,  7,  5) 

19.  p = [0,  4,  3],  A:  (4,  5,-1),  B:  (1,3,0) 

20.  p = [6, -3, -3],  A:  (1,5,  2),  B:  (3,  4,  1) 


21.  Resultant.  Is  the  work  done  by  the  resultant  of  two 
forces  in  a displacement  the  sum  of  the  work  done 
by  each  of  the  forces  separately?  Give  proof  or 
counterexample. 


22-30 


ANGLE  BETWEEN  VECTORS 


Let  a = [1,  1,  0],  b = [3,  2,  1],  and  c = [1,  0,  2],  Find  the 
angle  between: 

22.  a,  b 

23.  b,  c 

24.  a + c,  b + c 


25.  What  will  happen  to  the  angle  in  Prob.  24  if  we  replace 
c by  nc  with  larger  and  larger  nl 

26.  Cosine  law.  Deduce  the  law  of  cosines  by  using 
vectors  a,  b,  and  a — b. 

27.  Addition  law.  cos  ( a — f3)  = cos  a cos  /3  + sin  a 
sin  /3.  Obtain  this  by  using  a = [cos  a,  sin  a], 
b = [cos  /S,  sin  /3]  where  0 S a S /3  S 27 r. 

28.  Triangle.  Find  the  angles  of  the  triangle  with  vertices 
A:  (0,  0,  2),  B:  (3,  0,  2),  and  C:  (1,  1,  1).  Sketch  the 
triangle. 

29.  Parallelogram.  Find  the  angles  if  the  vertices  are 
(0,  0),  (6,  0),  (8,  3),  and  (2,  3). 

30.  Distance.  Find  the  distance  of  the  point  A:  (1,  0,  2) 
from  the  plane  P:  3x  + y + z — 9.  Make  a sketch. 


31-35  ORTHOGONALITY  is  particularly  important, 
mainly  because  of  orthogonal  coordinates,  such  as  Cartesian 
coordinates,  whose  natural  basis  [Eq.  (9),  Sec.  9.1],  consists 
of  three  orthogonal  unit  vectors. 


31.  For  what  values  of  a\  are  [«i,  4,  3]  and  [3,  —2,  12] 
orthogonal? 

32.  Planes.  For  what  c are  3jc  + z = 5 and  8 x — y + 
cz  — 9 orthogonal? 

33.  Unit  vectors.  Find  all  unit  vectors  a = [a^  a2\  in  the 
plane  orthogonal  to  [4,  3]. 

34.  Corner  reflector.  Find  the  angle  between  a light  ray 
and  its  reflection  in  three  orthogonal  plane  mirrors, 
known  as  corner  reflector. 

35.  Parallelogram.  When  will  the  diagonals  be  ortho- 
gonal? Give  a proof. 


36-40 


COMPONENT  IN  THE  DIRECTION 
OF  A VECTOR 


Find  the  component  of  a in  the  direction  of  b.  Make  a 

sketch. 

36.  a = [1,  1,  1],  b = [2,  1,  3] 

37.  a = [3,  4,  0],  b = [4,  -3,  2] 

38.  a = [8,  2,  0],  b = [-4, -1,0] 

39.  When  will  the  component  (the  projection)  of  a in  the 
direction  of  b be  equal  to  the  component  (the 
projection)  of  b in  the  direction  of  a?  First  guess. 

40.  What  happens  to  the  component  of  a in  the  direction 
of  b if  you  change  the  length  of  b? 
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9.3  Vector  Product  (Cross  Product) 

We  shall  define  another  form  of  multiplication  of  vectors,  inspired  by  applications,  whose 
result  will  be  a vector.  This  is  in  contrast  to  the  dot  product  of  Sec.  9.2  where  multiplication 
resulted  in  a scalar.  We  can  construct  a vector  v that  is  perpendicular  to  two  vectors  a 
and  b,  which  are  two  sides  of  a parallelogram  on  a plane  in  space  as  indicated  in  Fig.  185, 
such  that  the  length  |v|  is  numerically  equal  to  the  area  of  that  parallelogram.  Here  then 
is  the  new  concept. 


DEFINITION 


Vector  Product  (Cross  Product,  Outer  Product)  of  Vectors 

The  vector  product  or  cross  product  a x b (read  “a  cross  b”)  of  two  vectors  a 
and  b is  the  vector  v denoted  by 

v = a x b 

I.  If  a = 0 or  b = 0,  then  we  define  v = a x b = 0. 

II.  If  both  vectors  are  nonzero  vectors,  then  vector  v has  the  length 

(1)  |v|  = |a  x b|  = |a||b|  sin  y, 

where  y is  the  angle  between  a and  b as  in  Sec.  9.2. 

Furthermore,  by  design,  a and  b form  the  sides  of  a parallelogram  on  a plane 
in  space.  The  parallelogram  is  shaded  in  blue  in  Fig.  185.  The  area  of  this  blue 
parallelogram  is  precisely  given  by  Eq.  (1),  so  that  the  length  |v|  of  the  vector 
v is  equal  to  the  area  of  that  parallelogram. 

III.  If  a and  b lie  in  the  same  straight  line,  i.e.,  a and  b have  the  same  or  opposite 
directions,  then  y is  0°  or  180°  so  that  sin  y = 0.  In  that  case  |v|  = 0 so  that 
v = a x b = 0. 

IV.  If  cases  I and  III  do  not  occur,  then  v is  a nonzero  vector.  The  direction  of 
v = a X b is  perpendicular  to  both  a and  b such  that  a,  b,  v — precisely  in  this 
order  (!) — form  a right-handed  triple  as  shown  in  Figs.  185-187  and  explained 
below. 

Another  term  for  vector  product  is  outer  product. 


Remark.  Note  that  I and  III  completely  characterize  the  exceptional  case  when  the  cross 
product  is  equal  to  the  zero  vector,  and  II  and  IV  the  regular  case  where  the  cross  product 
is  perpendicular  to  two  vectors. 

lust  as  we  did  with  the  dot  product,  we  would  also  like  to  express  the  cross  product  in 
components.  Let  a = [ay,  az,  £73]  and  b = [£q,  bz,  £>3].  Then  v = [iq,  Vz,  U3]  = a X b has 
the  components 

(2)  Vi  = a2bz  ~ a%bz,  Vz  = a->,b\  — a\h->„  v->,  = a \ b2  — (izb\ ■ 

Here  the  Cartesian  coordinate  system  is  right-handed,  as  explained  below  (see  also 
Fig.  188).  (For  a left-handed  system,  each  component  of  v must  be  multiplied  by  — 1. 
Derivation  of  (2)  in  App.  4.) 
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Right-Handed  Triple.  A triple  of  vectors  a,  b,  v is  right-handed  if  the  vectors  in  the 
given  order  assume  the  same  sort  of  orientation  as  the  thumb,  index  finger,  and  middle 
finger  of  the  right  hand  when  these  are  held  as  in  Fig.  186.  We  may  also  say  that  if  a is 
rotated  into  the  direction  of  b through  the  angle  y (<7T),  then  v advances  in  the  same 
direction  as  a right-handed  screw  would  if  turned  in  the  same  way  (Fig.  187). 


Right-Handed  Cartesian  Coordinate  System.  The  system  is  called  right-handed  if 

the  corresponding  unit  vectors  i,  j,  k in  the  positive  directions  of  the  axes  (see  Sec.  9.1) 
form  a right-handed  triple  as  in  Fig.  188a.  The  system  is  called  left-handed  if  the  sense 
of  k is  reversed,  as  in  Fig.  188b.  In  applications,  we  prefer  right-handed  systems. 


(a)  Right-handed 


z 

(b)  Left-handed 


Fig.  188.  The  two  types  of  Cartesian  coordinate  systems 


How  to  Memorize  (2).  If  you  know  second-  and  third-order  determinants,  you  see  that 
(2)  can  be  written 


a* 

«3 

ai 

«3 

03 

d\ 

a\ 

Cl2 

, v2=  ~ 

= + 

, V3  = 

^2 

t>3 

h 

b3 

t>3 

h 

^2 

(2*)  V! 
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and  v = [i>i,  i>2,  U:sl  = U]i  + v2\  + t'.-jk  is  the  expansion  of  the  following  symbolic 
determinant  by  its  first  row.  (We  call  the  determinant  “symbolic”  because  the  first  row 
consists  of  vectors  rather  than  of  numbers.) 


i 

j 

k 

a 2 

a3 

at 

a3 

fli 

v = a x b = 

at 

«2 

a3 

= 

i — 

j + 

^2 

b3 

b\ 

b3 

b\ 

t>2 

b\ 

b3 

t>3 

For  a left-handed  system  the  determinant  has  a minus  sign  in  front. 

EXAMPLE  Vector  Product 

For  the  vector  product  v = a X b of  a = [1,  1,0]  and  b = [3,  0,  0]  in  right-handed  coordinates  we  obtain 
from  (2) 


Vi  = 0,  v2  = 0, 


v3  = 1 ■ 0 - 1 ■ 3 = -3. 


We  confirm  this  by  (2**): 


v 


a x b 


i j k 


1 0 

1 0 

1 1 

1 1 0 

= 

i — 

j + 

0 0 

3 0 

3 0 

3 0 0 


-3k  = [0,  0,  -3], 


To  check  the  result  in  this  simple  case,  sketch  a,  b,  and  v.  Can  you  see  that  two  vectors  in  the  xy-plane  must 
always  have  their  vector  product  parallel  to  the  --axis  (or  equal  to  the  zero  vector)? 


Vector  Products  of  the  Standard  Basis  Vectors 


(3) 


i x j = k,  j x k = i. 

j x i = -k,  k x j = -i. 


We  shall  use  this  in  the  next  proof. 


k X i = j 
i X k = -j. 


THEOREM  1 


Fig.  189. 

Anticommutativity 
of  cross 
multiplication 


General  Properties  of  Vector  Products 

(a)  For  every  scalar  l, 

(4)  (/a)  x b = /(a  x b)  = a x (lb). 

(b)  Cross  multiplication  is  distributive  with  respect  to  vector  addition;  that  is, 

(a)  a x (b  + c)  = (a  x b)  4-  (a  x c), 

(5) 

(J3)  (a  + b)  x c = (a  x c)  + (b  x c). 

(c)  Cross  multiplication  is  not  commutative  but  anticommutative;  that  is, 

(6)  b x a = -(a  x b)  (Fig.  189). 
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PROOF 


EXAMPLE  3 


(d)  Cross  multiplication  is  not  associative',  that  is,  in  general, 
(7)  a x (b  x c)  ¥=  (a  x b)  x c 

so  that  the  parentheses  cannot  be  omitted. 


Equation  (4)  follows  directly  from  the  definition.  In  (5a),  formula  (2*)  gives  for  the  first 
component  on  the  left 


«2  «3 

b2  + c2  b3  + c3 


o2(^3  + c3)  - a3(b2  + c2) 


= (a2b3  “ a3bZ)  + (o2c3  - a3c2) 


a2 

a3 

Cl2 

«3 

+ 

b3 

C2 

C3 

By  (2*)  the  sum  of  the  two  determinants  is  the  first  component  of  (a  X b)  + (a  X c),  the 
right  side  of  (5a).  For  the  other  components  in  (5a)  and  in  5(/3),  equality  follows  by  the 
same  idea. 

Anticommutativity  (6)  follows  from  (2**)  by  noting  that  the  interchange  of  Rows  2 
and  3 multiplies  the  determinant  by  —1.  We  can  confirm  this  geometrically  if  we  set 
a X b = v and  b X a = w;  then  | v|  = | w|  by  (1),  and  for  b,  a,  w to  form  a right-handed 
triple,  we  must  have  w = — v. 

Finally,  i X (i  X j)  = i X k = — j,  whereas  (ixi)xj=0xj  = 0 (see  Example  2). 
This  proves  (7). 

Typical  Applications  of  Vector  Products 

Moment  of  a Force 

In  mechanics  the  moment  m of  a force  p about  a point  Q is  defined  as  the  product  m = |p| d,  where  d is  the 
(perpendicular)  distance  between  Q and  the  line  of  action  L of  p (Fig.  190).  If  r is  the  vector  from  Q to  any 
point  A on  L,  then  d = |r|  sin  y,  as  shown  in  Fig.  190,  and 

m = | r 1 1 p|  sin  y. 

Since  y is  the  angle  between  r and  p,  we  see  from  (1)  that  m = |r  X p|.  The  vector 

(8)  m = r x p 


is  called  the  moment  vector  or  vector  moment  of  p about  Q.  Its  magnitude  is  m.  If  m ^ 0,  its  direction  is 
that  of  the  axis  of  the  rotation  about  Q that  p has  the  tendency  to  produce.  This  axis  is  perpendicular  to  both 
r and  p. 


Fig.  190.  Moment  of  a force  p 
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EXAMPLE  4 


EXAMPLE  5 


Moment  of  a Force 

Find  the  moment  of  the  force  p about  the  center  Q of  a wheel,  as  given  in  Fig.  191. 

Solution.  Introducing  coordinates  as  shown  in  Fig.  191,  we  have 

p = [1000  cos  30°,  1000  sin  30°,  0]  = [866,  500,  0],  r = [0,  1.5,  0]. 
(Note  that  the  center  of  the  wheel  is  at  y — —1.5  on  the  y-axis.)  Hence  (8)  and  (2**)  give 


k = [0,0,  -1299]. 


i 

j 

k 

0 

1.5 

m = r x p = 

0 

1.5 

0 

= 0i  - Oj  + 

866 

500 

866 

500 

0 

This  moment  vector  m is  normal,  i.e.,  perpendicular  to  the  plane  of  the  wheel.  Hence  it  has  the  direction  of  the 
axis  of  rotation  about  the  center  Q of  the  wheel  that  the  force  p has  the  tendency  to  produce.  The  moment  m 
points  in  the  negative  z-direction,  This  is,  the  direction  in  which  a right-handed  screw  would  advance  if  turned 
in  that  way. 


Velocity  of  a Rotating  Body 

A rotation  of  a rigid  body  B in  space  can  be  simply  and  uniquely  described  by  a vector  w as  follows.  The 
direction  of  w is  that  of  the  axis  of  rotation  and  such  that  the  rotation  appears  clockwise  if  one  looks  from  the 
initial  point  of  w to  its  terminal  point.  The  length  of  w is  equal  to  the  angular  speed  cu(>0)  of  the  rotation, 
that  is,  the  linear  (or  tangential)  speed  of  a point  of  B divided  by  its  distance  from  the  axis  of  rotation. 

Let  P be  any  point  of  B and  cl  its  distance  from  the  axis.  Then  P has  the  speed  cod.  Let  r be  the  position  vector 
of  P referred  to  a coordinate  system  with  origin  0 on  the  axis  of  rotation.  Then  d = |r|  sin  y,  where  y is  the 
angle  between  w and  r.  Therefore, 

cod  = | w 1 1 r | sin  y = |w  x r|. 

From  this  and  the  definition  of  vector  product  we  see  that  the  velocity  vector  v of  P can  be  represented  in  the 
form  (Fig.  192) 

(9)  v = w X r. 

This  simple  formula  is  useful  for  determining  v at  any  point  of  B. 


Fig.  192.  Rotation  of  a rigid  body 
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THEOREM  2 


PROOF 


Scalar  Triple  Product 

Certain  products  of  vectors,  having  three  or  more  factors,  occur  in  applications.  The  most 
important  of  these  products  is  the  scalar  triple  product  or  mixed  product  of  three  vectors 

a,  b,  c. 

(10*)  (a  b c)  = a • (b  x c). 

The  scalar  triple  product  is  indeed  a scalar  since  (10*)  involves  a dot  product,  which  in 
turn  is  a scalar.  We  want  to  express  the  scalar  triple  product  in  components  and  as  a third- 
order  determinant.  To  this  end,  let  a = [a±,  a2,  03],  b = [/q,  b2,  £>3],  and  c = [ci,  C2,  C3]. 
Also  set  b X c = v = [i>i,  v2,  U3].  Then  from  the  dot  product  in  components  [formula 
(2)  in  Sec.  9.2]  and  from  (2*)  with  b and  c instead  of  a and  b we  first  obtain 

a • (b  X c)  = a • v = a\V\  + a2v2  + 03^3 


b2  />>, 

b3  bx 

bx  b2 

fll 

+ a2 

+ «3 

C2  C3 

C3  Ci 

ci  c2 

The  sum  on  the  right  is  the  expansion  of  a third-order  determinant  by  its  first  row.  Thus 
we  obtain  the  desired  formula  for  the  scalar  triple  product,  that  is, 


(10) 


(a  b c)  = a • (b  x c)  = 


Clg  CI3 

b\  b2  b2l 

Cl  c2  C3 


The  most  important  properties  of  the  scalar  triple  product  are  as  follows. 


Properties  and  Applications  of  Scalar  Triple  Products 

(a)  In  (10)  the  dot  and  cross  can  be  interchanged: 

(11)  (a  b c)  = a • (b  x c)  = (a  x b)  • c. 

(b)  Geometric  interpretation.  The  absolute  value  | (a  b c)|  of  (10)  is  the 
volume  of  the  parallelepiped  (oblique  box)  with  a,  b,  c as  edge  vectors  (Fig.  193). 

(c)  Linear  independence.  Three  vectors  in  R3  are  linearly  independent  if 
and  only  if  their  scalar  triple  product  is  not  zero. 


(a)  Dot  multiplication  is  commutative,  so  that  by  (10) 


(a  x b)  • c = c • (a  x b) 


Cl 

C2 

C3 

Cl\ 

#2 

«3 

bx 

^2 

b3 
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From  this  we  obtain  the  determinant  in  (10)  by  interchanging  Rows  1 and  2 and  in  the 
result  Rows  2 and  3.  But  this  does  not  change  the  value  of  the  determinant  because  each 
interchange  produces  a factor  — 1,  and  (— 1)(— 1)  = 1.  This  proves  (11). 

(b)  The  volume  of  that  box  equals  the  height  h = a |cos  y (Fig.  193)  times  the  area 
of  the  base,  which  is  the  area  |b  X c|  of  the  parallelogram  with  sides  b and  c.  Hence  the 
volume  is 


|a||b  x c|  |cosy|  = | a • (b  x c)|  (Fig.  193) 

as  given  by  the  absolute  value  of  (11). 

(c)  Three  nonzero  vectors,  whose  initial  points  coincide,  are  linearly  independent  if  and 
only  if  the  vectors  do  not  lie  in  the  same  plane  nor  lie  on  the  same  straight  line. 

This  happens  if  and  only  if  the  triple  product  in  (b)  is  not  zero,  so  that  the  independence 
criterion  follows.  (The  case  of  one  of  the  vectors  being  the  zero  vector  is  trivial.) 


Fig.  193.  Geometric  interpretation  of  a scalar  triple  product 


EXAMPLE  6 Tetrahedron 

A tetrahedron  is  determined  by  three  edge  vectors  a,  b,  c,  as  indicated  in  Fig.  194.  Find  the  volume  of  the  tetrahedron 
in  Fig.  194,  when  a = [2,  0,  3],  b = [0,  4,  1],  c = [5,  6,  0]. 

Solution.  The  volume  V of  the  parallelepiped  with  these  vectors  as  edge  vectors  is  the  absolute  value  of  the 
scalar  triple  product 


(a  b c)  = 


2 0 3 


4 1 

0 4 

0 4 1 

= 2 

+ 3 

6 0 

5 6 

5 6 0 


= -12  - 60  = -72. 


Hence  V = 72.  The  minus  sign  indicates  that  if  the  coordinates  are  right-handed,  the  triple  a,  b,  c is  left-handed. 
The  volume  of  a tetrahedron  is  g of  that  of  the  parallelepiped  (can  you  prove  it?),  hence  12. 

Can  you  sketch  the  tetrahedron,  choosing  the  origin  as  the  common  initial  point  of  the  vectors?  What  are  the 
coordinates  of  the  four  vertices? 


3 

This  is  the  end  of  vector  algebra  (in  space  R and  in  the  plane).  Vector  calculus 
(differentiation)  begins  in  the  next  section. 


GENERAL  PROBLEMS 

1.  Give  the  details  of  the  proofs  of  Eqs.  (4)  and  (5). 

2.  What  does  a x b = a x c with  a A 0 imply? 

3.  Give  the  details  of  the  proofs  of  Eqs.  (6)  and  (11). 


4.  Lagrange’s  identity  for  |a  x b|.  Verify  it  for 
a = [3,  4,  2]  and  b = [1,  0,  2].  Prove  it,  using 
sin2  y = 1 — cos2  y.  The  identity  is 


(12)  | a x b|  = V(a  • a)  (b  • b)  - (a  • b)2. 
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5.  What  happens  in  Example  3 of  the  text  if  you  replace 
p by  -p? 

6.  What  happens  in  Example  5 if  you  choose  a P at 
distance  2d  from  the  axis  of  rotation? 


7.  Rotation.  A wheel  is  rotating  about  the  y-axis  with 
angular  speed  a>  = 20  sec-1.  The  rotation  appears 
clockwise  if  one  looks  from  the  origin  in  the  positive 
y-direction.  Find  the  velocity  and  speed  at  the  point 
[8,  6,  0],  Make  a sketch. 

8.  Rotation.  What  are  the  velocity  and  speed  in  Prob.  7 
at  the  point  (4,  2,  —2)  if  the  wheel  rotates  about  the 
line  y = x,  z — 0 with  co  = 10  sec-1? 

9.  Scalar  triple  product.  What  does  (a  b c)  = 0 imply 
with  respect  to  these  vectors? 

10.  WRITING  REPORT.  Summarize  the  most  important 
applications  discussed  in  this  section.  Give  examples. 
No  proofs. 


11-23 


VECTOR  AND  SCALAR 
TRIPLE  PRODUCTS 


With  respect  to  right-handed  Cartesian  coordinates, 
let  a = [2,1,0],  b =[-3,2,0],  c = 1 1 , 4,  -2],  and 
d = [5,  —1,  3].  Showing  details,  find: 

11.  a x b,  b x a,  a • b 

12.  3c  x 5d,  15d  x c,  15d  • c,  15c  • d 

13.  c x (a  + b),  a x c + b x c 

14.  4b  x 3c  + 12c  x b 

15.  (a  + d)  x (d  + a) 

16.  (b  x c)  • d.  b • (c  x d) 

17.  (b  x c)  x d,  b x (c  x d) 

18.  (a  x b)  x a,  a x (b  x a) 

19.  (i  j k),  (i  k j) 

20.  (a  x b)  x (c  x d),  (a  b d)c  — (a  b c)d 

21.  4b  x 3c,  12|b  x c|,  12|c  x b| 

22.  (a  - b c - b d - b),  (a  c d) 

23.  b x b,  (b  — c)  x (c  — b),  b • b 

24.  TEAM  PROJECT.  Useful  Formulas  for  Three  and 
Four  Vectors.  Prove  (13)— (16),  which  are  often  useful 
in  practical  work,  and  illustrate  each  formula  with  two 


examples.  Hint.  For  (13)  choose  Cartesian  coordinates 
such  that  d = [d\,  0,  0]  and  c = [cq,  c2,  0].  Show  that 
each  side  of  (13)  then  equals  [— b\C2d\,  0],  and 
give  reasons  why  the  two  sides  are  then  equal  in  any 
Cartesian  coordinate  system.  For  (14)  and  (15)  use  (13). 

(13)  b x (c  x d)  = (b  • d)c  - (b  • c)d 

(14)  (a  x b)  x (c  x d)  = (a  b d)c  - (a  b c)d 


(15)  (a  x b)  • (c  x d)  = (a  • c)(b  • d)  - (a  • d)(b  • c) 

(16)  (a  b c)  = (b  c a)  = (c  a b) 


25-35 


= — (c  b a)  = —(a  c b) 

APPLICATIONS 


25.  Moment  m of  a force  p.  Find  the  moment  vector  m 
and  m of  p = [2,  3,  0]  about  Q:  (2,  1,  0)  acting  on  a 
line  through  A:  (0,  3,  0).  Make  a sketch. 

26.  Moment.  Solve  Prob.  25  if  p = [1,  0,  3],  Q:  (2,  0,  3), 
and  A:  (4,3,5). 

27.  Parallelogram.  Find  the  area  if  the  vertices  are  (4,  2, 
0),  (10,  4,  0),  (5,  4,  0),  and  (11,  6,  0).  Make  a sketch. 

28.  A remarkable  parallelogram.  Find  the  area  of  the 
quadrangle  Q whose  vertices  are  the  midpoints  of 
the  sides  of  the  quadrangle  P with  vertices  A:  (2,  1,  0), 
B:  (5,-1.  0),  C:  (8,  2,  0),  and  D:  (4,  3,  0).  Verify  that 
Q is  a parallelogram. 

29.  Triangle.  Find  the  area  if  the  vertices  are  (0,  0,  1), 
(2,  0,  5),  and  (2,  3,  4). 

30.  Plane.  Find  the  plane  through  the  points  A:  (1,  2,  4), 
B:  (4,  2,  -2),  and  C:  (0,  8,  4). 

31.  Plane.  Find  the  plane  through  (1,  3,  4),  (1,  —2,  6),  and 
(4,  0,  7). 

32.  Parallelepiped.  Find  the  volume  if  the  edge  vectors 
are  i + j,  — 2i  + 2k,  and  — 2i  — 3k.  Make  a sketch. 

33.  Tetrahedron.  Find  the  volume  if  the  vertices  are 
(1,  1,  1),  (5,  -7,  3),  (7,  4,  8),  and  (10,  7,  4). 

34.  Tetrahedron.  Find  the  volume  if  the  vertices  are 
(1,  3,  6),  (3,  7,  12),  (8,  8,  9),  and  (2,  2,  8). 

35.  WRITING  PROJECT.  Applications  of  Cross 
Products.  Summarize  the  most  important  applications 
we  have  discussed  in  this  section  and  give  a few  simple 
examples.  No  proofs. 


9.4  Vector  and  Scalar  Functions  and  Their  Fields. 
Vector  Calculus:  Derivatives 


Our  discussion  of  vector  calculus  begins  with  identifying  the  two  types  of  functions  on  which 
it  operates.  Let  P be  any  point  in  a domain  of  definition.  Typical  domains  in  applications 
are  three-dimensional,  or  a surface  or  a curve  in  space.  Then  we  define  a vector  function 
v,  whose  values  are  vectors,  that  is, 


v = y(P)  = [Vl(P),  v2(P),  v3{P)} 
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that  depends  on  points  P in  space.  We  say  that  a vector  function  defines  a vector  field  in 
a domain  of  definition.  Typical  domains  were  just  mentioned.  Examples  of  vector  fields 
are  the  field  of  tangent  vectors  of  a curve  (shown  in  Fig.  195),  normal  vectors  of  a surface 
(Fig.  196),  and  velocity  field  of  a rotating  body  (Fig.  197).  Note  that  vector  functions  may 
also  depend  on  time  t or  on  some  other  parameters. 

Similarly,  we  define  a scalar  function  /,  whose  values  are  scalars,  that  is, 

f=m 

that  depends  on  F.  We  say  that  a scalar  function  defines  a scalar  field  in  that  three- 
dimensional  domain  or  surface  or  curve  in  space.  Two  representative  examples  of  scalar 
fields  are  the  temperature  field  of  a body  and  the  pressure  field  of  the  air  in  Earth’s 
atmosphere.  Note  that  scalar  functions  may  also  depend  on  some  parameter  such  as 
time  t. 

Notation.  If  we  introduce  Cartesian  coordinates  x,  y,  z,  then,  instead  of  writing  \(P)  for 
the  vector  function,  we  can  write 

\(x,  y,  z)  = [ui(x,  y,  z),  v2(x,  y,  z),  v3(x,  y,  z)]. 


Fig.  195.  Field  of  tangent 
vectors  of  a curve 


Fig.  196.  Field  of  normal 
vectors  of  a surface 


We  have  to  keep  in  mind  that  the  components  depend  on  our  choice  of  coordinate  system, 
whereas  a vector  field  that  has  a physical  or  geometric  meaning  should  have  magnitude 
and  direction  depending  only  on  P,  not  on  the  choice  of  coordinate  system. 

Similarly,  for  a scalar  function,  we  write 

m = fix,  y,  Z). 

We  illustrate  our  discussion  of  vector  functions,  scalar  functions,  vector  fields,  and  scalar 
fields  by  the  following  three  examples. 

Scalar  Function  (Euclidean  Distance  in  Space) 

The  distance /(P)  of  any  point  P from  a fixed  point  Pq  in  space  is  a scalar  function  whose  domain  of  definition 
is  the  whole  space.  /(P)  defines  a scalar  field  in  space.  If  we  introduce  a Cartesian  coordinate  system  and  Pq 
has  the  coordinates  jcq,  yo>  Zo>  then /is  given  by  the  well-known  formula 

f(P)  = f(x , 3-,  z)  = V(x  - x0f  + (y  - y0f  + {z  - z0f 

where  x,  y , z are  the  coordinates  of  P.  If  we  replace  the  given  Cartesian  coordinate  system  with  another  such 
system  by  translating  and  rotating  the  given  system,  then  the  values  of  the  coordinates  of  P and  Pq  will  in  general 
change,  but  /(P)  will  have  the  same  value  as  before.  Hence /(P)  is  a scalar  function.  The  direction  cosines  of 
the  straight  line  through  P and  Pq  are  not  scalars  because  their  values  depend  on  the  choice  of  the  coordinate 
system. 
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EXAMPLE  2 


EXAMPLE  3 


Vector  Field  (Velocity  Field) 

At  any  instant  the  velocity  vectors  \(P)  of  a rotating  body  B constitute  a vector  field,  called  the  velocity  field 
of  the  rotation.  If  we  introduce  a Cartesian  coordinate  system  having  the  origin  on  the  axis  of  rotation,  then  (see 
Example  5 in  Sec.  9.3) 


(1)  v(x,  y,  z)  = w x r = w x [x,  y,  z]  = w x (xi  + yj  + jk) 

where  x,  y,  z are  the  coordinates  of  any  point  P of  B at  the  instant  under  consideration.  If  the  coordinates  are 
such  that  the  z-axis  is  the  axis  of  rotation  and  w points  in  the  positive  ^-direction,  then  w = ink  and 


= to[-y,  x,  0]  = &)(— yi  + xj). 


j k 

0 0 w 

x y z 

An  example  of  a rotating  body  and  the  corresponding  velocity  field  are  shown  in  Fig.  197. 


& 


Fig.  197.  Velocity  field  of  a rotating  body 

Vector  Field  (Field  of  Force,  Gravitational  Field) 

Let  a particle  A of  mass  M be  fixed  at  a point  Pq  and  let  a particle  B of  mass  m be  free  to  take  up  various  positions 
P in  space.  Then  A attracts  B.  According  to  Newton’s  law  of  gravitation  the  corresponding  gravitational  force  p 
is  directed  from  P to  Pq,  and  its  magnitude  is  proportional  to  1/r2 , where  r is  the  distance  between  P and  Pq,  say, 


(2) 


IpI  = • 


c = GMm. 


Here  G = 6.67  • 10-8cm3/(g  • sec2)  is  the  gravitational  constant.  Hence  p defines  a vector  field  in  space.  If 
we  introduce  Cartesian  coordinates  such  that  Pq  has  the  coordinates  jcq,  yo>  Zo  and  P has  the  coordinates  x,  y,  z, 
then  by  the  Pythagorean  theorem, 


(i=0). 


r = V(x  - x0f  + (y  - y0f  + (z  ~ z0f 

Assuming  that  r > 0 and  introducing  the  vector 

r = [x  - x0,  y - y0,  z ~ Zo\  = (x  - x0)i  + (y  - y0)j  + (z  ~ Zo)k, 

we  have  |r|  = r,  and  (—  l/r)r  is  a unit  vector  in  the  direction  of  p;  the  minus  sign  indicates  that  p is  directed 
from  P to  Pq  (Fig.  198).  From  this  and  (2)  we  obtain 


(3) 


' 1 ^ 

C 

* 

1 

o 

y ~ yo 

z ~ Zo 

p - IpI 

v'7rJ 

= “7r  = 

C ^ . 

C r3  . 

1 

c> 

CO 

* - *o 


y - yo 


-J  - C- 


‘ ZQ 


This  vector  function  describes  the  gravitational  force  acting  on  B. 
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? 


p 


1 


\ 


Fig.  198.  Gravitational  field  in  Example  3 


Vector  Calculus 

The  student  may  be  pleased  to  learn  that  many  of  the  concepts  covered  in  (regular) 
calculus  carry  over  to  vector  calculus.  Indeed,  we  show  how  the  basic  concepts  of 
convergence,  continuity,  and  differentiability  from  calculus  can  be  defined  for  vector 
functions  in  a simple  and  natural  way.  Most  important  of  these  is  the  derivative  of  a 
vector  function. 

Convergence.  An  infinite  sequence  of  vectors  a^,  n = 1 , 2,  • • ■ , is  said  to  converge  if 
there  is  a vector  a such  that 

(4)  lim  |a(n)  - a|  =0. 

n—>  oo 

a is  called  the  limit  vector  of  that  sequence,  and  we  write 

(5)  lim  a(n)  = a. 

71— >00 

If  the  vectors  are  given  in  Cartesian  coordinates,  then  this  sequence  of  vectors  converges 
to  a if  and  only  if  the  three  sequences  of  components  of  the  vectors  converge  to  the 
corresponding  components  of  a.  We  leave  the  simple  proof  to  the  student. 

Similarly,  a vector  function  v(f)  of  a real  variable  t is  said  to  have  the  limit  l as  t 
approaches  t0,  if  v(f)  is  defined  in  some  neighborhood  of  t0  (possibly  except  at  t0)  and 

(6)  lim  |v(t)  /|  = 0. 

t—*to 

Then  we  write 

(7)  lim  v(0  = /. 

t—>iQ 

Here,  a neighborhood  of  t0  is  an  interval  (segment)  on  the  r-axis  containing  f0  as  an  interior 
point  (not  as  an  endpoint). 

Continuity.  A vector  function  v(f)  is  said  to  be  continuous  at  t = t0  if  it  is  defined  in 
some  neighborhood  of  t0  (including  at  t0  itself!)  and 

lim  v(f)  = v(f0). 

t—*to 


(8) 
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If  we  introduce  a Cartesian  coordinate  system,  we  may  write 

v(0  = [i>i(f),  v2 (t),  i>3(0]  = fi(f)i  + v2(0 j + v3(t)k. 

Then  v(?)  is  continuous  at  f0  if  and  only  if  its  three  components  are  continuous  at  f0. 
We  now  state  the  most  important  of  these  definitions. 


DEFINITION 


Derivative  of  a Vector  Function 

A vector  function  v(f)  is  said  to  be  differentiable  at  a point  t if  the  following  limit 
exists: 

v(r  + At)-  v(f) 

(9)  v (t)  = lim  • 

At^o  Ar 

This  vector  \ (t)  is  called  the  derivative  of  v(t).  See  Fig.  199. 


Fig.  199. 

In  components  with  respect  to 

(10)  v'(t) 


Derivative  of  a vector  function 
a given  Cartesian  coordinate  system, 
= [vl(t),  vi(t),  v£(t)  J. 


Hence  the  derivative  \'(t)  is  obtained  by  differentiating  each  component  separately.  For 
instance,  if  v = [t,  t2,  0],  then  v/  = [1,  2 1,  0]. 

Equation  (10)  follows  from  (9)  and  conversely  because  (9)  is  a “vector  form”  of  the 
usual  formula  of  calculus  by  which  the  derivative  of  a function  of  a single  variable  is 
defined.  [The  curve  in  Fig.  199  is  the  locus  of  the  terminal  points  representing  v(f)  for 
values  of  the  independent  variable  in  some  interval  containing  t and  t + At  in  (9)].  It 
follows  that  the  familiar  differentiation  rules  continue  to  hold  for  differentiating  vector 
functions,  for  instance, 


(cv/  = c\' 

(u  + v)'  = u'  + v' 


(c  constant), 


and  in  particular 


(ID 

(12) 

(13) 


(u  • v/  = u'  • V + u • v/ 

(u  x V)'  = u'  x V + u x v' 

(u  V w/  = (V  v w)  + (u  v/  w)  + (u  V w'). 
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The  simple  proofs  are  left  to  the  student.  In  (12),  note  the  order  of  the  vectors  carefully 
because  cross  multiplication  is  not  commutative. 

Derivative  of  a Vector  Function  of  Constant  Length 

Let  v(t)  be  a vector  function  whose  length  is  constant,  say,  v(  f)  = c.  Then  v|2  - V • v = c2,  and 
(v  • v;  = 2v  • = 0,  by  differentiation  [see  (11)].  This  yields  the  following  result.  The  derivative  of  a vector 

function  v(t)  of  constant  length  is  either  the  zero  vector  or  is  perpendicular  to  \(t). 

Partial  Derivatives  of  a Vector  Function 

Our  present  discussion  shows  that  partial  differentiation  of  vector  functions  of  two  or  more 
variables  can  be  introduced  as  follows.  Suppose  that  the  components  of  a vector  function 

v = [ur,  v2,  u3]  = Vii  + v2j  + tt3k 

are  differentiable  functions  of  n variables  t\,  ■ ■ ■ , tn.  Then  the  partial  derivative  of  v with 
respect  to  tm  is  denoted  by  d\/dtm  and  is  defined  as  the  vector  function 

dv  di>i  . , dv2  . . dv3 

= 1 + - j H k. 

dtm 

Similarly,  second  partial  derivatives  are 

->2  ->2  ->2  -.2 

d V d Vi  dV2  dv3 

— — — - — 1 + j + — — - — k, 

dtidtm  dtidtm  dtidtm 


and  so  on. 

EXAMPLE  Partial  Derivatives 

flr  dr  _ 

Let  r(t x,  t2)  = a cos  ti  i + a sin  t j j + t2  k.  Then  = — a sin  ti  i + a cos  t ! j and  — = k. 

St  1 dt2 

Various  physical  and  geometric  applications  of  derivatives  of  vector  functions  will  be 
discussed  in  the  next  sections  as  well  as  in  Chap.  10. 


PROBLEM— SET“9^ 


1-8 


SCALAR  FIELDS  IN  THE  PLANE 


Let  the  temperature  T in  a body  be  independent  of  z so  that 
it  is  given  by  a scalar  function  T = T(x,  t).  Identify  the 
isotherms  T(x,  y)  = const.  Sketch  some  of  them. 

1 . T = x2  - y2  2.  T = xy 

3.  T = 3x  — 4y  4.  T = arctan  (y/x) 

5.  T = y/(x2  + y2)  6.  T = x/{x2  + y2) 

l.T=  9xz  + 4y2 


8.  CAS  PROJECT.  Scalar  Fields  in  the  Plane.  Sketch 
or  graph  isotherms  of  the  following  fields  and  describe 
what  they  look  like. 


(a)  x2  - 4x  - y2 
(c)  cos  x sinh  y 
(e)  ex  sin  y 
(g)  x4  - 6x2y2  + y4 


(b)  x2y  - y3/3 
(d)  sin  x sinh  y 
(f)  e2x  cos  2y 
(h)  xz  — 2x  — y2 


9-14 


SCALAR  FIELDS  IN  SPACE 


What  kind  of  surfaces  are  the  level  surfaces  /( x,  y,  z)  = 
const? 


9.  / = 4*  — 3y  + 2z 
11.  / = 5*2  + 2y2 
13.  f = z — (x2  + y2) 


10.  / = 9(x2  + y2)  + z2 


12.  f=z  ~ V. : 
14.  f — x ~ y2 


x2  + y2 
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15-20 


VECTOR  FIELDS 


Sketch  figures  similar  to  Fig.  198.  Try  to  interpet  the  field 
of  v as  a velocity  field. 


15.  v = i + j 
17.  v = xj 
19.  v = xi  — yj 


16.  v = — vi  + xj 

18.  v = xi  + yj 

20.  v = yi  — xj 


21.  CAS  PROJECT.  Vector  Fields.  Plot  by  arrows: 
(a)  v = [x,  x2]  (b)  v = [1/y,  1/x] 

(c)  v = [cos  x,  sin  x]  (d)  v = e-fcr  +v  1 [x,  — y] 


22-25 


DIFFERENTIATION 


22.  Find  the  first  and  second  derivatives  of  r = [3  cos  2 f, 
3 sin  2 1,  4f]. 


23.  Prove  ( 1 1) — (13).  Give  two  typical  examples  for  each 
formula. 


24.  Find  the  first  partial  derivatives  of  Vi  = [ex  cos  y, 
ex  sin  y]  and  \2  = [cos  x cosh  y,  —sin  x sinh  y], 

25.  WRITING  PROJECT.  Differentiation  of  Vector 
Functions.  Summarize  the  essential  ideas  and  facts 
and  give  examples  of  your  own. 


Curves.  Arc  Length.  Curvature.  Torsion 

Vector  calculus  has  important  applications  to  curves  (Sec.  9.5)  and  surfaces  (to  be  covered 
in  Sec.  10.5)  in  physics  and  geometry.  The  application  of  vector  calculus  to  geometry  is 
a field  known  as  differential  geometry.  Differential  geometric  methods  are  applied 
to  problems  in  mechanics,  computer-aided  as  well  as  traditional  engineering  design, 
geodesy,  geography,  space  travel,  and  relativity  theory.  For  details,  see  [GenRef8]  and 
[GenRef9]  in  App.  1. 

Bodies  that  move  in  space  form  paths  that  may  be  represented  by  curves  C.  This  and 
other  applications  show  the  need  for  parametric  representations  of  C with  parameter  r, 
which  may  denote  time  or  something  else  (see  Fig.  200).  A typical  parametric  representation 
is  given  by 


(1)  r(0  = [x(0,  y(t),  z(?)]  = x(t)  i + y (r)j  + z(f)k. 


Fig.  200.  Parametric  representation  of  a curve 

Here  t is  the  parameter  and  x,  y,  z are  Cartesian  coordinates,  that  is,  the  usual  rectangular 
coordinates  as  shown  in  Sec.  9.1.  To  each  value  t = t0,  there  corresponds  a point  of  C 
with  position  vector  r(f0)  whose  coordinates  are  x(t0),  y(t0),  z(to)-  This  is  illustrated 
in  Figs.  201  and  202. 

The  use  of  parametric  representations  has  key  advantages  over  other  representations 
that  involve  projections  into  the  xy-plane  and  xz-plane  or  involve  a pair  of  equations  with 
y or  with  z as  independent  variable.  The  projections  look  like  this: 


(2) 


y = fix). 


Z = g(x). 
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EXAMPLE  1 


EXAMPLE  2 


EXAMPLE  3 


The  advantages  of  using  (1)  instead  of  (2)  are  that,  in  (1),  the  coordinates  x,  y,  z all 
play  an  equal  role,  that  is,  all  three  coordinates  are  dependent  variables.  Moreover,  the 
parametric  representation  (1)  induces  an  orientation  on  C.  This  means  that  as  we 
increase  t,  we  travel  along  the  curve  C in  a certain  direction.  The  sense  of  increasing 
t is  called  the  positive  sense  on  C.  The  sense  of  decreasing  t is  then  called  the  negative 
sense  on  C,  given  by  (1). 

Examples  I -4  give  parametric  representations  of  several  important  curves. 

Circle.  Parametric  Representation.  Positive  Sense 

The  circle  x2  + y = 4,  z = 0 in  the  xy-plane  with  center  0 and  radius  2 can  be  represented  parametrically  by 

r (?)  = [2  cos  t , 2 sin  t,  0]  or  simply  by  r(f)  = [2  cos  t,  2 sin  t ] (Fig.  201) 

where  0 ^ t ^ 27T.  Indeed,  x2  + y2  = (2  cos  t)2  + (2  sin  t )2  = 4 (cos2  t + sin2  t)  — 4,  For  t = 0 we  have 
r(0)  = [2,  0],  for  t = \tt  we  get  y(^tt)  = [0,  2],  and  so  on.  The  positive  sense  induced  by  this  representation 
is  the  counterclockwise  sense. 

If  we  replace  t with  f*  = —t,  we  have  t = —t*  and  get 

r *((*)  = [2  cos  (— t*),  2 sin  (—£*)]  = [2  cos  f*,  —2  sin  f*]. 

This  has  reversed  the  orientation,  and  the  circle  is  now  oriented  clockwise. 

Ellipse 

The  vector  function 

(3)  r ( t ) = [a  cos  t,  b sin  t,  0]  = a cos  t i + b sin  t j (Fig.  202) 

represents  an  ellipse  in  the  xy-plane  with  center  at  the  origin  and  principal  axes  in  the  direction  of  the  x-  and 
y-axes.  In  fact,  since  cos2  t + sin2 1 = 1,  we  obtain  from  (3) 


If  b = a,  then  (3)  represents  a circle  of  radius  a. 


Straight  Line 

A straight  line  L through  a point  A with  position  vector  a in  the  direction  of  a constant  vector  b (see  Fig.  203) 
can  be  represented  parametrically  in  the  form 

(4)  r(f)  = a + fb  = [a\  + tbi,  a2  + tb^,  a%  + tb^]. 
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EXAMPLE  4 


If  b is  a unit  vector,  its  components  are  the  direction  cosines  of  L.  In  this  case,  1 1 | measures  the  distance  of  the 
points  of  L from  A.  For  instance,  the  straight  line  in  the  xy-plane  through  A:  (3,  2)  having  slope  1 is  (sketch  it) 

r(f)  = [3,  2,  0]  + f[l,  1,  0]  = [3  + t,  2 + t,  0], 


Fig.  203.  Parametric  representation  of  a straight  line 

A plane  curve  is  a curve  that  lies  in  a plane  in  space.  A curve  that  is  not  plane  is  called 
a twisted  curve.  A standard  example  of  a twisted  curve  is  the  following. 

Circular  Helix 

The  twisted  curve  C represented  by  the  vector  function 

(5)  r(r)  = [a  cos  t,  a sin  f,  ct]  = a cos  ti  + a sin  fj  + cfk  (c  A 0) 

is  called  a circular  helix.  It  lies  on  the  cylinder  x2  + y2  = a2.  If  c > 0,  the  helix  is  shaped  like  a right-handed 
screw  (Fig.  204).  If  c < 0,  it  looks  like  a left-handed  screw  (Fig.  205).  If  c = 0,  then  (5)  is  a circle. 


Fig.  204  Right-handed  circular  helix 


A simple  curve  is  a curve  without  multiple  points,  that  is,  without  points  at  which  the 
curve  intersects  or  touches  itself.  Circle  and  helix  are  simple  curves.  Figure  206  shows 
curves  that  are  not  simple.  An  example  is  [sin  2 1,  cos  t,  0].  Can  you  sketch  it? 

An  arc  of  a curve  is  the  portion  between  any  two  points  of  the  curve.  For  simplicity, 
we  say  “curve”  for  curves  as  well  as  for  arcs. 


><3  cxd  $ 

Fig.  206.  Curves  with  multiple  points 
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EXAMPLE  5 


Tangent  to  a Curve 

The  next  idea  is  the  approximation  of  a curve  by  straight  lines,  leading  to  tangents  and 
to  a definition  of  length.  Tangents  are  straight  lines  touching  a curve.  The  tangent  to  a 
simple  curve  C at  a point  P of  C is  the  limiting  position  of  a straight  line  L through  P 
and  a point  Q of  C as  Q approaches  P along  C.  See  Fig.  207. 

Let  us  formalize  this  concept.  If  C is  given  by  r(f),  and  P and  Q correspond  to  t and 
t + A?,  then  a vector  in  the  direction  of  L is 


(6) 


“~[r(f  + Ar)-r(r)]. 

At 


In  the  limit  this  vector  becomes  the  derivative 

(7)  r '(f)  = lim  -J-[r(r  + At)  - r(f)], 

At^O  At 

provided  r(f)  is  differentiable,  as  we  shall  assume  from  now  on.  If  v'  (t)  # 0,  we  call  r'(t) 
a tangent  vector  of  C at  P because  it  has  the  direction  of  the  tangent.  The  corresponding 
unit  vector  is  the  unit  tangent  vector  (see  Fig.  207) 


Note  that  both  r’  and  u point  in  the  direction  of  increasing  t.  Hence  their  sense  depends 
on  the  orientation  of  C.  It  is  reversed  if  we  reverse  the  orientation. 

It  is  now  easy  to  see  that  the  tangent  to  C at  P is  given  by 

(9)  q(w)  = r + wr1  (Fig.  208). 

This  is  the  sum  of  the  position  vector  r of  P and  a multiple  of  the  tangent  vector  rr  of  C 
at  P.  Both  vectors  depend  on  P.  The  variable  w is  the  parameter  in  (9). 


Fig.  207.  Tangent  to  a curve  Fig.  208.  Formula  (9)  for  the  tangent  to  a curve 


Tangent  to  an  Ellipse 

Find  the  tangent  to  the  ellipse  jjt2  + y2  = 1 at  P:  (V2,  1/V2). 

Solution.  Equation  (3)  with  semi-axes  a = 2 and  b = 1 gives  r(f)  = [2  cos  t , sin  f].  The  derivative  is 
r'  (l)  = [—2  sin  f,  cos  t\.  Now  P corresponds  to  t = tt/4  because 


r(Tr/4)  = [2  cos  (tt/4),  sin  (tt/4)]  = [V2,  1/V2]. 
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Hence  r’ (tt/4)  = [— V2,  1/V2],  From  (9)  we  thus  get  the  answer 

q(w)  = [V2,  1/V2]  + w[-V 2,  1/V2]  = [V2(l  - w),  (1/V2)(1  + w)]. 

To  check  the  result,  sketch  or  graph  the  ellipse  and  the  tangent. 


Length  of  a Curve 

We  are  now  ready  to  define  the  length  / of  a curve.  / will  be  the  limit  of  the  lengths  of 
broken  lines  of  n chords  (see  Fig.  209,  where  n = 5)  with  larger  and  larger  n.  For  this, 
let  r(f),  (tSi§  /?,  represent  C.  For  each  n = 1,  2,  ■ • • , we  subdivide  (“partition”)  the 
interval  a t = b by  points 

r0(=  a),  1 tn- 1,  tn(=  b),  where  t0  < t1  < ■■■  < tn. 


This  gives  a broken  line  of  chords  with  endpoints  r(fo),  ■ ■ • , r(tn).  We  do  this  arbitrarily 
but  so  that  the  greatest  A trn  = \tm  — tm_il  approaches  0 as  The  lengths 

li,  I2,  • • • of  these  chords  can  be  obtained  from  the  Pythagorean  theorem.  If  r(r)  has  a 
continuous  derivative  r (r),  it  can  be  shown  that  the  sequence  l\,  I2,  ■ • • has  a limit,  which 
is  independent  of  the  particular  choice  of  the  representation  of  C and  of  the  choice  of 
subdivisions.  This  limit  is  given  by  the  integral 


(10) 


rb 


l = 


V7 


r dt 


l is  called  the  length  of  C,  and  C is  called  rectifiable.  Formula  (10)  is  made  plausible 
in  calculus  for  plane  curves  and  is  proved  for  curves  in  space  in  [GenRef8]  listed  in  App.  1 . 
The  actual  evaluation  of  the  integral  (10)  will,  in  general,  be  difficult.  However,  some 
simple  cases  are  given  in  the  problem  set. 


Arc  Length  s of  a Curve 

The  length  (10)  of  a curve  C is  a constant,  a positive  number.  But  if  we  replace  the  fixed 
b in  (TO)  with  a variable  t,  the  integral  becomes  a function  of  t,  denoted  by  s(t)  and  called 
the  arc  length  function  or  simply  the  arc  length  of  C.  Thus 


(ID 


sit) 


rt 

Vr'  • r'  dT 


Here  the  variable  of  integration  is  denoted  by  t because  t is  now  used  in  the  upper  limit. 

Geometrically,  s (to)  with  some  to  > a is  the  length  of  the  arc  of  C between  the  points 
with  parametric  values  a and  to-  The  choice  of  a (the  point  s = 0)  is  arbitrary;  changing 
a means  changing  s by  a constant. 


Fig.  209  Length  of  a curve 
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Linear  Element  ds.  If  we  differentiate  (11)  and  square,  we  have 


(12) 


dr  dr 

• 

dt  dt 


Ir'(r)|2 


It  is  customary  to  write 

(13*)  dr  = [ dx , dy,  dz]  = dx i + dyj  + cfek 

and 


(13) 


ds2  = dr  • dr  = dx2  + dy2  + dz2. 


ds  is  called  the  linear  element  of  C. 


Arc  Length  as  Parameter.  The  use  of  s in  (1)  instead  of  an  arbitrary  t simplifies  various 
formulas.  For  the  unit  tangent  vector  (8)  we  simply  obtain 


(14)  u(s)  = r'(s). 

Indeed,  |r,(.s,)|  = ( ds/ds ) = 1 in  (12)  shows  that  r,(.v)  is  a unit  vector.  Even  greater 
simplifications  due  to  the  use  of  s will  occur  in  curvature  and  torsion  (below). 


Circular  Helix.  Circle.  Arc  Length  as  Parameter 

The  helix  r(f)  = [a  cos  t,  a sin  t,  ct ] in  (5)  has  the  derivative  r'  ( t ) = [—a  sin  t,  a cos  t,  c\.  Hence 
r • rf  = a2  + c2,  a constant,  which  we  denote  by  K2.  Hence  the  integrand  in  (11)  is  constant,  equal  to  K, 
and  the  integral  is  s = Kt.  Thus  t = s/K,  so  that  a representation  of  the  helix  with  the  arc  length  s as 
parameter  is 


(15) 


s s cs 

a cos—,  a sin — , — 
K K K 


K = Va2  + c2. 


A circle  is  obtained  if  we  set  c = 0.  Then  K = a,  t = s/ a,  and  a representation  with  arc  length  s as  parameter  is 


Curves  in  Mechanics.  Velocity.  Acceleration 

Curves  play  a basic  role  in  mechanics,  where  they  may  serve  as  paths  of  moving  bodies. 
Then  such  a curve  C should  be  represented  by  a parametric  representation  r(f)  with  time 
t as  parameter.  The  tangent  vector  (7)  of  C is  then  called  the  velocity  vector  v because, 
being  tangent,  it  points  in  the  instantaneous  direction  of  motion  and  its  length  gives  the 
speed  | v | = r ' = VV  • r = ds/dt\  see  (12).  The  second  derivative  of  r(t)  is  called 
the  acceleration  vector  and  is  denoted  by  a.  Its  length  |a|  is  called  the  acceleration  of 
the  motion.  Thus 


(16) 


v(0  = r’(t). 


a (t)  = \\t)  = r "(f). 
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EXAMPLE  7 


Tangential  and  Normal  Acceleration.  Whereas  the  velocity  vector  is  always  tangent 
to  the  path  of  motion,  the  acceleration  vector  will  generally  have  another  direction.  We 
can  split  the  acceleration  vector  into  two  directional  components,  that  is, 

(17)  ® — **tan  ®norm> 

where  the  tangential  acceleration  vector  atan  is  tangent  to  the  path  (or,  sometimes,  0) 
and  the  normal  acceleration  vector  anorm  is  normal  (perpendicular)  to  the  path  (or, 
sometimes,  0). 

Expressions  for  the  vectors  in  (17)  are  obtained  from  (16)  by  the  chain  rule.  We  first  have 


v(f) 


dr  _ dr  ds 
dt  ds  dt 


u(s) 


ds 

dt 


where  u(.v)  is  the  unit  tangent  vector  (14).  Another  differentiation  gives 

2 


(18) 


, . d\  d ( ds 


_ du(  ds 
ds\dt 


4-  ( \ d2s 

+ «<»>**• 


Since  the  tangent  vector  u(.v)  has  constant  length  (length  one),  its  derivative  du/ds  is 
perpendicular  to  u(.s),  from  the  result  in  Example  4 in  Sec.  9.4.  Hence  the  first  term  on 
the  right  of  (18)  is  the  normal  acceleration  vector,  and  the  second  term  on  the  right  is  the 
tangential  acceleration  vector,  so  that  (18)  is  of  the  form  (17). 

Now  the  length  | atan  I is  the  absolute  value  of  the  projection  of  a in  the  direction  of  v, 
given  by  (11)  in  Sec.  9.2  with  b = v;  that  is,  |atanl  = |a  * v|/|v|.  Hence  atan  is  this 
expression  times  the  unit  vector  (l/|v|)v  in  the  direction  of  v,  that  is, 


(18*) 


a • v 

atan  — y . y v- 


Also,  anorm  a 3tan- 


We  now  turn  to  two  examples  that  are  relevant  to  applications  in  space  travel. 
They  deal  with  the  centripetal  and  centrifugal  accelerations,  as  well  as  the  Coriolis 
acceleration. 


Centripetal  Acceleration.  Centrifugal  Force 

The  vector  function 

r (t)  = [/?  cos  cot,  R sin  cot]  = R cos  cot  i + R sin  cot  j (Fig.  210) 

(with  fixed  i and  j)  represents  a circle  C of  radius  R with  center  at  the  origin  of  the  xy-plane  and  describes  the 
motion  of  a small  body  B counterclockwise  around  the  circle.  Differentiation  gives  the  velocity  vector 

v = v'  = \ —Rco  sin  cot,  Rco  cos  cot]  = -Rco  sin  coti  + Rco  cos  cot  j (Fig.  210) 

v is  tangent  to  C.  Its  magnitude,  the  speed,  is 

|v|  = |r' | = vV  • r'  = Rco. 

Hence  it  is  constant.  The  speed  divided  by  the  distance  R from  the  center  is  called  the  angular  speed.  It  equals 
co,  so  that  it  is  constant,  too.  Differentiating  the  velocity  vector,  we  obtain  the  acceleration  vector 

(19)  a = v'  = [—Rco2  cos  cot,  —Rco2  sin  cot]  = —Rco2  cos  cot  i — Rco2  sin  cot  j. 
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This  shows  that  a = — ia2r  (Fig.  210),  so  that  there  is  an  acceleration  toward  the  center,  called  the  centripetal 
acceleration  of  the  motion.  It  occurs  because  the  velocity  vector  is  changing  direction  at  a constant  rate.  Its 
magnitude  is  constant,  |a|  = cu2|r|  = co2R.  Multiplying  a by  the  mass  m of  B,  we  get  the  centripetal  force  ma. 
The  opposite  vector  — ma  is  called  the  centrifugal  force.  At  each  instant  these  two  forces  are  in  equilibrium. 

We  see  that  in  this  motion  the  acceleration  vector  is  normal  (perpendicular)  to  C;  hence  there  is  no  tangential 
acceleration. 


Superposition  of  Rotations.  Coriolis  Acceleration 

A projectile  is  moving  with  constant  speed  along  a meridian  of  the  rotating  earth  in  Fig.  211.  Find  its  acceleration. 


Fig.  211.  Example  8.  Superposition  of  two  rotations 


Solution.  Let  x , y,  z be  a fixed  Cartesian  coordinate  system  in  space,  with  unit  vectors  i,  j,  k in  the  directions 
of  the  axes.  Let  the  Earth,  together  with  a unit  vector  b,  be  rotating  about  the  z-axis  with  angular  speed  co  > 0 
(see  Example  7).  Since  b is  rotating  together  with  the  Earth,  it  is  of  the  form 

b (t)  = cos  cot  i + sin  cur  j. 

Let  the  projectile  be  moving  on  the  meridian  whose  plane  is  spanned  by  b and  k (Fig.  211)  with  constant  angular 
speed  co  > 0.  Then  its  position  vector  in  terms  of  b and  k is 


r (t)  = R cos  yt  b (t)  + R sin  yt  k 


( R = Radius  of  the  Earth). 
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We  have  finished  setting  up  the  model.  Next,  we  apply  vector  calculus  to  obtain  the  desired  acceleration  of  the 
projectile.  Our  result  will  be  unexpected — and  highly  relevant  for  air  and  space  travel.  The  first  and  second 
derivatives  of  b with  respect  to  t are 

b ' (t)  = ~co  sin  cot  i + (o  cos  cot  j 

(20) 

b n{t)  = — (o2cos(oti  — ct)2sinw/j  = — co2b(t). 

The  first  and  second  derivatives  of  r (f)  with  respect  to  t are 

v = r ' (t)  = R cos  yt  W — yR  sin  yt  b + yR  cos  yt  k 

(21)  a = v'  = R cos  yt  b"  — 2y R sin  yt  b'  — y2R  cos  yt  b — y2R  sin  yt  k 

= R cos  yt  b"  — 2yR  sin  ytb'  — y2r. 

By  analogy  with  Example  7 and  because  of  b”  = —co2 b in  (20)  we  conclude  that  the  first  term  in  a (involving  co 
in  br  !)  is  the  centripetal  acceleration  due  to  the  rotation  of  the  Earth.  Similarly,  the  third  term  in  the  last  line  (involving 
y\)  is  the  centripetal  acceleration  due  to  the  motion  of  the  projectile  on  the  meridian  M of  the  rotating  Earth. 

The  second,  unexpected  term  — 2yR  sin  yt  b'  in  a is  called  the  Coriolis  acceleration3  (Fig.  211)  and  is 
due  to  the  interaction  of  the  two  rotations.  On  the  Northern  Hemisphere,  sin  yt  > 0 (for  t > 0;  also  y > 0 
by  assumption),  so  that  acor  has  the  direction  of  — b\  that  is,  opposite  to  the  rotation  of  the  Earth.  |acor| 
is  maximum  at  the  North  Pole  and  zero  at  the  equator.  The  projectile  B of  mass  ihq  experiences  a force 
— wioacor  opposite  to  ffloacor,  which  tends  to  let  B deviate  from  M to  the  right  (and  in  the  Southern 
Hemisphere,  where  sin  yt  < 0,  to  the  left).  This  deviation  has  been  observed  for  missiles,  rockets,  shells, 
and  atmospheric  airflow. 

Curvature  and  Torsion.  Optional 

This  last  topic  of  Sec.  9.5  is  optional  but  completes  our  discussion  of  curves  relevant  to 
vector  calculus. 

The  curvature  k(s)  of  a curve  C:  r (s)  (s  the  arc  length)  at  a point  P of  C measures  the 
rate  of  change  | u*  (s)  | of  the  unit  tangent  vector  u (v)  at  P.  Hence  k(s)  measures  the  deviation 
of  C at  P from  a straight  line  (its  tangent  at  P).  Since  u (s)  = r (.v),  the  definition  is 

(22)  k(s)  = |u'(s)|  = |r"(.y)  ('  = d/ds). 

The  torsion  t (5)  of  C at  P measures  the  rate  of  change  of  the  osculating  plane  O of 
curve  C at  point  P.  Note  that  this  plane  is  spanned  by  u and  u and  shown  in  Fig.  212. 
Hence  r (.y)  measures  the  deviation  of  C at  P from  a plane  (from  O at  P ).  Now  the  rate 
of  change  is  also  measured  by  the  derivative  b'  of  a normal  vector  b at  O.  By  the  definition 
of  vector  product,  a unit  normal  vector  of  O is  b = u X ( I /k)u  = u X p.Herep  = (1  /k)u 
is  called  the  unit  principal  normal  vector  and  b is  called  the  unit  binormal  vector  of  C 
at  P.  The  vectors  are  labeled  in  Fig.  212.  Here  we  must  assume  that  k j=  0;  hence  k > 0. 
The  absolute  value  of  the  torsion  is  now  defined  by 

(23*}  \t(s)\  = |b'(*)|. 


Whereas  k(s)  is  nonnegative,  it  is  practical  to  give  the  torsion  a sign,  motivated  by 
“right-handed”  and  “left-handed”  (see  Figs.  204  and  205).  This  needs  a little  further 
calculation.  Since  b is  a unit  vector,  it  has  constant  length.  Hence  b'  is  perpendicular 


3GUSTAVE  GASPARD  CORIOLIS  (1792-1843),  French  engineer  who  did  research  in  mechanics. 
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Fig.  212.  Trihedron.  Unit  vectors  u,  p,  b and  planes 

to  b (see  Example  4 in  Sec.  9.4).  Now  b'  is  also  perpendicular  to  u because,  by  the 
definition  of  vector  product,  we  have  b • u = 0,  b • =0.  This  implies 

(b  • u/  = 0;  that  is,  b'  • u + b • u'  = b'  • u + 0 = 0. 

Hence  if  b'  + 0 at  P,  it  must  have  the  direction  of  p or  — p,  so  that  it  must  be  of  the  form 
b = — rp.  Taking  the  dot  product  of  this  by  p and  using  p • p = 1 gives 

(23)  t(s)  = -p(j)  • b'Cs). 

The  minus  sign  is  chosen  to  make  the  torsion  of  a right-handed  helix  positive  and  that  of 
a left-handed  helix  negative  (Figs.  204  and  205).  The  orthonormal  vector  triple  u,  p,  b is 
called  the  trihedron  of  C.  Figure  212  also  shows  the  names  of  the  three  straight  lines  in 
the  directions  of  u,  p,  b,  which  are  the  intersections  of  the  osculating  plane,  the  normal 
plane,  and  the  rectifying  plane. 


PROBLEM  SET  9.5 


1-10 


PARAMETRIC  REPRESENTATIONS 


What  curves  are  represented  by  the  following? 
Sketch  them. 

1.  [3  + 2 cos  t,  2 sin  t,  0] 

2.  [a  + t,  b + 3 1,  c — 5f] 

3.  [0 ,t,t3] 

4.  [ — 2,  2 + 5 cos  t,  — 1 + 5 sin  t] 

5.  [2  + 4 cos  t,  I + sin  t,  0] 

6.  [a  + 3 cos  7 77,  b — 2 sin  7 Tt,  0] 

7.  [4  cos  f,  4 sin  t,  3?] 

8.  [cosh  f,  sinh  t,  2] 

9.  [cos  f,  sin  2 1,  0] 

10.  [r,  2,  1/t] 


11-20 


FIND  A PARAMETRIC  REPRESENTATION 


11.  Circle  in  the  plane  z = 1 with  center  (3,  2)  and  passing 
through  the  origin. 

12.  Circle  in  the  yz-plane  with  center  (4,  0)  and  passing 
through  (0,  3).  Sketch  it. 

13.  Straight  line  through  (2,  1,  3)  in  the  direction  of  i + 2j. 

14.  Straight  line  through  (1,  1,  1)  and  (4,  0,  2).  Sketch  it. 

15.  Straight  line  y = 4x  — 1 , z — 5x. 

16.  The  intersection  of  the  circular  cylinder  of  radius  1 
about  the  z-axis  and  the  plane  z — y. 

17.  Circle  \x2  + y2  = 1,  z = y. 

18.  Helix  x2  + y2  — 25,  z — 2 arctan  (y/x). 

19.  Hyperbola  4.v2  — 3y2  = 4,  z = — 2. 
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20.  Intersection  of  2x  — y + 3z  = 2 and  x + 2y  — z = 3. 

21.  Orientation.  Explain  why  setting  t = — t*  reverses 
the  orientation  of  [a  cos  t,  a sin  t,  0], 

22.  CAS  PROJECT.  Curves.  Graph  the  following  more 
complicated  curves: 

(a)  r(f)  = [2  cos  t + cos  2 t,  2 sin  t — sin  2 1]  ( Steiner’s 
hypocycloid). 

(b)  r (f)  = [cos  t + k cos  2 1,  sin  t — k sin  2 1]  with  k = 

10,  2,  1,§,0,  -§,  -1. 

(c)  r(f)  = [cos  f,  sin  5/]  (a  Lissajous  curve). 

(d)  r(f)  = [cosf,  sin  kt].  For  what  k's  will  it  be  closed? 

(e)  r(f)  = [A  sin  cot  + coRt,  R cos  cot  + R]  {cycloid). 

23.  CAS  PROJECT.  Famous  Curves  in  Polar  Form. 
Use  your  CAS  to  graph  the  following  curves4  given  in 
polar  form  p = p{6),  p2  = x2  + y2,  tan  9 = y/x,  and 
investigate  their  form  depending  on  parameters  a and  b. 


p = ad  Spired  of  Archimedes 
p — aebH  Logarithmic  spiral 
2 a sin2  6 

p = Cissoid  of  Diocles 

cos  6 


p = 1-  b Conchoid  of  Nicomedes 

cos  9 

p = a/ 6 Hyperbolic  spiral 

3a  sin  29  _ , _ „ 

p = — Folium  oj  Descartes 

cos3  6 + sin3  9 


p = 2 a Maclaurin  ’s  trisectrix 

sin  29 


24-28 


p = 2 a cos  9 + b 

TANGENT 


Pascal’s  snail 


Given  a curve  C:  r(f),  find  a tangent  vector  r \t),  a unit 
tangent  vector  u;(f),  and  the  tangent  of  C at  P.  Sketch  curve 
and  tangent. 


24.  r(t)  = \t,\t2,  1],  P:{ 2,2,1) 

25.  r (t)  = [10  cos  t,  1,  10  sin  f],  P:  (6,  1,  8) 

26.  r(f)  = [cos  t,  sin  t,  9t],  P:  (1,  0,  187t) 

27.  r(f)  = [t,  \/t,  0],  P:  (2,  §,  0) 

28.  r(f)  = [f,  f2,  f3],  P:  (1,1,1) 


29-32 


LENGTH 


Find  the  length  and  sketch  the  curve. 

29.  Catenary  r(f)  = [f,  cosh  t]  from  t = 0 to  t = 1. 

30.  Circular  helix  r(t)  — [4  cos  i.  4 sin  t.  5/]  from  (4,  0,  0) 
to  (4,  0,  107r). 


31.  Circle  r (f)  = [a  cos  t,  a sin  t ] from  (a,  0)  to  (0,  a). 

32.  Hypocycloid  r(f)  = [a  cos3  i,  a sin3  f],  total  length. 

33.  Plane  curve.  Show  that  Eq.  (10)  implies 
€ = J*V  1 + y2  dx  for  the  length  of  a plane  curve 

C:  y = f(x),  z — 0,  and  a = x = b. 

34.  Polar  coordinates  p = Vv2  + y2,  9 = arctan  (y/x) 
give 


where  p'  = dp/d9.  Derive  this.  Use  it  to  find  the  total 
length  of  the  cardioid  p = a(l  — cos  9).  Sketch  this 
curve.  Hint.  Use  (10)  in  App.  3.1. 


35-46 


CURVES  IN  MECHANICS 


Forces  acting  on  moving  objects  (cars,  airplanes,  ships,  etc.) 
require  the  engineer  to  know  corresponding  tangential  and 
normal  accelerations.  In  Probs.  35-38  find  them,  along 
with  the  velocity  and  speed.  Sketch  the  path. 


35.  Parabola  r(t)  = [f,  t2.  0].  Find  v and  a. 


36.  Straight  line  r(t)  = [8/,  6t,  0],  Find  v and  a. 

37.  Cycloid  r(f)  = ( R sin  cot  + Rt)  i + (R  cos  cot  + R)  j. 
This  is  the  path  of  a point  on  the  rim  of  a wheel  of 
radius  R that  rolls  without  slipping  along  the  x-axis. 
Find  v and  a at  the  maximum  y-values  of  the  curve. 

38.  Ellipse  r = [cos  t,  2 sin  t,  0], 


39—42  THE  USE  OF  A CAS  may  greatly  facilitate  the 
investigation  of  more  complicated  paths,  as  they  occur  in 
gear  transmissions  and  other  constructions.  To  grasp  the 
idea,  using  a CAS,  graph  the  path  and  find  velocity,  speed, 
and  tangential  and  normal  acceleration. 


39.  r (t)  = [cos  t + cos  2t,  sin  t — sin  2 1] 

40.  r (t)  = [2  cos  t + cos  2 1,  2 sin  t — sin  2t\ 

41.  r (t)  = [cos  t,  sin  2f,  cos  2t] 

42.  r (t)  — [ct  cos  t,  ct  sin  t,  cf]  (c  # 0) 

43.  Sun  and  Earth.  Find  the  acceleration  of  the  Earth 
toward  the  sun  from  (19)  and  the  fact  that  Earth 
revolves  about  the  sun  in  a nearly  circular  orbit  with 
an  almost  constant  speed  of  30  km/ s. 

44.  Earth  and  moon.  Find  the  centripetal  acceleration 
of  the  moon  toward  Earth,  assuming  that  the  orbit 
of  the  moon  is  a circle  of  radius  239,000  miles  = 
3.85  • 108  m,  and  the  time  for  one  complete  revolution 
is  27.3  days  = 2.36  • 106  s. 


4Named  after  ARCHIMEDES  (c.  287-212  B.C.),  DESCARTES  (Sec.  9.1),  DIOCLES  (200  B.C.), 

MACLAURIN  (Sec.  15.4),  NICOMEDES  (250?  B.c.)  ETIENNE  PASCAL  (1588-1651),  father  of  BLAISE 

PASCAL  (1623-1662). 
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45.  Satellite.  Find  the  speed  of  an  artificial  Earth  satellite 
traveling  at  an  altitude  of  80  miles  above  Earth’s 
surface,  where  g = 31  ft/sec2.  (The  radius  of  the  Earth 
is  3960  miles.) 

46.  Satellite.  A satellite  moves  in  a circular  orbit 
450  miles  above  Earth’s  surface  and  completes 
1 revolution  in  100  min.  Find  the  acceleration  of  gravity 
at  the  orbit  from  these  data  and  from  the  radius  of  Earth 
(3960  miles). 


47-55 


CURVATURE  AND  TORSION 


47.  Circle.  Show  that  a circle  of  radius  a has  curvature 
1/a. 

48.  Curvature.  Using  (22),  show  that  if  C is  represented 
by  r(f)  with  arbitrary  t,  then 


(22*) 


k(J)  = 


A/(r'  • r'Xr"  • r")  — (r'  • r")2 
(r'-r')3/2 


50.  Torsion.  Using  b = u x p and  (23),  show  that  (when 
k > 0) 

(23**)  r(i)  = (u  p p')  = (r'  r"  r "')/k2. 

51.  Torsion.  Show  that  if  C is  represented  by  r (f)  with 
arbitrary  parameter  t , then,  assuming  k > 0 as  before, 


(23***)  t (?)  = 


(r' 


ft  ftt 


rrr  \ 

r ) 


(r' . r')(r"  • r")  - (r'  • r"): 


it. 2 ' 


52.  Helix.  Show  that  the  helix  [a  cos  t,  a sin  t,  ct\  can 
be  represented  by  [a  cos  ( s/K ),  a sin  (s/K),  cs/K\ , 
where  K = V a2  + c2  and  5 is  the  arc  length.  Show 
that  it  has  constant  curvature  k = a/K2  and  torsion 
r = c/K2. 

53.  Find  the  torsion  of  C:  r(f)  = [t,  t2,  t3],  which  looks 
similar  to  the  curve  in  Fig.  212. 

54.  Frenet5  formulas.  Show  that 


49.  Plane  curve.  Using  (22*),  show  that  for  a curve 

T = /(*). 


(22**) 


k(x)  = 


1/1 

(1  + y'2)3/2 


u;  = k p,  p’  = — ku  + Tb,  br  = — rp. 

55.  Obtain  k and  r in  Prob.  52  from  (22*)  and  (23***) 
and  the  original  representation  in  Prob.  54  with 
parameter  t. 


9.6  Calculus  Review: 

Functions  of  Several  Variables.  Optional 

The  parametric  representations  of  curves  C required  vector  functions  that  depended  on  a 
single  variable  x,  s,  or  t.  We  now  want  to  systematically  cover  vector  functions  of  several 
variables.  This  optional  section  is  inserted  into  the  book  for  your  convenience  and  to  make 
the  book  reasonably  self-contained.  Go  onto  Sec.  9.7  and  consult  Sec.  9.6  only  when 
needed.  For  partial  derivatives,  see  App.  A3. 2. 

Chain  Rules 

Figure  213  shows  the  notations  in  the  following  basic  theorem. 


5JEAN-FREDERIC  FRENET  (1816-1900),  French  mathematician. 
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THEOREM  1 


Chain  Rule 

Let  w = fix,  y,  z ) be  continuous  and  have  continuous  first  partial  derivatives  in  a 
domain  D in  xyz-space.  Let  x = x(u,  v ),  y = y{u,  v),  z = z(u,  v ) be  functions  that 
are  continuous  and  have  first  partial  derivatives  in  a domain  B in  the  uv  -plane, 
where  B is  such  that  for  every  point  ( u , u)  in  B,  the  corresponding  point  [x(u,  v), 
y(u,  v),  z(u,  u)]  lies  in  D.  See  Fig.  213.  Then  the  function 

w = f(x(u , v ),  yiu,  v),  z(u,  u)) 

is  defined  in  B,  has  first  partial  derivatives  with  respect  to  u and  v in  B,  and 

dw  _ dw  dx  dw  By  Bw  Bz 

Bu  Bx  Bu  By  Bu  Bz  Bu 

(1) 

Bw  _ Bw  Bx  Bw  By  Bw  Bz 

Bv  Bx  Bv  By  Bv  Bz  Bv 


In  this  theorem,  a domain  D is  an  open  connected  point  set  in  xyz-space,  where 
“connected”  means  that  any  two  points  of  D can  be  joined  by  a broken  line  of  finitely 
many  linear  segments  all  of  whose  points  belong  to  D.  “Open”  means  that  every  point  P 
of  D has  a neighborhood  (a  little  ball  with  center  P)  all  of  whose  points  belong  to  D.  For 
example,  the  interior  of  a cube  or  of  an  ellipsoid  (the  solid  without  the  boundary  surface) 
is  a domain. 

In  calculus,  x,  y,  z are  often  called  the  intermediate  variables,  in  contrast  with  the 

independent  variables  u,  v and  the  dependent  variable  w. 


Special  Cases  of  Practical  Interest 

If  w = fix,  y)  and  x = x(m,  v),  y = y(u,  v ) as  before,  then  (1)  becomes 


(2) 


Bw  _ Bw  Bx  Bw  By 

Bu  Bx  Bu  By  Bu 

Bw  Bw  Bx  Bw  By 

Bu  Bx  Bu  By  Bu 


If  w = f{x,  y,  z)  and  x = x(f),  y = yit),  z = z(t),  then  (1)  gives 


dw  _ Bw  dx  Bw  dy  Bw  dz 

dt  Bx  dt  By  dt  Bz  dt 


(3) 
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EXAMPLE  1 


EXAMPLE  2 


If  w = fix,  y)  and  x = x{t),  y = y(t),  then  (3)  reduces  to 


(4) 


dw  _ dw  dx  dw  dy 

dt  dx  dt  By  dt 


Finally,  the  simplest  case  w = fix ),  x = x(t)  gives 


dw  _ dw  dx 
dt  dx  dt 


Chain  Rule 

If  w = x2  — y2  and  we  define  polar  coordinates  r.  0 by  x r cos  6,y  = r sin  B,  then  (2)  gives 

dw  0 0 

— = 2x  cos  6 — 2y  sin  6 = 2 r cosz  6 — 2r  sinz  6 = 2 r cos  26 
dr 

d\V  n 9 o h 

- — = 2 x(—r  sin  6)  — 2y(r  cos  6)  = —2 r cos  6 sin  6 — 2 r sin  6 cos  6 = —2 r sin  26. 


Partial  Derivatives  on  a Surface  z = g(x,y) 

Let  w = fix , y,  z)  and  let  z = g (x,  y)  represent  a surface  S in  space.  Then  on  S the  function 
becomes 


w{x,  y)  = fix,  y,g{x,y)). 

Hence,  by  (1),  the  partial  derivatives  are 

dw  = df  } Bfdg  d^=dl+dfdg_ 

dx  dx  d z dx  ’ dy  dy  dz  dy 

We  shall  need  this  formula  in  Sec.  10.9. 

Partial  Derivatives  on  Surface 

Let  w = f = x3  + y3  + z3  and  let  z = g = xz  + y2.  Then  (6)  gives 


[Z  = gix,  v)]. 


— = 3x2  + 3z2  ■ 2v  = 3v2  + 3(x2  + y2)2  ■ 2x, 
dx 

— = 3/  + 3z2  ■ 2y  = 3v2  + 3(x2  + v2)2  ■ 2y. 
dy 


3 3 2 2 3 

We  confirm  this  by  substitution,  using  w{x,  y)  = x + y +(x  +y),  that  is, 


dW  9 o 9 9 d\V  o O 9 9 

— = 3jc2  + 3(x2  + y2)2  • 2x,  — = 3y2  + 3(x2  4-  y2)2  • 2y. 

dx  dy 
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Mean  Value  Theorems 


THEOREM  2 


Mean  Value  Theorem 

Let  fix,  y,  z ) be  continuous  and  have  continuous  first  partial  derivatives  in  a 
domain  D in  xyz-space.  Let  If.  (a'o,  vq,  Zo)  and  P:  (aq  + h,  >’o  + k,  zo  + /)  be 
points  in  D such  that  the  straight  line  segment  PqP  joining  these  points  lies  entirely 
in  D.  Then 

(7)  fix  o + h,  j0  + k,z0  + l)  ~ f(xo,  Jo,  Zo)  = hf  + kff  + Iff-, 

ax  dj  az 

the  partial  derivatives  being  evaluated  at  a suitable  point  of  that  segment. 


Special  Cases 

For  a function  fix,  y)  of  two  variables  (satisfying  assumptions  as  in  the  theorem),  formula 
(7)  reduces  to  (Fig.  214) 


(8)  fix o + h,  y0  + k)  - /(.r0,  Jo)  = h-^_  + k~, 

and,  for  a function  fix)  of  a single  variable,  (7)  becomes 


(9)  /(x0  + h)  — fix0)  = h—, 

where  in  (9),  the  domain  D is  a segment  of  the  A-axis  and  the  derivative  is  taken  at  a 
suitable  point  between  xq  and  a<>  + h. 


Fig.  214.  Mean  value  theorem  for  a function  of  two  variables  [Formula  (8)] 


Gradient  of  a Scalar  Field. 
Directional  Derivative 


We  shall  see  that  some  of  the  vector  fields  that  occur  in  applications — not  all  of  them! — 
can  be  obtained  from  scalar  fields.  Using  scalar  fields  instead  of  vector  fields  is  of  a 
considerable  advantage  because  scalar  fields  are  easier  to  use  than  vector  fields.  It  is  the 
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“gradient”  that  allows  us  to  obtain  vector  fields  from  scalar  fields,  and  thus  the  gradient 
is  of  great  practical  importance  to  the  engineer. 


DEFINITION  1 


Gradient 

The  setting  is  that  we  are  given  a scalar  function  fix,  y,  z ) that  is  defined  and 
differentiable  in  a domain  in  3-space  with  Cartesian  coordinates  x,  y,  z.  We  denote 
the  gradient  of  that  function  by  grad  / or  V/  (read  nabla  /).  Then  the  qradient  of 
fix,  y,  z)  is  defined  as  the  vector  function 


(1) 


grad  / = V/  = 


dx’  dy’  dz. 


df.  df.  df , 

— i + — j H k. 

dx  dy J dz 


Remarks.  For  a definition  of  the  gradient  in  curvilinear  coordinates,  see  App.  3.4. 
As  a quick  example,  if  f(x,y,  z)  = 2 y3  + 4 xz  + 3x,  then  grad  /=  [4z  + 3,  6 y2,  4x], 
Furthermore,  we  will  show  later  in  this  section  that  (1)  actually  does  define  a vector. 
The  notation  V/  is  suggested  by  the  differential  operator  V (read  nabla ) defined  by 


(1*) 


+ 


d_ 

dz 


k. 


Gradients  are  useful  in  several  ways,  notably  in  giving  the  rate  of  change  of  fix , y,  z) 
in  any  direction  in  space,  in  obtaining  surface  normal  vectors,  and  in  deriving  vector  fields 
from  scalar  fields,  as  we  are  going  to  show  in  this  section. 

Directional  Derivative 

From  calculus  we  know  that  the  partial  derivatives  in  (1)  give  the  rates  of  change  of 
fix,  y,  z)  in  the  directions  of  the  three  coordinate  axes.  It  seems  natural  to  extend  this  and 
ask  for  the  rate  of  change  of  fin  an  arbitrary  direction  in  space.  This  leads  to  the  following 
concept. 


DEFINITION  2 


Directional  Derivative 

The  directional  derivative  I\f  or  df/ ds  of  a function  fix,  y,  z)  at  a point  P in  the 
direction  of  a vector  b is  defined  by  (see  Fig.  215) 


(2) 


df 

Dbf  = ~T  = hm 
ds 


/(g)  -m 

s 


Here  Q is  a variable  point  on  the  straight  line  L in  the  direction  of  b,  and  |s|  is  the 
distance  between  P and  Q.  Also,  .v  > 0 if  Q lies  in  the  direction  of  b (as  in  Fig.  215), 
s < 0 if  Q lies  in  the  direction  of  — b,  and  s = 0 if  Q = P. 
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The  next  idea  is  to  use  Cartesian  ryz-coordinates  and  for  b a unit  vector.  Then  the  line  L 
is  given  by 

(3)  r(s)  = x(s)i  + yO)j  + z(s)k  = p0  + sb  (|b|  = 1) 

where  p0  the  position  vector  of  P.  Equation  (2)  now  shows  that  Dbf  = df/ds  is  the 
derivative  of  the  function  y(s),  z(s ))  with  respect  to  the  arc  length  s of  L.  Hence, 

assuming  that /has  continuous  partial  derivatives  and  applying  the  chain  rule  [formula 
(3)  in  the  previous  section],  we  obtain 


(4) 


A,/ 


df 

ds 


. V , 
+ y 

dy 


where  primes  denote  derivatives  with  respect  to  s (which  are  taken  at  .v  = 0).  But  here, 
differentiating  (3)  gives  r = xi  + yj+zk  = b.  Hence  (4)  is  simply  the  inner  product 
of  grad /and  b [see  (2),  Sec.  9.2];  that  is, 


(5) 

ATTENTION! 

(5*) 


df 

Dbf  = — = b • grad  / 
as 


(|b| 


If  the  direction  is  given  by  a vector  a of  any  length  (=£0),  then 


df  l 

Daf  = ~ = — a • grad/. 
ds  a 


1). 


Gradient.  Directional  Derivative 

Find  the  directional  derivative  of  f(x,  y,  z)  = 2x2  + 3y2  + z2  at  P:  (2,  1,  3)  in  the  direction  of  a = [1,  0,  —2], 

Solution.  grad/=  [4.v,  6v,  2 z]  gives  at  P the  vector  grad  /(P)  = [8,  6,  6],  From  this  and  (5*)  we  obtain, 
since  |a|  = V5, 


DJ(P)  = [1,  0,  -2]  • [8,  6,  6] 

V5 


1 

(8  + 0 - 12)  = 

V5 


4 

V5 


-1.789. 


The  minus  sign  indicates  that  at  P the  function  / is  decreasing  in  the  direction  of  a. 
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THEOREM  1 


PROOF 


Gradient  Is  a Vector.  Maximum  Increase 

Here  is  a finer  point  of  mathematics  that  concerns  the  consistency  of  our  theory:  grad  / 
in  (1)  looks  like  a vector — after  all,  it  has  three  components!  But  to  prove  that  it  actually 
is  a vector,  since  it  is  defined  in  terms  of  components  depending  on  the  Cartesian 
coordinates,  we  must  show  that  grad / has  a length  and  direction  independent  of  the  choice 
of  those  coordinates.  See  proof  of  Theorem  1 . In  contrast,  [df/ dx,  23// dy,  3// 3z]  also  looks 
like  a vector  but  does  not  have  a length  and  direction  independent  of  the  choice  of  Cartesian 
coordinates. 

Incidentally,  the  direction  makes  the  gradient  eminently  useful:  grad  / points  in  the 
direction  of  maximum  increase  off. 


Use  of  Gradient:  Direction  of  Maximum  Increase 

Letf(P)  = f(x,  y,  z ) be  a scalar  function  having  continuous  first  partial  derivatives 
in  some  domain  B in  space.  Then  grad  f exists  in  B and  is  a vector , that  is,  its  length 
and  direction  are  independent  of  the  particular  choice  of  Cartesian  coordinates.  If 
grad  f(P)  + 0 at  some  point  P,  it  has  the  direction  of  maximum  increase  of  f at  P. 


From  (5)  and  the  definition  of  inner  product  [(1)  in  Sec.  9.2]  we  have 

(6)  Dbf=  | b 1 1 grad /|  cos  y = |grad/|  cos  y 

where  y is  the  angle  between  b and  grad/.  Now  /is  a scalar  function.  Hence  its  value  at 
a point  P depends  on  P but  not  on  the  particular  choice  of  coordinates.  The  same  holds 
for  the  arc  length  s of  the  line  L in  Fig.  215,  hence  also  for  Db  f Now  (6)  shows  that  Dbf 
is  maximum  when  cos  y = 1,  y = 0,  and  then  Dbf  = | grad / 1 . It  follows  that  the  length 
and  direction  of  grad  / are  independent  of  the  choice  of  coordinates.  Since  y = 0 if  and 
only  if  b has  the  direction  of  grad  f the  latter  is  the  direction  of  maximum  increase  off 
at  P,  provided  grad  / T 0 at  P.  Make  sure  that  you  understood  the  proof  to  get  a good 
feel  for  mathematics. 

Gradient  as  Surface  Normal  Vector 

Gradients  have  an  important  application  in  connection  with  surfaces,  namely,  as  surface 
normal  vectors,  as  follows.  Let  S be  a surface  represented  by/(jt,  y,  z)  — c = const,  where 
/is  differentiable.  Such  a surface  is  called  a level  surface  of / and  for  different  c we  get 
different  level  surfaces.  Now  let  C be  a curve  on  S through  a point  P of  S.  As  a curve  in 
space,  C has  a representation  r(?)  = [x(t),  y(t),  z(t)].  For  C to  lie  on  the  surface  S,  the 
components  of  r it)  must  satisfy  fix,  y,  z)  = c,  that  is, 

(7)  f(xlt),  y(t),  z(t)  = c. 

Now  a tangent  vector  of  C is  r '(f)  = [x'  it),  y' it),  z.' it)\.  And  the  tangent  vectors  of  all 
curves  on  S passing  through  P will  generally  form  a plane,  called  the  tangent  plane  of  S 
at  P.  (Exceptions  occur  at  edges  or  cusps  of  S,  for  instance,  at  the  apex  of  the  cone  in 
Fig.  217.)  The  normal  of  this  plane  (the  straight  line  through  P perpendicular  to  the  tangent 
plane)  is  called  the  surface  normal  to  S at  P.  A vector  in  the  direction  of  the  surface 
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THEOREM  2 


EXAMPLE  2 


normal  is  called  a surface  normal  vector  of  S at  P.  We  can  obtain  such  a vector  quite 
simply  by  differentiating  (7)  with  respect  to  t.  By  the  chain  rule, 

df  , df  , df  , 

— x + —y  + — z = (grad/)  • r =0. 
dx  dy  dz 

Hence  grad  /is  orthogonal  to  all  the  vectors  r'  in  the  tangent  plane,  so  that  it  is  a normal 
vector  of  S at  P.  Our  result  is  as  follows  (see  Fig.  216). 


Fig.  216.  Gradient  as  surface  normal  vector 


Gradient  as  Surface  Normal  Vector 

Let  f be  a differentiable  scalar  function  in  space.  Letf(x,  y,  z)  = c = const  represent 
a surface  S.  Then  if  the  gradient  offat  a point  P of  S is  not  the  zero  vector,  it  is  a 
normal  vector  of  S at  P. 


Gradient  as  Surface  Normal  Vector.  Cone 

Find  a unit  normal  vector  n of  the  cone  of  revolution  :2  = 4(.r2  + y2)  at  the  point  P:  (1,  0,  2). 
Solution.  The  cone  is  the  level  surface/  = 0 of  fix,  y,  z)  = 4(.v2  + y2)  — z2.  Thus  (Fig.  217) 


grad/=  [8a\  8v,  — 2z],  grad  f(P)  = [8.  0,  -4] 


= ^radWgrad/(P)^ 


' 2 
Vf’ 


0, 


1 ' 

Vf  ' 


n points  downward  since  it  has  a negative  z-component.  The  other  unit  normal  vector  of  the  cone  at  P is  — n. 


Fig.  217.  Cone  and  unit  normal  vector  n 
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THEOREM  3 


PROOF 


Vector  Fields  That  Are  Gradients 
of  Scalar  Fields  (“Potentials”) 

At  the  beginning  of  this  section  we  mentioned  that  some  vector  fields  have  the  advantage 
that  they  can  be  obtained  from  scalar  fields,  which  can  be  worked  with  more  easily.  Such 
a vector  field  is  given  by  a vector  function  v(P),  which  is  obtained  as  the  gradient  of  a 
scalar  function,  say,  v( P)  = grad  f(P).  The  function /(P)  is  called  a potential  function  or 
a potential  of  v (P).  Such  a v (P)  and  the  corresponding  vector  field  are  called  conservative 
because  in  such  a vector  field,  energy  is  conserved;  that  is,  no  energy  is  lost  (or  gained) 
in  displacing  a body  (or  a charge  in  the  case  of  an  electrical  field)  from  a point  P to  another 
point  in  the  field  and  back  to  P.  We  show  this  in  Sec.  10.2. 

Conservative  fields  play  a central  role  in  physics  and  engineering.  A basic  application 
concerns  the  gravitational  force  (see  Example  3 in  Sec.  9.4)  and  we  show  that  it  has  a 
potential  which  satisfies  Laplace’s  equation,  the  most  important  partial  differential 
equation  in  physics  and  its  applications. 


Gravitational  Field.  Laplace’s  Equation 

The  force  of  attraction 

c 

[x  - x0  y - yo  z - zo] 

(8)  p = - — r = -c 

3 ’ 3 ’ 3 

r 

L r r r J 

between  two  particles  at  points  Pq\  (xo,  Vo,  Zo)  and  P:  (x,  y,  z)  (as  given  by  Newton’s 

law  of  gravitation)  has  the  potential  /(x,  y,  z)  = c/r,  where  r (>  0)  is  the  distance 

between  Pq  and  P. 

Thus  p = grad/  = grad  (c/r).  This  potential  f is  a solution  q/Laplace’s  equation 

2 df 

a2/  a2/ 

(9)  V2/=  — 

H 2 H 2 ~ °- 

dx2 

ay2  dz2 

[V2/'  (read  nabla  squared  f)  is  called  the  Laplacian  of /.] 

That  distance  is  r = ((x  — x0)2  + (y  — y0)2  + (z  ~ z2)2)'  2-  The  key  observation  now 
is  that  for  the  components  of  p = [pl5  p2,  p->,}  we  obtain  by  partial  differentiation 


(10a) 


d / 1\  _ — 2(x  — xo)  _ x — Xo 

dx  \r/  2[(x  - x0)2  + (y  ~ yo )2  + (z  - Zo)¥/2  r3 


and  similarly 


(10b) 


A(T\  = _ y ~ yo 

dy\rj  r3 

¥)=  -sp. 
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From  this  we  see  that,  indeed,  p is  the  gradient  of  the  scalar  function/  = c/r.  The  second 
statement  of  the  theorem  follows  by  partially  differentiating  (10),  that  is, 


d2  A \ = _ J_  3(x  - x0f 

dx2\rj  r3  r5 

dy  \rj  r3  r5 

jVA  1 | 3(z  — zp)2 

dz2\rj  r3  r5 


and  then  adding  these  three  expressions.  Their  common  denominator  is  r5.  Hence  the 
three  terms  — 1 /r3  contribute  —3 r2  to  the  numerator,  and  the  three  other  terms  give 
the  sum 


3(x  - x0)2  + 3(y  - y0)2  + 3(z  - z0f  = 3 r2. 


so  that  the  numerator  is  0,  and  we  obtain  (9). 

V2f  is  also  denoted  by  A/.  The  differential  operator 


(ID 


V2 


d2  d2  d2 

+ + — 

-.2  2 -,2 
ax  dy  dz 


(read  “nabla  squared”  or  “delta”)  is  called  the  Laplace  operator.  It  can  be  shown  that  the 
field  of  force  produced  by  any  distribution  of  masses  is  given  by  a vector  function  that  is 
the  gradient  of  a scalar  function/,  and / satisfies  (9)  in  any  region  that  is  free  of  matter. 

The  great  importance  of  the  Laplace  equation  also  results  from  the  fact  that  there  are 
other  laws  in  physics  that  are  of  the  same  form  as  Newton’s  law  of  gravitation.  For  instance, 
in  electrostatics  the  force  of  attraction  (or  repulsion)  between  two  particles  of  opposite  (or 
like)  charge  Q1  and  Q 2 is 


(12) 


(Coulomb’s  law6). 


Laplace’s  equation  will  be  discussed  in  detail  in  Chaps.  12  and  18. 

A method  for  finding  out  whether  a given  vector  field  has  a potential  will  be  explained 
in  Sec.  9.9. 


6CHARLES  AUGUSTIN  DE  COULOMB  (1 736-1 S06),  French  physicist  and  engineer.  Coulomb’s  law  was 
derived  by  him  from  his  own  very  precise  measurements. 
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1-6 


CALCULATION  OF  GRADIENTS 


Find  grad  /.  Graph  some  level  curves  / = const.  Indicate 
V/ by  arrows  at  some  points  of  these  curves. 

1 ./=(*+  l)(2y  - 1) 

2.  / = 9x2  + 4y2 

3.  / = y/x 

4.  (y  + 6)2  + (x  - 4)2 


5./=x4  + y4 
6 ./=(x2-y2)/(x2  + y2) 


7-10 


USEFUL  FORMULAS  FOR  GRADIENT 


AND  LAPLACIAN 


Prove  and  illustrate  by  an  example. 
7.  V(/n)  = nfn-^f 


8-  V(/g)  = /Vg  + gV/ 

9-  V(//g)  = (l/g2)(gV/-/Vg) 

10.  V2(/g)  = gV2/  + 2V/-  Vg  +/V2g 


11-15 


USE  OF  GRADIENTS.  ELECTRIC  FORCE 


The  force  in  an  electrostatic  field  given  by/(x,  y,  z)  has  the 
direction  of  the  gradient.  Find  V/  and  its  value  at  P. 


11.  /=  xy,  P:  (—4,  5) 

12-  / = x/(x2  + y2),  P:  (1,  1) 

13.  /=  ln(x2  + y2),  P:  (8,  6) 

14.  / = (x2  + y2  + z2)_1/2  P ■ (12,  0,  16) 

15.  / = 4x2  + 9y2  + zz,  P:(5, -1,-11) 

16.  For  what  points  P:  (x,  y,  z)  does  Vf  with 
/ = 25x2  + 9y2  + 16c2  have  the  direction  from  P to 
the  origin? 

17.  Same  question  as  in  Prob.  16  when/=  25x2  + 4y2. 


18-23 


VELOCITY  FIELDS 


Given  the  velocity  potential /of  a flow,  find  the  velocity 
v = V/ of  the  field  and  its  value  \(P)  at  P.  Sketch  \(P) 
and  the  curve  / = const  passing  through  P. 

18.  /=  x2  - 6x  - y2,  P:  (—1,  5) 

19.  / = cos  x cosh  y,  P:  (|  7 r,  In  2) 

20.  / = x(l  + (x2  + y2)-1),  P:  (1,  1) 

21.  f = ex  cos  y,  P:(l,§7r) 


22.  At  what  points  is  the  flow  in  Prob.  21  directed  vertically 
upward? 


23.  At  what  points  is  the  flow  in  Prob.  21  horizontal? 


24-27 


HEAT  FLOW 


Experiments  show  that  in  a temperature  field,  heat  flows  in 
the  direction  of  maximum  decrease  of  temperature  T.  Find 
this  direction  in  general  and  at  the  given  point  P.  Sketch 
that  direction  at  P as  an  arrow. 

24.  T = 3x2  - 2y2,  P:  (2.5,  1.8) 

25.  T = z/(x2  + y2),  P:  (0,1,2) 

26.  T = x2  + y2  + 4c2,  P:  (2,-1,  2) 

27.  CAS  PROJECT.  Isotherms.  Graph  some  curves  of 
constant  temperature  (“isotherms”)  and  indicate 
directions  of  heat  flow  by  arrows  when  the  temperature 
equals  (a)  x3  — 3xv2,  (b)  sin  x sinh  y,  and  (c)  excos  y. 

28.  Steepest  ascent.  If  c(x,  y)  = 3000  - x2  - 9y2 
[meters]  gives  the  elevation  of  a mountain  at  sea  level, 
what  is  the  direction  of  steepest  ascent  at  P:  (4,  1)? 


29.  Gradient.  What  does  it  mean  if  | V/(P)  | > | V/(g)  I at 
two  points  P and  Q in  a scalar  field? 


9.8  Divergence  of  a Vector  Field 

Vector  calculus  owes  much  of  its  importance  in  engineering  and  physics  to  the  gradient, 
divergence,  and  curl.  From  a scalar  field  we  can  obtain  a vector  field  by  the  gradient 
(Sec.  9.7).  Conversely,  from  a vector  field  we  can  obtain  a scalar  field  by  the  divergence 
or  another  vector  field  by  the  curl  (to  be  discussed  in  Sec.  9.9).  These  concepts  were 
suggested  by  basic  physical  applications.  This  will  be  evident  from  our  examples. 

To  begin,  let  v(x,  y,  z ) be  a differentiable  vector  function,  where  x,  y,  z are  Cartesian 
coordinates,  and  let  1^,  v2,  be  the  components  of  v.  Then  the  function 


dUj 

dx 


+ 


dv2 

dy 


+ 


dV3 

dz 


(1) 


div  v = 
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is  called  the  divergence  of  v or  the  divergence  of  the  vector  field  defined  by  v.  For 
example,  if 

v = [3 xz,  2 xy,  — yz2]  = 3xzi  + 2xyj  — yz2 k,  then  div  y = 3z  + 2x  — 2 yz. 
Another  common  notation  for  the  divergence  is 


div  v 


V • v = 


'_a_  A A' 

dx  ’ dy  ’ dz 


( d 

d 

d , 

— 

i + — 

j + — k 

\dx 

dy 

dz  . 

dv1 

dv2 

K 

<■0 

— 

+ 

+ , 

dx 

dy 

dz 

• [Ui,  V2,  U3] 

(uii  + u2j  + u3k) 


with  the  understanding  that  the  “product”  {d/dx)vi  in  the  dot  product  means  the  partial 
derivative  dvi/dx,  etc.  This  is  a convenient  notation,  but  nothing  more.  Note  that  V • v 
means  the  scalar  div  v,  whereas  V/  means  the  vector  grad /defined  in  Sec.  9.7. 

In  Example  2 we  shall  see  that  the  divergence  has  an  important  physical  meaning. 
Clearly,  the  values  of  a function  that  characterizes  a physical  or  geometric  property  must 
be  independent  of  the  particular  choice  of  coordinates.  In  other  words,  these  values  must 
be  invariant  with  respect  to  coordinate  transformations.  Accordingly,  the  following 
theorem  should  hold. 


THEOREM  1 


Invariance  of  the  Divergence 

The  divergence  div  v is  a scalar  function,  that  is,  its  values  depend  only  on  the 
points  in  space  {and,  of  course,  on  v)  but  not  on  the  choice  of  the  coordinates  in 
(1),  so  that  with  respect  to  other  Cartesian  coordinates  x*,  y*,  z*  and  corresponding 
components  vf*,  v2*,  U3*  o/v, 


(2) 


dV$  dv2  dV3 

div  v = 1 1 . 

dx*  dy*  dz* 


We  shall  prove  this  theorem  in  Sec.  10.7,  using  integrals. 

Presently,  let  us  turn  to  the  more  immediate  practical  task  of  gaining  a feel  for  the 
significance  of  the  divergence.  Letf(x,  y,  z)  be  a twice  differentiable  scalar  function.  Then 
its  gradient  exists. 


df  df  df 
— i + — j + — k 
dx  dy  dz 

and  we  can  differentiate  once  more,  the  first  component  with  respect  to  x,  the  second  with 
respect  to  y,  the  third  with  respect  to  z,  and  then  form  the  divergence, 


v = grad/  = 


V V V 

dx’  dy’  dz 


a2/  a2/ 

dx2  dy2  dz2 ' 


div  v = div  (grad/) 
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EXAMPLE  1 


EXAMPLE  2 


Hence  we  have  the  basic  result  that  the  divergence  of  the  gradient  is  the  Laplacian 
(Sec.  9.7), 

(3)  div  (grad/)  = V2/. 


Gravitational  Force.  Laplace’s  Equation 

The  gravitational  force  p in  Theorem  3 of  the  last  section  is  the  gradient  of  the  scalar  function /(x,  y,  z)  = c/r, 
which  satisfies  Laplaces  equation  V y = 0.  According  to  (3)  this  implies  that  div  p = 0 (r  > 0). 

The  following  example  from  hydrodynamics  shows  the  physical  significance  of  the 
divergence  of  a vector  field.  We  shall  get  back  to  this  topic  in  Sec.  10.8  and  add  further 
physical  details. 

Flow  of  a Compressible  Fluid.  Physical  Meaning  of  the  Divergence 

We  consider  the  motion  of  a fluid  in  a region  R having  no  sources  or  sinks  in  R,  that  is,  no  points  at  which 
fluid  is  produced  or  disappears.  The  concept  of  fluid  state  is  meant  to  cover  also  gases  and  vapors.  Fluids  in 
the  restricted  sense,  or  liquids,  such  as  water  or  oil,  have  very  small  compressibility,  which  can  be  neglected  in 
many  problems.  In  contrast,  gases  and  vapors  have  high  compressibility.  Their  density  p (=  mass  per  unit  volume) 
depends  on  the  coordinates  x,  y,  z in  space  and  may  also  depend  on  time  t.  We  assume  that  our  fluid  is 
compressible.  We  consider  the  flow  through  a rectangular  box  B of  small  edges  Ax,  Ay,  A z parallel  to  the 
coordinate  axes  as  shown  in  Fig.  218.  (Here  A is  a standard  notation  for  small  quantities  and,  of  course,  has 
nothing  to  do  with  the  notation  for  the  Laplacian  in  (1 1)  of  Sec.  9.7.)  The  box  B has  the  volume  AV  = Ax  Ay  A z. 
Let  v = [vi,  v2, 1^3]  = i^ii  + i>2j  + be  the  velocity  vector  of  the  motion.  We  set 

(4)  u = pv  = [mi,  u2,  u3\  = u ii  + u2 j + u3k 

and  assume  that  u and  v are  continuously  differentiable  vector  functions  of  x,  y,  z,  and  t,  that  is,  they  have  first 
partial  derivatives  which  are  continuous.  Let  us  calculate  the  change  in  the  mass  included  in  B by  considering 
the  flux  across  the  boundary,  that  is,  the  total  loss  of  mass  leaving  B per  unit  time.  Consider  the  flow  through 
the  left  of  the  three  faces  of  B that  are  visible  in  Fig.  218,  whose  area  is  Ax  A z.  Since  the  vectors  Uji  and  v3  k 
are  parallel  to  that  face,  the  components  Vi  and  v3  of  v contribute  nothing  to  this  flow.  Hence  the  mass  of  fluid 
entering  through  that  face  during  a short  time  interval  A?  is  given  approximately  by 

( pv2)y  Ax  A z At  = ( u2)y  Ax  A z At, 

where  the  subscript  y indicates  that  this  expression  refers  to  the  left  face.  The  mass  of  fluid  leaving  the  box  B 
through  the  opposite  face  during  the  same  time  interval  is  approximately  (u2)y+^y  Ax  A z At,  where  the  subscript 
y + Ay  indicates  that  this  expression  refers  to  the  right  face  (which  is  not  visible  in  Fig.  218).  The  difference 

A u2 

Au2  Ax  A z At  = — — AV  At  [A u2  = (u2)y+Ay  - (u2)y] 

Ay 

is  the  approximate  loss  of  mass.  Two  similar  expressions  are  obtained  by  considering  the  other  two  pairs  of 
parallel  faces  of  B.  If  we  add  these  three  expressions,  we  find  that  the  total  loss  of  mass  in  B during  the  time 
interval  A?  is  approximately 


/ Aui  A u2  Au3\ 

+ + AV  At, 

\ Ax  Ay  A z J 


where 


A «i  = («i)*+a*  - (Ml)*  and  A u3  = ( u3)z+Az  - («3)z. 

This  loss  of  mass  in  B is  caused  by  the  time  rate  of  change  of  the  density  and  is  thus  equal  to 


dp 

— AVAf. 
dt 
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Fig.  218.  Physical  interpretation  of  the  divergence 

If  we  equate  both  expressions,  divide  the  resulting  equation  by  AV  At,  and  let  Ax,  Ay,  A z,  and  At  approach 
zero,  then  we  obtain 

dp 

div  u = div  (pv)  = — — 


or 

dp 

(5)  — + div  (pv)  = 0. 

at 

This  important  relation  is  called  the  condition  for  the  conservation  of  mass  or  the  continuity  equation  of  a 
compressible  fluid  flow. 

If  the  flow  is  steady,  that  is,  independent  of  time,  then  dp/ dt  = 0 and  the  continuity  equation  is 

(6)  div  (pv)  = 0. 

If  the  density  p is  constant,  so  that  the  fluid  is  incompressible,  then  equation  (6)  becomes 

(7)  div  v = 0. 

This  relation  is  known  as  the  condition  of  incompressibility.  It  expresses  the  fact  that  the  balance  of  outflow 
and  inflow  for  a given  volume  element  is  zero  at  any  time.  Clearly,  the  assumption  that  the  flow  has  no  sources 
or  sinks  in  R is  essential  to  our  argument,  v is  also  referred  to  as  solenoidal. 

From  this  discussion  you  should  conclude  and  remember  that,  roughly  speaking,  the  divergence  measures 
outflow  minus  inflow. 

Comment.  The  divergence  theorem  of  Gauss,  an  integral  theorem  involving  the 
divergence,  follows  in  the  next  chapter  (Sec.  10.7). 


CALCULATION  OF  THE  DIVERGENCE 
Find  div  v and  its  value  at  P. 

1.  v = [x2,  4y2,  9z\  P:  (-1,0,  J] 

2.  v = [0,  cos  xyz,  sin  xyz],  P:  (2,  \ tt,  0] 

3.  v = (x2  + v2rW] 

4.  v = [ux(y,  z),  v2(z,  x ),  v3(x,  y)],  P:  (3,  1,  -1)] 


5.  v = x2y2z2[x,  y,  z],  P:  (3,-1,  4) 

6.  v = (jc2  + y2  + z2r3/2[x,  y,  z] 

7.  For  what  v3  is  v = [ex  cos  y,  ex  sin  y,  U3]  solenoidal? 

8.  Let  v = [x,  y,  u3].  Find  a v3  such  that  (a)  div  v > 0 
everywhere,  (b)  div  v > 0 if  |z|  < 1 and  div  v < 0 if 
Izl  > 1. 
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9.  PROJECT.  Useful  Formulas  for  the  Divergence. 

Prove 

(a)  div  (fcv)  = k div  v ( k constant) 

(b)  div  (/v)  = /div  v + v • V/ 

(c)  div  (/Vg)  = /V2g  + V/*  Vg 

(d)  div  (/Vg)  - div  (gV/)  =/V2g  - gV2/ 

Verify  (b)  for  / = and  v = ax i + by j + czk. 
Obtain  the  answer  to  Prob.  6 from  (b).  Verify  (c)  for 
/ = x2  — y2  and  g = ex+v.  Give  examples  of  your 
own  for  which  (a)-(d)  are  advantageous. 

10.  CAS  EXPERIMENT.  Visualizing  the  Divergence. 
Graph  the  given  velocity  field  v of  a fluid  flow  in  a 
square  centered  at  the  origin  with  sides  parallel  to  the 
coordinate  axes.  Recall  that  the  divergence  measures 
outflow  minus  inflow.  By  looking  at  the  flow  near  the 
sides  of  the  square,  can  you  see  whether  div  v must 
be  positive  or  negative  or  may  perhaps  be  zero?  Then 
calculate  div  v.  First  do  the  given  flows  and  then  do 
some  of  your  own.  Enjoy  it. 

(a)  v = i 

(b)  v = xi 

(c)  V = x\  - yj 

(d)  v = xi  + yj 

(e)  v = -xi  - yj 

(f)  v = (x2  + y2)_1(-yi  + xj) 

11.  Incompressible  flow.  Show  that  the  flow  with  velocity 
vector  v = yi  is  incompressible.  Show  that  the  particles 


that  at  time  t = 0 are  in  the  cube  whose  faces  are 
portions  of  the  planes  x = 0,  x = l,y  = 0,  y = 1, 
z = 0,  z — 1 occupy  at  t = 1 the  volume  1. 

12.  Compressible  flow.  Consider  the  flow  with  velocity 
vector  v = xi.  Show  that  the  individual  particles  have 
the  position  vectors  r(f)  = cyeH  + c2j  + c3k  with 
constant  ci,  c2,  c3.  Show  that  the  particles  that  at 
t — 0 are  in  the  cube  of  Prob.  1 1 at  t = 1 occupy  the 
volume  e. 

13.  Rotational  flow.  The  velocity  vector  v ( x , y,  z)  of  an 
incompressible  fluid  rotating  in  a cylindrical  vessel  is 
of  the  form  v = w X r,  where  w is  the  (constant) 
rotation  vector;  see  Example  5 in  Sec.  9.3.  Show  that 
div  v = 0.  Is  this  plausible  because  of  our  present 
Example  2? 

14.  Does  div  u = div  v imply  u = v or  u = v + k 
(k  constant)?  Give  reason. 
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LAPLACIAN 


Calculate  V2/'  by  Eq.  (3).  Check  by  direct  differentiation. 
Indicate  when  (3)  is  simpler.  Show  the  details  of  your  work. 

15.  / = cos2  x + sin2  y 

16. /=  exyz 


17.  / = In  (xz  + y2) 

18. /  = z - Vx2  + y2 

19.  / = l/(x2  + y2  + z2) 

20.  / = e2x  cosh  2y 


Curl  of  a Vector  Field 


The  concepts  of  gradient  (Sec.  9.7),  divergence  (Sec.  9.8),  and  curl  are  of  fundamental 
importance  in  vector  calculus  and  frequently  applied  in  vector  fields.  In  this  section 
we  define  and  discuss  the  concept  of  the  curl  and  apply  it  to  several  engineering 
problems. 

Let  v(x,y,  z ) = [i1 1,  v2,  t'3]  = U]i  + u2j  + u3k  be  a differentiable  vector  function  of 
the  Cartesian  coordinates  x,  y,  z.  Then  the  curl  of  the  vector  function  v or  of  the  vector 
field  given  by  v is  defined  by  the  “symbolic”  determinant 


curl  v 


V x v = 


d_ 

dx 


Vl 


j 

_a_ 

dy 

V2 


k 

_d_ 

dz 

V3 


/dv3  dv2\  /dUi  dl>3\  fdv2  dfi\ 
\ dy  dz  J1  V dz  dx  P V dx  dy  J 


(1) 
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EXAMPLE  1 


EXAMPLE  2 


THEOREM  1 


This  is  the  formula  when  x,  y,  z are  right-handed.  If  they  are  left-handed,  the  determinant 
has  a minus  sign  in  front  (just  as  in  (2**)  in  Sec.  9.3). 

Instead  of  curl  v one  also  uses  the  notation  rot  v.  This  is  suggested  by  “rotation,” 
an  application  explored  in  Example  2.  Note  that  curl  v is  a vector,  as  shown  in 
Theorem  3. 


Curl  of  a Vector  Function 


Let  v = [yz , 3 zx,  z ] = yz\  + 3zxj  + zk  with  right-handed  x,  y,  z.  Then  (1)  gives 


1 

i 

k 

a 

a 

a 

dx 

dy 

dz 

yz 

3 zx 

z 

-3xi  + yj  + (3 z ~ z) k = -3xi  + yj  + 2zk. 


The  curl  has  many  applications.  A typical  example  follows.  More  about  the  nature  and 
significance  of  the  curl  will  be  considered  in  Sec.  10.9. 


Rotation  of  a Rigid  Body.  Relation  to  the  Curl 

We  have  seen  in  Example  5,  Sec.  9.3,  that  a rotation  of  a rigid  body  B about  a fixed  axis  in  space  can  be 
described  by  a vector  w of  magnitude  co  in  the  direction  of  the  axis  of  rotation,  where  co  (>0)  is  the  angular 
speed  of  the  rotation,  and  w is  directed  so  that  the  rotation  appears  clockwise  if  we  look  in  the  direction  of  w. 
According  to  (9),  Sec.  9.3,  the  velocity  field  of  the  rotation  can  be  represented  in  the  form 

v = w x r 


where  r is  the  position  vector  of  a moving  point  with  respect  to  a Cartesian  coordinate  system  having  the  origin 
on  the  axis  of  rotation.  Let  us  choose  right-handed  Cartesian  coordinates  such  that  the  axis  of  rotation  is  the 
z-axis.  Then  (see  Example  2 in  Sec.  9.4) 

w = [0,  0,  co]  = cuk,  v = w X r = [— coy,  cox,  0]  = - coy\  + cox]. 

Hence 


curl  v = 


a 

dx 


-coy 


a 


— [0,  0,  2 co\  = look.  = 2w. 


This  proves  the  following  theorem. 


Rotating  Body  and  Curl 

The  curl  of  the  velocity  field  of  a rotating  rigid  body  has  the  direction  of 
the  axis  of  the  rotation,  and  its  magnitude  equals  twice  the  angular  speed  of  the 
rotation. 


Next  we  show  how  the  grad,  div,  and  curl  are  interrelated,  thereby  shedding  further  light 
on  the  nature  of  the  curl. 
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THEOREM  2 


Grad,  Div,  Curl 

Gradient  fields  are  irrotational . That  is,  if  a continuously  differentiable  vector 
function  is  the  gradient  of  a scalar  function  f then  its  curl  is  the  zero  vector, 

(2)  curl  (grad/)  = 0. 

Furthermore,  the  divergence  of  the  curl  of  a twice  continuously  differentiable  vector 
function  v is  zero, 

(3)  div  (curl  v)  = 0. 


PROOF 


Both  (2)  and  (3)  follow  directly  from  the  definitions  by  straightforward  calculation.  In  the 
proof  of  (3)  the  six  terms  cancel  in  pairs.  ■ 


EXAMPLE  3 Rotational  and  Irrotational  Fields 

The  field  in  Example  2 is  not  irrotational.  A similar  velocity  field  is  obtained  by  stirring  tea  or  coffee  in  a cup. 
The  gravitational  field  in  Theorem  3 of  Sec.  9.7  has  curl  p = 0.  It  is  an  irrotational  gradient  field. 


The  term  “irrotational”  for  curl  v = 0 is  suggested  by  the  use  of  the  curl  for  characterizing 
the  rotation  in  a field.  If  a gradient  field  occurs  elsewhere,  not  as  a velocity  field,  it  is 
usually  called  conservative  (see  Sec.  9.7).  Relation  (3)  is  plausible  because  of  the 
interpretation  of  the  curl  as  a rotation  and  of  the  divergence  as  a flux  (see  Example  2 in 
Sec.  9.8). 

Finally,  since  the  curl  is  defined  in  terms  of  coordinates,  we  should  do  what  we  did  for 
the  gradient  in  Sec.  9.7,  namely,  to  find  out  whether  the  curl  is  a vector.  This  is  true,  as 
follows. 


THEOREM  3 


Invariance  of  the  Curl 

curl  v is  a vector.  It  has  a length  and  a direction  that  are  independent  of  the  particular 
choice  of  a Cartesian  coordinate  system  in  space. 


PROOF  The  proof  is  quite  involved  and  shown  in  App.  4. 

We  have  completed  our  discussion  of  vector  differential  calculus.  The  companion 
Chap.  10  on  vector  integral  calculus  follows  and  makes  use  of  many  concepts  covered 
in  this  chapter,  including  dot  and  cross  products,  parametric  representation  of  curves  C, 
along  with  grad,  div,  and  curl. 


PR  Q BEEMSET  9 9 


1.  WRITING  REPORT.  Grad,  div,  curl.  List  the 
definitions  and  most  important  facts  and  formulas  for 
grad,  div,  curl,  and  V1 2.  Use  your  list  to  write  a 
corresponding  report  of  3^4  pages,  with  examples  of 

your  own.  No  proofs. 


2.  (a)  What  direction  does  curl  v have  if  v is  parallel 
to  the  yz-plane?  (b)  If,  moreover,  v is  independent 
of  xl 

3.  Prove  Theorem  2.  Give  two  examples  for  (2)  and  (3) 
each. 


Chapter  9 Review  Questions  and  Problems 
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CALCULUTION  OF  CURL 


Find  curl  v for  v given  with  respect  to  right-handed 
Cartesian  coordinates.  Show  the  details  of  your  work. 

4.  v = [2y2,  5x,  0] 

5.  v = xyz  [x,  y,  z] 

6.  v = ( x 2 + y2  + zY3/2  [x,  y,  z] 

7.  v = [0,  0,  e~x  sin  y] 

8.  v = \_e~z\  e~ e-y2] 
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FLUID  FLOW 


Let  v be  the  velocity  vector  of  a steady  fluid  flow.  Is  the 
flow  irrotational?  Incompressible?  Find  the  streamlines  (the 
paths  of  the  particles).  Hint.  See  the  answers  to  Probs.  9 
and  1 1 for  a determination  of  a path. 

9.  v = [0,  3z2,  0] 


10.  v = [sec  x,  esc  x,  0] 


11.  v = [y,  —2x,  0] 

12.  v = [—  y,  x,  7 r] 

13.  v = [x,  y,  — z\ 


14.  PROJECT.  Useful  Formulas  for  the  Curl.  Assuming 
sufficient  differentiability,  show  that 

(a)  curl  (u  + v)  = curl  u + curl  v 

(b)  div  (curl  v)  = 0 

(c)  curl  (/v)  = (grad/)  X v + / curl  v 

(d)  curl  (grad/)  = 0 

(e)  div  (u  X v)  = v • curl  u — u • curl  v 
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DIV  AND  CURL 


With  respect  to  right-handed  coordinates,  let  u = \y,  z,  x], 
v = [yz,  zx,  xy],/  = xyz,  and  g = x + y + z-  Find  the  given 
expressions.  Check  your  result  by  a formula  in  Proj.  14 
if  applicable. 


15.  curl  (u  + v),  curl  v 


16.  curl  (gv) 

17.  v • curl  u,  u • curl  v,  u • curl  u 


18.  div  (u  x v) 


19.  curl  (gu  + v),  curl  (gu) 

20.  div  (grad  (/g)) 


mKB9REMElLQUES  T I O N S AND  PROBLEMS 


1.  What  is  a vector?  A vector  function?  A vector  field?  A 
scalar?  A scalar  function?  A scalar  field?  Give  examples. 

2.  What  is  an  inner  product,  a vector  product,  a scalar  triple 
product?  What  applications  motivate  these  products? 

3.  What  are  right-handed  and  left-handed  coordinates? 
When  is  this  distinction  important? 

4.  When  is  a vector  product  the  zero  vector?  What  is 
orthogonality? 

5.  How  is  the  derivative  of  a vector  function  defined? 
What  is  its  significance  in  geometry  and  mechanics? 

6.  If  r (t)  represents  a motion,  what  are  r'(t),  |r,(t)|,  r " (t), 
and  |rw(f)|? 

7.  Can  a moving  body  have  constant  speed  but  variable 
velocity?  Nonzero  acceleration? 

8.  What  do  you  know  about  directional  derivatives?  Their 
relation  to  the  gradient? 

9.  Write  down  the  definitions  and  explain  the  significance 
of  grad,  div,  and  curl. 

10.  Granted  sufficient  differentiability,  which  of  the 
following  expressions  make  sense?  / curl  v,  v curl  f 
U X V,  U X V X w,  /•  V,  /•  (v  x w),  u • (v  X w), 
v x curl  v,  div  (/v),  curl  (/v),  and  curl  (/•  v). 

ALGEBRAIC  OPERATIONS  FOR  VECTORS 

Let  a = [4,  7,  0],  b = [3,  -1,  5],  c = [-6,  2,  0],  and  d = 
[1,  — 2,  8].  Calculate  the  following  expressions.  Try  to 
make  a sketch. 

11.  a • c,  3b  • 8d,  24d  • b,  a • a 


12.  a x c,  b x d,  d x b,  a x a 

13.  b x c,  c x b,  c x c,  c • c 

14.  5(a  x b)  • c,  a • (5b  x c),  (5a  b c),  5(a  • b)  x c 

15.  6(a  x b)  x d,  a x 6(b  x d),  2a  x 3b  x d 

16.  (l/|a|)a,  ( 1/ 1 b | )b,  a * b/ 1 b | , a * b/ 1 a | 

17.  (a  b d),  (b  a d),  (b  d a) 

18.  |a  + b|,  |a|  + |b| 

19.  a x b — b x a,  (a  x c)  • c,  |a  x b| 

20.  Commutativity.  When  is  u x v = v x u?  When  is 
u • v = v • u? 

21.  Resultant,  equilibrium.  Find  u such  that  u and  a,  b, 

c,  d above  and  u are  in  equilibrium. 

22.  Resultant.  Find  the  most  general  v such  that  the  resultant 
of  v,  a,  b,  c (see  above)  is  parallel  to  the  yz-plane. 

23.  Angle.  Find  the  angle  between  a and  c.  Between  b and 

d.  Sketch  a and  c. 

24.  Planes.  Find  the  angle  between  the  two  planes 
P\.  4x  — y + 3 z = 12andP2:  x + 2y  + 4z  = 4.  Make 
a sketch. 

25.  Work.  Find  the  work  done  by  q = [5,  2,  0]  in  the 
displacement  from  (1,  1,  0)  to  (4,  3,  0). 

26.  Component.  When  is  the  component  of  a vector  v in 
the  direction  of  a vector  w equal  to  the  component  of 
w in  the  direction  of  v? 

27.  Component.  Find  the  component  of  v = [4,  7,  0]  in 
the  direction  of  w = [2,  2,  0].  Sketch  it. 
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28.  Moment.  When  is  the  moment  of  a force  equal  to  zero? 

29.  Moment.  A force  p = [4,  2,  0]  is  acting  in  a line 
through  (2,  3,  0).  Find  its  moment  vector  about  the 
center  (5,  1,  0)  of  a wheel. 

30.  Velocity,  acceleration.  Find  the  velocity,  speed, 
and  acceleration  of  the  motion  given  by  r (?)  = [3 
cos  t,  3 sin  T,  At]  ( t = time)  at  the  point  P : (3/V2, 
3/ V2,  7 r). 

31.  Tetrahedron.  Find  the  volume  if  the  vertices  are 
(0,  0,  0),  (3,  1,  2),  (2,  4,  0),  (5,  4,  0). 


32-40 


GRAD,  DIV,  CURL,  V2,  Dvf 


Let  f — xv  — yz,  v = [2y,  2z,  Ax  + z],  and  w = [3z2, 
x2  - v2,  >'2].  Find: 

32.  grad  / and  / grad  / at  P:  (2,  7,  0) 

33.  div  v,  div  w 34.  curl  v,  curl  w 

35.  div  (grad/),  V2/  V2(xy/) 

36.  (curl  w)  • v at  (4,  0,  2)  37.  grad  (div  w) 

38.  Dvf  at  P:  (1,  1,  2)  39.  Dwf  at  P:  (3,  0,  2) 

40.  v • ((curl  w)  x v) 


SUMMARY  OF  CHAPTER  9 

Vector  Differential  Calculus.  Grad,  Div,  Curl 


All  vectors  of  the  form  a = [aq,  a2,  <33]  = a\i  + a2 j + a3k  constitute  the  real 
vector  space  R3  with  componentwise  vector  addition 

(1)  [ai,  a2,  a3]  + [b1}  b2,  b3]  = [ax  + bx,  a2  + b2,  a3  + b3\ 
and  componentwise  scalar  multiplication  (c  a scalar,  a real  number) 

(2)  c[flr,  a2,  a3]  = [cai,  ca2,  ca3]  (Sec.  9.1). 

For  instance,  the  resultant  of  forces  a and  b is  the  sum  a + b. 

The  inner  product  or  dot  product  of  two  vectors  is  defined  by 

(3)  a • b = | a 1 1 b | cos  y = a^b-i  + a2b2  + a3b3  (Sec.  9.2) 

where  y is  the  angle  between  a and  b.  This  gives  for  the  norm  or  length  |a|  of  a 

(4)  | a | = Va  • a = Vaf  + a\  + a3 

as  well  as  a formula  for  y.  If  a • b = 0,  we  call  a and  b orthogonal.  The  dot  product 
is  suggested  by  the  work  W = p • d done  by  a force  p in  a displacement  d. 

The  vector  product  or  cross  product  v = a x b is  a vector  of  length 

(5)  |a  x b|  = | a | |b|  sin  y (Sec.  9.3) 

and  perpendicular  to  both  a and  b such  that  a,  b,  v form  a right-handed  triple.  In 
terms  of  components  with  respect  to  right-handed  coordinates, 

i j k 

Cl\  &2  CI3 

b\  b2  b3 


(6) 


a x b = 


(Sec.  9.3). 


Summary  of  Chapter  9 
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The  vector  product  is  suggested,  for  instance,  by  moments  of  forces  or  by  rotations. 
CAUTION  This  multiplication  is  anhcommutative,  axb  = — b X a,  and  is  not 
associative. 

An  (oblique)  box  with  edges  a,  b,  c has  volume  equal  to  the  absolute  value  of 

the  scalar  triple  product 

(7)  (a  b c)  = a • (b  x c)  = (a  x b)  • c. 

Sections  9. 4-9. 9 extend  differential  calculus  to  vector  functions 

v(f)  = [ui(t),  v2(t),  u3(f)]  = Ui(f)i  + v2(t) j + u3(f)k 

and  to  vector  functions  of  more  than  one  variable  (see  below).  The  derivative  of 
v(f)  is 


(8) 


= = lim 

dt 


v(r  + At)  - v(f) 
At 


[v[,  v2,  ^ 3]  = vii  + v2  j + u3k. 


Differentiation  rules  are  as  in  calculus.  They  imply  (Sec.  9.4) 

(u  • v)'  = u'  • v + u • v',  (u  x v)'  = u'  x v + u X v'. 

Curves  C in  space  represented  by  the  position  vector  r(t)  have  r \i)  as  a tangent 
vector  (the  velocity  in  mechanics  when  t is  time),  r (s)  (.v  arc  length,  Sec.  9.5)  as 
the  unit  tangent  vector,  and  |r,,(^)|  = k as  the  curvature  (the  acceleration  in 
mechanics). 

Vector  functions  v (x,  y,  z ) = [i-’i  (x,  y,  z),  v2(x,  y,  z),  u2(x,  y,  ")]  represent  vector 
fields  in  space.  Partial  derivatives  with  respect  to  the  Cartesian  coordinates  x,  y,  z 
are  obtained  componentwise,  for  instance, 


d\ 

dx 


dVi  dv2  dv3 
dx  dx  dx 


dVi  dv2  dv3 

i H j + k 

dx  dx  dx 


The  gradient  of  a scalar  function  / is 

(9)  grad  / = V/  = 

The  directional  derivative  of  / in  the  direction  of  a vector  a is 

(10) 

The  divergence  of  a vector  function  v is 

dVi  dV2  dv3 

(11)  div  v = V • v = + + . 

dx  dy  dz 


'df  V df 
dx  ’ dy’  dz 


df  1 

DJ=-  = — a-V/ 

ds  a 


(Sec.  9.6). 


(Sec.  9.7). 


(Sec.  9.7). 


(Sec.  9.8). 
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The  curl  of  v is 

i j k 

(12) 

curl  v = V x v = 

d d d 

dx  dy  dz 

(Sec.  9.9) 

v1  v2  v3 

or  minus  the  determinant  if  the  coordinates  are  left-handed. 
Some  basic  formulas  for  grad,  div,  curl  are  (Secs.  9. 7-9. 9) 

(13) 

V(/g)  = /Vg  + gVf 
V(//g)  = d/g2)(gV/-/Vg) 

(14) 

div  (/v)  = / div  v + v • V/ 

div  (/Vg)  =/V2g  + V/*  Vg 

(15) 

V2/=  div  (V/) 

V2(/g)  = gV2/+2V/*Vg+/V2g 

(16) 

curl  (/v)  = V/  X v + / curl  v 
div  (u  X v)  = v • curl  u — u • curl  v 

(17) 

curl  (V/)  = 0 
div  (curl  v)  = 0. 

For  grad,  div,  curl,  and  V2  in  curvilinear  coordinates  see  App.  A3.4. 

CHAPTER 


Vector  Integral  Calculus. 
Integral  Theorems 


Vector  integral  calculus  can  be  seen  as  a generalization  of  regular  integral  calculus.  You 
may  wish  to  review  integration.  (To  refresh  your  memory,  there  is  an  optional  review 
section  on  double  integrals;  see  Sec.  10.3.) 

Indeed,  vector  integral  calculus  extends  integrals  as  known  from  regular  calculus  to 
integrals  over  curves,  called  line  integrals  (Secs.  10. 1, 10.2),  surfaces,  called  surface  integrals 
(Sec.  10.6),  and  solids,  called  triple  integrals  (Sec.  10.7).  The  beauty  of  vector  integral 
calculus  is  that  we  can  transform  these  different  integrals  into  one  another.  You  do  this 
to  simplify  evaluations,  that  is,  one  type  of  integral  might  be  easier  to  solve  than  another, 
such  as  in  potential  theory  (Sec.  10.8).  More  specifically,  Green’s  theorem  in  the  plane 
allows  you  to  transform  line  integrals  into  double  integrals,  or  conversely,  double  integrals 
into  line  integrals,  as  shown  in  Sec.  10.4.  Gauss’s  convergence  theorem  (Sec.  10.7)  converts 
surface  integrals  into  triple  integrals,  and  vice-versa,  and  Stokes’s  theorem  deals  with 
converting  line  integrals  into  surface  integrals,  and  vice-versa. 

This  chapter  is  a companion  to  Chapter  9 on  vector  differential  calculus.  From  Chapter  9, 
you  will  need  to  know  inner  product,  curl,  and  divergence  and  how  to  parameterize  curves. 
The  root  of  the  transformation  of  the  integrals  was  largely  physical  intuition.  Since  the 
corresponding  formulas  involve  the  divergence  and  the  curl,  the  study  of  this  material  will 
lead  to  a deeper  physical  understanding  of  these  two  operations. 

Vector  integral  calculus  is  very  important  to  the  engineer  and  physicist  and  has  many 
applications  in  solid  mechanics,  in  fluid  flow,  in  heat  problems,  and  others. 

Prerequisite:  Elementary  integral  calculus,  Secs.  9. 7-9. 9 

Sections  that  may  be  omitted  in  a shorter  course:  10.3,  10.5,  10.8 

References  and  Answers  to  Problems:  App.  1 Part  B,  App.  2 


10.1  Line  Integrals 

The  concept  of  a line  integral  is  a simple  and  natural  generalization  of  a definite  integral 


0) 


b 

fix)  dx. 


Recall  that,  in  (1),  we  integrate  the  function /(x),  also  known  as  the  integrand,  from  x = a 
along  the  x-axis  to  x = b.  Now,  in  a line  integral,  we  shall  integrate  a given  function,  also 
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called  the  integrand,  along  a curve  C in  space  or  in  the  plane.  (Hence  curve  integral 
would  be  a better  name  but  line  integral  is  standard). 

This  requires  that  we  represent  the  curve  C by  a parametric  representation  (as  in  Sec.  9.5) 

(2)  r(t)  = [x(t),  y(t),  z(t)]  = x(t)  i + y(r)j  + z(/)k  (a  g t g b). 

The  curve  C is  called  the  path  of  integration.  Look  at  Fig.  219a.  The  path  of  integration 
goes  from  A to  B.  Thus  A:  r (a)  is  its  initial  point  and  B:  r (b)  is  its  terminal  point.  C is 
now  oriented.  The  direction  from  A to  B,  in  which  t increases  is  called  the  positive 
direction  on  C.  We  mark  it  by  an  arrow.  The  points  A and  B may  coincide,  as  it  happens 
in  Fig.  219b.  Then  C is  called  a closed  path. 

C is  called  a smooth  curve  if  it  has  at  each  point  a unique  tangent  whose  direction  varies 
continuously  as  we  move  along  C.  We  note  that  r (t)  in  (2)  is  differentiable.  Its  derivative 
r (r)  = dr/dt  is  continuous  and  different  from  the  zero  vector  at  every  point  of  C. 

General  Assumption 

In  this  book,  every  path  of  integration  of  a line  integral  is  assumed  to  be  piecewise  smooth, 
that  is,  it  consists  of finitely  many  smooth  curves. 

For  example,  the  boundary  curve  of  a square  is  piecewise  smooth.  It  consists  of  four 
smooth  curves  or,  in  this  case,  line  segments  which  are  the  four  sides  of  the  square. 

Definition  and  Evaluation  of  Line  Integrals 

A line  integral  of  a vector  function  F(r)  over  a curve  C:  r (t)  is  defined  by 


(3) 


F(r)  • dr 

c 


rb 

F(r(0)  • r'(f)  dt 


t 


r 


dr 

dt 


where  r(t)  is  the  parametric  representation  of  C as  given  in  (2).  (The  dot  product  was  defined 
in  Sec.  9.2.)  Writing  (3)  in  terms  of  components,  with  dr  = [dx,  dy,  dz\  as  in  Sec.  9.5 
and  ' = d/dt,  we  get 


F(r)  • dr 

c 


f/'  i dx  + F2dy  + F3  dz) 
c 

rb 


(3') 


( Fi_x ' + F2y'  + F3z')  dt. 
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EXAMPLE  1 


1 * 


Fig.  220.  Example  1 
EXAMPLE  2 


Fig.  221.  Example  2 


If  the  path  of  integration  C in  (3)  is  a closed  curve,  then  instead  of 

we  also  write  o . 

■’c  c 

Note  that  the  integrand  in  (3)  is  a scalar,  not  a vector,  because  we  take  the  dot  product.  Indeed, 
F • r,/|r,|  is  the  tangential  component  of  F.  (For  “component”  see  (11)  in  Sec.  9.2.) 

We  see  that  the  integral  in  (3)  on  the  right  is  a definite  integral  of  a function  of  t taken 
over  the  interval  a =§  t g b on  the  r-axis  in  the  positive  direction:  The  direction  of 
increasing  t.  This  definite  integral  exists  for  continuous  F and  piecewise  smooth  C,  because 
this  makes  F • piecewise  continuous. 

Line  integrals  (3)  arise  naturally  in  mechanics,  where  they  give  the  work  done  by  a 
force  F in  a displacement  along  C.  This  will  be  explained  in  detail  below.  We  may  thus 
call  the  line  integral  (3)  the  work  integral.  Other  forms  of  the  line  integral  will  be  discussed 
later  in  this  section. 

Evaluation  of  a Line  Integral  in  the  Plane 

Find  the  value  of  the  line  integral  (3)  when  F(r)  = [— y,  —xy]  = — yi  — xyj  and  C is  the  circular  arc  in  Fig.  220 
from  A to  B. 

Solution.  We  may  represent  C by  r (t)  = [cos  t,  sin  t]  = cosri  + sinrj,  where  0 77/2.  Then 

x(t ) = cos  t,  y(t ) = sin  t,  and 

F(r(0)  = — y(?)i  — ;t(/)y(0j  = [— sin  t,  —cos  t sin  t ] = —sin  t i — cos  t sin  t j. 

By  differentiation,  r \t)  = [—sin  t,  cos  t]  = —sin  t i + cos  t j,  so  that  by  (3)  [use  (10)  in  App.  3.1;  set  cos  t = u 
in  the  second  term] 


77-/2 


F(r)  • dr  = I [ — sin  t,  —cos  t sin  t ] • [—sin  t,  cos  t]  dt  = I (sin2  t — cos2  t sin  t)  dt 


tt/2 


77-/2 


(1  - cos  2t ) dt  - I u\-du)  = — - 0 » 0.4521. 


Line  Integral  in  Space 

The  evaluation  of  line  integrals  in  space  is  practically  the  same  as  it  is  in  the  plane.  To  see  this,  find  the  value 
of  (3)  when  F(r)  = [z,  x,  y]  = zi  + xj  + yk  and  C is  the  helix  (Fig.  221) 

(4)  r(f)  = [cos  t,  sin  t,  3t ] = cos  / i + sin  t j + 3?k  (0  ^ t ^ 277). 

Solution.  From  (4)  we  have  x(t)  = cos  t,  y(t)  — sin  t,  z(t ) = 3t.  Thus 

F(r(r))  • r'(t)  = (3t  i + cos  t j + sin  t k)  • (— sin  t\  + cos  t j + 3k). 

The  dot  product  is  3t(— sin  t)  + cos2  t + 3 sin  t.  Hence  (3)  gives 

F(r)  • dr  = (—3 1 sin  t + cos2 1 + 3 sin  t)  dt  = 677  + 77  + 0 = lir  ~ 21.99. 


Simple  general  properties  of  the  line  integral  (3)  follow  directly  from  corresponding 
properties  of  the  definite  integral  in  calculus,  namely, 


(5a) 


&F  • dr 

c 


F«  dr 


Jc 


(k  constant) 
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Fig.  222. 

Formula  (5c) 


THEOREM  1 


PROOF 


(5b) 


(F  + G)  • dr 

c 


F • dr  + 


G • r/r 


Jc 


Jc 


(5c) 


F • dr 

Jc 


F • dr  + 


F'dr 


Jc i 


c2 


(Fig.  222) 


where  in  (5c)  the  path  C is  subdivided  into  two  arcs  C\  and  C2  that  have  the  same 
orientation  as  C (Fig.  222).  In  (5b)  the  orientation  of  C is  the  same  in  all  three  integrals. 
If  the  sense  of  integration  along  C is  reversed,  the  value  of  the  integral  is  multiplied  by  — 1 . 
However,  we  note  the  following  independence  if  the  sense  is  preserved. 


Direction-Preserving  Parametric  Transformations 

Any  representations  of  C that  give  the  same  positive  direction  on  C also  yield  the 
same  value  of  the  line  integral  (3). 


The  proof  follows  by  the  chain  rule.  Let  r(f)  be  the  given  representation  with  a ts  t ^ b 
as  in  (3).  Consider  the  transformation  t = (f>(t*)  which  transforms  the  t interval  to 
a*  Si  t*  Si  b*  and  has  a positive  derivative  dt/dt*.  We  write  r (?)  = r (4>(t*))  = r*(f*). 
Then  dt  = (dt/dt*)  dt*  and 


F(r*)  • dr* 
c 


F(r(0) ' -f  dt  = 

at 


F(r)  • dr. 


Motivation  of  the  Line  Integral  (3): 

Work  Done  by  a Force 

The  work  W done  by  a constant  force  F in  the  displacement  along  a straight  segment  d 
is  W = F • d;  see  Example  2 in  Sec.  9.2.  This  suggests  that  we  define  the  work  W done 
by  a variable  force  F in  the  displacement  along  a curve  C:  r(t)  as  the  limit  of  sums  of 
works  done  in  displacements  along  small  chords  of  C.  We  show  that  this  definition  amounts 
to  defining  W by  the  line  integral  (3). 

For  this  we  choose  points  to  (=a)  < t\  < • • • < tn  ( =b ).  Then  the  work  A Vl^  done 
by  F(r(fm))  in  the  straight  displacement  from  r(fm)  to  r(fm+i)  is 

A Wm  F(r(tm))  • [r(tm+i)  r(tm)]  F(r(tm))  • r (tm)Atm  (A tm  A trn+\  tm). 

The  sum  of  these  n works  is  Wn  = AWo  + •••  + AVV^-i.  If  we  choose  points  and 
consider  Wn  for  every  n arbitrarily  but  so  that  the  greatest  A tm  approaches  zero  as 
n — > oo;  then  the  limit  of  Wn  as  n — » °°  is  the  line  integral  (3).  This  integral  exists 
because  of  our  general  assumption  that  F is  continuous  and  C is  piecewise  smooth; 
this  makes  r ( t ) continuous,  except  at  finitely  many  points  where  C may  have  corners 
or  cusps. 
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EXAMPLE  3 


EXAMPLE  4 


EXAMPLE  5 


Work  Done  by  a Variable  Force 

If  F in  Example  1 is  a force,  the  work  done  by  F in  the  displacement  along  the  quarter-circle  is  0.4521,  measured 
in  suitable  units,  say,  newton-meters  (nt  • m,  also  called  joules,  abbreviation  J;  see  also  inside  front  cover). 
Similarly  in  Example  2. 


Work  Done  Equals  the  Gain  in  Kinetic  Energy 

Let  F be  a force,  so  that  (3)  is  work.  Let  t be  time,  so  that  dr/dt  = v,  velocity.  Then  we  can  write  (3)  as 
(6) 


W=  F • dr  = F(r(r))  • v(f)  dt. 

■'C  a 

Now  by  Newton’s  second  law,  that  is,  force  = mass  X acceleration,  we  get 

F = = m\'(t), 

where  m is  the  mass  of  the  body  displaced.  Substitution  into  (5)  gives  [see  (11),  Sec.  9.4] 


W = my'  • v dt  = m 


v • v 
2 


m | ,n 
dt  = ~ |v|2 
2 


On  the  right,  m |v|2/2  is  the  kinetic  energy.  Hence  the  work  done  equals  the  gain  in  kinetic  energy.  This  is  a 
basic  law  in  mechanics. 


Other  Forms  of  Line  Integrals 

The  line  integrals 


(7) 


Fi  dx, 
Jc 


F2dv, 

c 


F3dz 

c 


are  special  cases  of  (3)  when  F = F\  i or  F2,j  or  F3k,  respectively. 

Furthermore,  without  taking  a dot  product  as  in  (3)  we  can  obtain  a line  integral  whose 
value  is  a vector  rather  than  a scalar,  namely, 


(8) 


F(r)  dt 
c 


rb 

F(r(f))  dt 


b 

[FiOm,  F2( r(0),  F:i(r(tm  dt. 


Obviously,  a special  case  of  (7)  is  obtained  by  taking  F1  = /,  F2 


F3  = 0.  Then 


(8*) 


m dt 

c 


rb 

f(r(t))  dt 


with  C as  in  (2).  The  evaluation  is  similar  to  that  before. 


A Line  Integral  of  the  Form  (8) 

Integrate  F(r)  = [xy,  yz,  z ] along  the  helix  in  Example  2. 

Solution.  F(r(f))  = [cos  t sin  t,  3 1 sin  t,  3 1]  integrated  with  respect  to  t from  0 to  2tt  gives 


F(r(0  dt  = 


1 o 3 o 

- — cos  t,  3 sin  t — 3 1 cos  t,  — t 

2 2 


= [0,  -677,  677  ]. 
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Path  Dependence 

Path  dependence  of  line  integrals  is  practically  and  theoretically  so  important  that  we 
formulate  it  as  a theorem.  And  a whole  section  (Sec.  10.2)  will  be  devoted  to  conditions 
under  which  path  dependence  does  not  occur. 


THEOREM  2 


Path  Dependence 

The  line  integral  (3)  generally  depends  not  only  on  F and  on  the  endpoints  A and 
B of  the  path,  but  also  on  the  path  itself  along  which  the  integral  is  taken. 


PROOF  Almost  any  example  will  show  this.  Take,  for  instance,  the  straight  segment  Cp  ir(f)  = 
[f,  t,  0]  and  the  parabola  C2:  r2(f)  = [t,  t2,  0]  with  0 tS  f Si  I (Fig.  223)  and  integrate 
F = [0,  xy,  0].  Then  F(r1(f))  • r{(t)  = t2,  F(r2(f))  * r2 (f)  = 2f4,  so  that  integration  gives 
1/3  and  2/5,  respectively.  ■ 


Fig.  223.  Proof  of  Theorem  2 


PRQBLE^SET-1^.T 


1.  WRITING  PROJECT.  From  Definite  Integrals  to 
Line  Integrals.  Write  a short  report  (1-2  pages)  with 
examples  on  line  integrals  as  generalizations  of  definite 
integrals.  The  latter  give  the  area  under  a curve.  Explain 
the  corresponding  geometric  interpretation  of  a line 
integral. 


LINE  INTEGRAL.  WORK 

Calculate  F(r)  • dr  for  the  given  data.  If  F is  a force,  this 
■'c 

gives  the  work  done  by  the  force  in  the  displacement  along 
C.  Show  the  details. 

2.  F = [y2,  -x2],  C:  y = 4x2  from  (0,  0)  to  (1,  4) 

3.  F as  in  Prob.  2,  C from  (0, 0)  straight  to  ( 1 , 4).  Compare. 

4.  F = [xy,  x2y2],  C from  (2,  0)  straight  to  (0,  2) 

5.  F as  in  Prob.  4,  C the  quarter-circle  from  (2,  0)  to 
(0,  2)  with  center  (0,  0) 

6.  F = [x  — y,  y — z,  z — x],  C:  r = [2  cos  t,  t,  2 sin  f] 
from  (2,  0,  0)  to  (2,  277,  0) 

7.  F = [x2,  y2,  z2],  C:  r = [cos  t,  sin  t,  e‘]  from  (1, 0, 1) 
to  (1,  0,  e27T).  Sketch  C. 


8.  F = [ex,  coshy,  sinh  z],  C:  r = [f,  t 2,  t 3]  from  (0,  0,  0) 
to  (2,  £ g).  Sketch  C. 

9.  F = [x  + y , y + z,z  +x],  C:  r = [2 1, 5 1,  t]  from  t = 0 
to  1 . Also  from  t = — 1 to  1 . 

10.  F = [x,  — z,  2y]  from  (0,  0,  0)  straight  to  (1,  1,  0),  then 
to  (1,  1,  1),  back  to  (0,  0,  0) 

11.  F = [e~x,  e~v,  e~z],  C:  r = [f,  t2,  t ] from  (0,  0,  0)  to 
(2,  4,  2).  Sketch  C. 

12.  PROJECT.  Change  of  Parameter.  Path  Dependence. 

Consider  the  integral  F(r)  ’dr,  where  F = [xy,  — y2]. 

'c 

(a)  One  path,  several  representations.  Find  the  value 
of  the  integral  when  r = [cos  t,  sin  f],  OS  (S  7t/2. 
Show  that  the  value  remains  the  same  if  you  set  t = — p 
or  t — p2  or  apply  two  other  parametric  transformations 
of  your  own  choice. 

(b)  Several  paths.  Evaluate  the  integral  when  C:  y = 
xn,  thus  r = [t,  tn],  OStSl,  where  n = 1,  2,  3,  • ■ • . 
Note  that  these  infinitely  many  paths  have  the  same 
endpoints. 
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(c)  Limit.  What  is  the  limit  in  (b)  as  n —> > °o?  Can  you 
confirm  your  result  by  direct  integration  without  referring 
to  (b)? 

(d)  Show  path  dependence  with  a simple  example  of 
your  choice  involving  two  paths. 

13.  ML-Inequality,  Estimation  of  Line  Integrals.  Let  F 
be  a vector  function  defined  on  a curve  C.  Let  | F | be 
bounded,  say,  |F|  Si  M on  C,  where  Mis  some  positive 
number.  Show  that 


(9) 


F • dr 


S ML 


(L  = Length  of  C). 


14.  Using  (9),  find  a bound  for  the  absolute  value  of  the 
work  W done  by  the  force  F = [x2,  y]  in  the  dis- 
placement from  (0,  0)  straight  to  (3,  4).  Integrate  exactly 
and  compare. 


15-20 


INTEGRALS  (8)  AND  (8*) 


Evaluate  them  with  F or /and  C as  follows. 

15.  F = [y2,  £2,  x2],  C:  r = [3  cos  t,  3 sin  t,  2 f], 
0 S t £ 477- 


16.  f = 3x  + y + 5z,  C:  r = [f,  cosh  t,  sinh  f], 

0 fi  t fi  1.  Sketch  C. 

17.  F = [x  + y,  y + z,  Z + x],  C:  r = [4  cos  t,  sin  t , 0], 

0 S t £ 7T 

18.  F = [y1/3,  jc1/3,  0],  C the  hypocycloid  r = [cos3  t, 
sin3  f,  0],  0S(S  7t/4 

19.  / = xyz,  C:  r = [At,  3 t2,  12t],  -2  £ t fi  2. 
Sketch  C. 

20.  F = [xz,  yz,  x2y2],  C:  r = [t,  t,  e\  0S(S5. 
Sketch  C. 


10..  Path  Independence  of  Line  Integrals 

We  want  to  find  out  under  what  conditions,  in  some  domain,  a line  integral  takes  on  the 
same  value  no  matter  what  path  of  integration  is  taken  (in  that  domain).  As  before  we 
consider  line  integrals 


Fig.  224.  Path 
independence 


(1) 


F(r)  • dr 

c 


(F\  dx  + F2dy  + F3  dz ) 
c 


(dr  = [r/.r,  dy,  cfe]) 


The  line  integral  (1)  is  said  to  be  path  independent  in  a domain  D in  space  if  for  every 
pair  of  endpoints  A,  B in  domain  D,  (1)  has  the  same  value  for  all  paths  in  D that  begin  at 
A and  end  at  B.  This  is  illustrated  in  Fig.  224.  (See  Sec.  9.6  for  “domain.”) 

Path  independence  is  important.  For  instance,  in  mechanics  it  may  mean  that  we  have 
to  do  the  same  amount  of  work  regardless  of  the  path  to  the  mountaintop,  be  it  short  and 
steep  or  long  and  gentle.  Or  it  may  mean  that  in  releasing  an  elastic  spring  we  get  back 
the  work  done  in  expanding  it.  Not  all  forces  are  of  this  type — think  of  swimming  in  a 
big  round  pool  in  which  the  water  is  rotating  as  in  a whirlpool. 

We  shall  follow  up  with  three  ideas  about  path  independence.  We  shall  see  that  path 
independence  of  (1)  in  a domain  D holds  if  and  only  if: 


( Theorem  1) 
( Theorem  2) 
( Theorem  3) 


F = grad/,  where  grad /is  the  gradient  of/ as  explained  in  Sec.  9.7. 

Integration  around  closed  curves  C in  D always  gives  0. 

curl  F = 0,  provided  D is  simply  connected,  as  defined  below. 


Do  you  see  that  these  theorems  can  help  in  understanding  the  examples  and  counterexample 
just  mentioned? 

Let  us  begin  our  discussion  with  the  following  very  practical  criterion  for  path 
independence. 
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THEOREM  1 


PROOF 


Path  Independence 

A line  integral  (1)  with  continuous  h\,  F2,  A3  in  a domain  D in  space  is  path 
independent  in  D if  and  only  if  F = [//,  If,  F3\  is  the  gradient  of  some  function 
f in  D, 

df  df  df 

(2)  F = grad/,  thus,  = — , F2  = F3  = 

dx  dy  dz 


(a)  We  assume  that  (2)  holds  for  some  function /in  D and  show  that  this  implies  path 
independence.  Let  C be  any  path  in  D from  any  point  A to  any  point  B in  D,  given  by 
r(r)  = [a/),  y(t),  z(?)L  where  aS/S  /?.  Then  from  (2),  the  chain  rule  in  Sec.  9.6,  and 
(3  ) in  the  last  section  we  obtain 


(Fi  dx  + F2dy  + F3  dz)  = 


df  df  df 

— dx  H dy  H dz 

dx  dy  dz 


df  dx  df  dy  df  dz\ 

— — + — — H — 1 dt 

dx  dt  dy  dt  dz  dt  J 


df 

dt 


dt  = f[x(t),  y(t),  z(t )] 


t=b 


t=a 


= f(x(b),  y(b),  z(b))  - f{x{a),  y(a),  z(a)) 


=m  -m. 


(b)  The  more  complicated  proof  of  the  converse,  that  path  independence  implies  (2) 
for  some  / is  given  in  App.  4. 


The  last  formula  in  part  (a)  of  the  proof. 


(3) 


B 

(Fi  dx  + F2dy  + F3  dz)  = f(B)  - f(A)  [F  = grad/] 

A 


is  the  analog  of  the  usual  formula  for  definite  integrals  in  calculus, 


b b 

g(x)  dx  = G(x) 


G{b)  - G(a) 


lG\x)  = *(*)]. 


Formula  (3)  should  be  applied  whenever  a line  integral  is  independent  of  path. 

Potential  theory  relates  to  our  present  discussion  if  we  remember  from  Sec.  9.7  that  when 
F = grad/,  then/is  called  a potential  of  F.  Thus  the  integral  (1)  is  independent  of  path 
in  D if  and  only  if  F is  the  gradient  of  a potential  in  D. 
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EXAMPLE  1 Path  Independence 

Show  that  the  integral  F • dr  = (2x  dx  + 2y  dy  + 4z  dz ) is  path  independent  in  any  domain  in  space  and 

•'c  ■'c 

find  its  value  in  the  integration  from  A:  (0,  0,  0)  to  B:  (2,  2,  2). 

Solution.  F = [2x,  2 y,  4 z\  = grad/,  where/  = x2  + y2  + 2z2  because  df/dx  = 2x  = Fi,  df/dy  = 2 y = F2, 
df/dz  = 4 z = F3.  Hence  the  integral  is  independent  of  path  according  to  Theorem  1,  and  (3)  gives 
f(B)  — f(A)  =/( 2,2,2)  -/(0,0,0)  - 4 + 4 + 8 = 16. 

If  you  want  to  check  this,  use  the  most  convenient  path  C\  r(t)  = [t,  t,  r],  0 = / = 2,  on  which 
F(r(r)  = [2 1,  2 1,  4t],  sothatF(r(/))  • r'(t)  = 2t  + 2t  + 4t  = 8 1,  and  integration  from  0 to  2 gives  8 • 22/2  =16. 
If  you  did  not  see  the  potential  by  inspection,  use  the  method  in  the  next  example. 

EXAMPLE  2 Path  Independence.  Determination  of  a Potential 

Evaluate  the  integral  I = (3x2  dx  + 2 yz  dy  + y2  dz)  from  A:  (0,  1,  2)  to  B:  (1,  — 1,  7)  by  showing  that  F has  a 

'c 

potential  and  applying  (3). 

Solution.  If  F has  a potential  / we  should  have 

fx  = F\  = 3x2,  fy  = F2  = 2 yz,  fz  = F3  = y2. 

We  show  that  we  can  satisfy  these  conditions.  By  integration  of  fx  and  differentiation, 

/ = *3  + g(y , z),  fy  = gy  = 2yz,  g = y2z  + h(z),  f=x2  + y2z  + h(z) 

fz  = y2  + h'  = y2,  h’  — 0 h = 0,  say. 

This  gives  f{x,y,z)  = x3  + y3z  and  by  (3), 

/=/(!,  -1,7)  -/( 0,  1,2)  = 1 + 7 - (0  + 2)  = 6. 


Path  Independence  and  Integration 
Around  Closed  Curves 

The  simple  idea  is  that  two  paths  with  common  endpoints  (Fig.  225)  make  up  a single 
closed  curve.  This  gives  almost  immediately 


THEOREM  2 


Path  Independence 

The  integral  (1)  is  path  independent  in  a domain  D if  and  only  if  its  value  around 
every  closed  path  in  D is  zero. 


PROOF 


Fig.  225.  Proof  of 
Theorem  2 


If  we  have  path  independence,  then  integration  from  A to  B along  C \ and  along  C2  in 
Fig.  225  gives  the  same  value.  Now  Cy  and  C2  together  make  up  a closed  curve  C,  and 
if  we  integrate  from  A along  C \ to  B as  before,  but  then  in  the  opposite  sense  along  C2 
back  to  A (so  that  this  second  integral  is  multiplied  by  —1),  the  sum  of  the  two  integrals 
is  zero,  but  this  is  the  integral  around  the  closed  curve  C. 

Conversely,  assume  that  the  integral  around  any  closed  path  C in  D is  zero.  Given  any 
points  A and  B and  any  two  curves  C\  and  C2  from  A to  B in  I),  we  see  that  C\  with  the 
orientation  reversed  and  C2  together  form  a closed  path  C.  By  assumption,  the  integral 
over  C is  zero.  Hence  the  integrals  over  C 1 and  C2,  both  taken  from  A to  B,  must  be  equal. 
This  proves  the  theorem.  ■ 
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Work.  Conservative  and  Nonconservative  (Dissipative)  Physical  Systems 

Recall  from  the  last  section  that  in  mechanics,  the  integral  (1)  gives  the  work  done  by  a 
force  F in  the  displacement  of  a body  along  the  curve  C.  Then  Theorem  2 states  that  work 
is  path  independent  in  D if  and  only  if  its  value  is  zero  for  displacement  around  every 
closed  path  in  D.  Furthermore,  Theorem  1 tells  us  that  this  happens  if  and  only  if  F is  the 
gradient  of  a potential  in  D.  In  this  case,  F and  the  vector  field  defined  by  F are  called 
conservative  in  D because  in  this  case  mechanical  energy  is  conserved;  that  is,  no  work 
is  done  in  the  displacement  from  a point  A and  back  to  A.  Similarly  for  the  displacement 
of  an  electrical  charge  (an  electron,  for  instance)  in  a conservative  electrostatic  field. 

Physically,  the  kinetic  energy  of  a body  can  be  interpreted  as  the  ability  of  the  body  to 
do  work  by  virtue  of  its  motion,  and  if  the  body  moves  in  a conservative  field  of  force, 
after  the  completion  of  a round  trip  the  body  will  return  to  its  initial  position  with  the 
same  kinetic  energy  it  had  originally.  For  instance,  the  gravitational  force  is  conservative; 
if  we  throw  a ball  vertically  up,  it  will  (if  we  assume  air  resistance  to  be  negligible)  return 
to  our  hand  with  the  same  kinetic  energy  it  had  when  it  left  our  hand. 

Friction,  air  resistance,  and  water  resistance  always  act  against  the  direction  of  motion. 
They  tend  to  diminish  the  total  mechanical  energy  of  a system,  usually  converting  it  into 
heat  or  mechanical  energy  of  the  surrounding  medium  (possibly  both).  Furthermore, 
if  during  the  motion  of  a body,  these  forces  become  so  large  that  they  can  no  longer 
be  neglected,  then  the  resultant  force  F of  the  forces  acting  on  the  body  is  no  longer 
conservative.  This  leads  to  the  following  terms.  A physical  system  is  called  conservative 
if  all  the  forces  acting  in  it  are  conservative.  If  this  does  not  hold,  then  the  physical  system 
is  called  nonconservative  or  dissipative. 

Path  Independence  and  Exactness 
of  Differential  Forms 

Theorem  1 relates  path  independence  of  the  line  integral  (1)  to  the  gradient  and  Theorem  2 
to  integration  around  closed  curves.  A third  idea  (leading  to  Theorems  3*  and  3,  below) 
relates  path  independence  to  the  exactness  of  the  differential  form  or  Pfaffian  form1 

(4)  F • dr  = F1  dx  + F2  dy  + F3  dz 

under  the  integral  sign  in  (1).  This  form  (4)  is  called  exact  in  a domain  D in  space  if  it 
is  the  differential 

df  df  df 

df  = — dx  + — dy  + — dz  = (grad/)  • dr 
dx  dy  dz 

of  a differentiable  function  fix,  y,  z)  everywhere  in  D,  that  is,  if  we  have 

F • dr  = df. 

Comparing  these  two  formulas,  we  see  that  the  form  (4)  is  exact  if  and  only  if  there  is  a 
differentiable  function /(x,  y,  z)  in  D such  that  everywhere  in  D. 

df  df  df 

(5)  F = grad  f thus,  F\  = — , F2  = — , F3  = 

dx  dy  dz 


1JOHANN  FRIEDRICH  PFAFF  (1765-1825).  German  mathematician. 
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THEOREM  3* 


THEOREM  3 


PROOF 


Hence  Theorem  1 implies 

Path  Independence 

The  integral  ( 1 ) is  path  independent  in  a domain  D in  space  if  and  only  if  the  differential 
form  (4)  has  continuous  coefficient  functions  F-y,  F2,  F3  and  is  exact  in  D. 

This  theorem  is  of  practical  importance  because  it  leads  to  a useful  exactness  criterion. 
First  we  need  the  following  concept,  which  is  of  general  interest. 

A domain  D is  called  simply  connected  if  every  closed  curve  in  D can  be  continuously 
shrunk  to  any  point  in  D without  leaving  D. 

For  example,  the  interior  of  a sphere  or  a cube,  the  interior  of  a sphere  with  finitely  many 
points  removed,  and  the  domain  between  two  concentric  spheres  are  simply  connected.  On 
the  other  hand,  the  interior  of  a torus,  which  is  a doughnut  as  shown  in  Fig.  249  in  Sec.  10.6 
is  not  simply  connected.  Neither  is  the  interior  of  a cube  with  one  space  diagonal  removed. 
The  criterion  for  exactness  (and  path  independence  by  Theorem  3*)  is  now  as  follows. 


Criterion  for  Exactness  and  Path  Independence 

Let  F\ , F2,  F3  in  the  line  integral  (1), 


F(r)  • dr 

Jc 


(F\  dx  + F2dy  + F3  dz), 
c 


be  continuous  and  have  continuous  first  partial  derivatives  in  a domain  D in  space.  Then: 
(a)  If  the  differential  form  (4)  is  exact  in  D — and  thus  (1)  is  path  independent 
by  Theorem  3* — , then  in  D, 

(6)  curl  F = 0; 


in  components  (see  Sec.  9.9) 


t d /A,  d l’2  dF  i dF3  dF2  ()f’\ 

^ ^ dy  dz  dz  dx  ’ dx  dy 

(b)  If  (6)  holds  in  D and  D is  simply  connected,  then  (4)  is  exact  in  D — and 
thus  (1)  is  path  independent  by  Theorem  3*. 


(a)  If  (4)  is  exact  in  D,  then  F = grad / in  D by  Theorem  3*,  and,  furthermore, 
curl  F = curl  (grad/)  = 0 by  (2)  in  Sec.  9.9,  so  that  (6)  holds. 

(b)  The  proof  needs  “Stokes’s  theorem”  and  will  be  given  in  Sec.  10.9. 


Line  Integral  in  the  Plane.  For 


F(r)  • dr  = 


Jc 


(/•’i  dx  + F2  dy)  the  curl  has  only  one 


Jc 


component  (the  z-component),  so  that  ((f  ) reduces  to  the  single  relation 


(6") 


dF2  dF\ 
dx  dy 


(which  also  occurs  in  (5)  of  Sec.  1.4  on  exact  ODEs). 
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EXAMPLE  3 


EXAMPLE  4 


Exactness  and  Independence  of  Path.  Determination  of  a Potential 

Using  (6  ),  show  that  the  differential  form  under  the  integral  sign  of 


I = [2 xyz2  dx  + (. x2z 2 + z cos  yz)  dy  + (2 x2yz  + y cos  yz)  dz\ 

'c 

is  exact,  so  that  we  have  independence  of  path  in  any  domain,  and  find  the  value  of  I from  A:  (0,  0,  1)  to 
B:  (1,  T7-/4,  2). 

Solution.  Exactness  follows  from  ( 6 ’ ),  which  gives 

(F3)y  = 2x2z  + cos  yz  - yz  sin  yz  = ( F2)z 

(Fi)z  = 4xyz  = (F3)x 
(F2)x  = ZXZ2  = (FJy. 

To  find/,  we  integrate  F2  (which  is  “long,”  so  that  we  save  work)  and  then  differentiate  to  compare  with  //  and  F3, 


/ = J F2  dy  = J (*V  + z cos  yz)  dy  = x2z2y  + sin  yz  + g(*,  z) 
fx  = 2 xz2y  + gx  = F1  = 2xyz2,  gx  = 0,  g = h(z) 
fz  = 2 x2zy  + y cos  yz  + h'  = F3  = 2 X2zy  + y cos  yz,  hi  = 0. 

h'  = 0 implies  h = const  and  we  can  take  h = 0,  so  that  g = 0 in  the  first  line.  This  gives,  by  (3), 

/'fx,  y,  z)  = x2yz2  + sin  vz,  /(B)  — f(A)  =1-  — - 4 + sin  — — 0 = 17  + 1. 

4 2 

The  assumption  in  Theorem  3 that  D is  simply  connected  is  essential  and  cannot  be  omitted. 
Perhaps  the  simplest  example  to  see  this  is  the  following. 

On  the  Assumption  of  Simple  Connectedness  in  Theorem  3 

Let 


(V) 


Fi  = ~~ 


2 , 2 
x + y 


F„  = 


F3  = 0. 


Differentiation  shows  that  ft/ ) is  satisfied  in  any  domain  of  the  xy-plane  not  containing  the  origin,  for  example, 
in  the  domain  D\\<.  Vx2  + y2  < | shown  in  Fig.  226.  Indeed,  //  and  F2  do  not  depend  on  z,  and  F3  = 0, 
so  that  the  first  two  relations  in  (6’)  are  trivially  true,  and  the  third  is  verified  by  differentiation: 

dF2  x2  + y2  - x - 2x  y2  - x2 
dx  ~ (x2  + y2)2  ~ (x2  + y2)2  ’ 

3F!  x2  + y2  - y ■ 2y  y2  - x2 
17  “ (x2  + y2)2  ” (x2  + y2)2 ' 


Clearly,  D in  Fig.  226  is  not  simply  connected.  If  the  integral 


f —y  dx  + x dy 

(F1  dx  + F2  dy)  = — 

c + y2 

were  independent  of  path  in  D,  then  I = 0 on  any  closed  curve  in  D,  for  example,  on  the  circle  x2  + y2  — 1. 
But  setting  x = r cos  6,  y = r sin  6 and  noting  that  the  circle  is  represented  by  /*  = 1,  we  have 


x = cos  6, 


dx  = —sin  6 dO, 


y = sin  6 , dy  = cos  6 dd. 


SEC.  10.2 
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so  that  — y dx  + x dy  = sin2  6 dd  + cos2  6 dd  = dO  and  counterclockwise  integration  gives 


/ = 


= 277. 


Since  D is  not  simply  connected,  we  cannot  apply  Theorem  3 and  cannot  conclude  that  I is  independent  of  path 
in  D. 

Although  F = grad f where  / = arctan  (y/x)  (verify!),  we  cannot  apply  Theorem  1 either  because  the  polar 
angle/  = 6 = arctan  (y/x)  is  not  single- valued,  as  it  is  required  for  a function  in  calculus. 


FRO  B 1 tTF 2 


1.  WRITING  PROJECT.  Report  on  Path  Independence. 

Make  a list  of  the  main  ideas  and  facts  on  path 
independence  and  dependence  in  this  section.  Then 
work  this  list  into  a report.  Explain  the  definitions  and 
the  practical  usefulness  of  the  theorems,  with  illustrative 
examples  of  your  own.  No  proofs. 

2.  On  Example  4.  Does  the  situation  in  Example  4 of  the 
text  change  if  you  take  the  domain  0 < Vr+  / < 
3/2? 


3-9 


PATH  INDEPENDENT  INTEGRALS 


Show  that  the  form  under  the  integral  sign  is  exact  in  the 
plane  (Probs.  3-4)  or  in  space  (Probs.  5-9)  and  evaluate  the 
integral.  Show  the  details  of  your  work. 


r (.IT,  0) 

3.  (|  cos  \x  cos  2v  dx  — 2 sin  \x  sin  2y  dy) 

Xtt/2,  v) 


4. 


5. 


r(6,  1) 

e4y(2x  dx  + 4x2  dy) 

x4,  0) 

r (2, 1/2,  tt/2) 

exy(y  sin  z dx  + x sin  z dy  + cos  z dz) 

(0,  0,  7 r) 


r (1,1,0) 

6.  e^  + y2  + ^(xdx  + ydy  + zdz) 

V 0,  0) 

f (1,1,1) 

7.  ( yz  sinh  xz  dx  + cosh  xzdy  + xy  sinh  xz  dz) 


r(3,u-,3) 

8.  (cos  yz  dx  — xz  sin  yz  dy  — xy  sin  yz  dz) 

(5,  3,  7t) 

<■(1,0,1) 

9.  (ex  cosh  y dx  + (ex  sinh  y + e''  cosh  y)  dy 

V 1,  0) 

+ ez  sinh  y dz) 

10.  PROJECT.  Path  Dependence,  (a)  Show  that 

I = (jc2y  dx  + 2 xv2  dy)  is  path  dependent  in  the 
■'c 

xy-plane. 

(b)  Integrate  from  (0,  0)  along  the  straight-line 
segment  to  (1,  b),  0 S b = 1,  and  then  vertically  up  to 
(1,  1);  see  the  figure.  For  which  b is  /maximum?  What 
is  its  maximum  value? 

(c)  Integrate  1 from  (0, 0)  along  the  straight-line  segment 
to  (c,  1),  0 S c S 1,  and  then  horizontally  to  (1,  1).  For 
c = 1,  do  you  get  the  same  value  as  for  b = 1 in  (b)? 
For  which  c is  / maximum?  What  is  its  maximum  value? 


(0,  2,  3) 


426 


CHAP.  10  Vector  Integral  Calculus.  Integral  Theorems 


11.  On  Example  4.  Show  that  in  Example  4 of  the  text, 
F = grad  (arctan  iy/x)).  Give  examples  of  domains  in 
which  the  integral  is  path  independent. 

12.  CAS  EXPERIMENT.  Extension  of  Project  10.  Inte- 
grate x2y  dx  + 2xy2  dy  over  various  circles  through  the 
points  (0,  0)  and  (1,  1).  Find  experimentally  the  smallest 
value  of  the  integral  and  the  approximate  location  of 
the  center  of  the  circle. 

PATH  INDEPENDENCE? 

Check,  and  if  independent,  integrate  from  (0,  0,  0)  to  ( a , b,  c). 

13.  2ex  (x  cos  2y  dx  — sin  2y  dy) 


14.  (sinh  xy)  (z  dx  — x dz) 

15.  x2y  dx  — 4 xy2  dy  + 8z2x  dz 

16.  ey  dx  + (xev  — ez)  dy  — yez  dz 

17.  4y  dx  + z.dy  + (y  — 2z)  dz 

18.  (cos  xy){yz  dx  + xz  dy)  — 2 sin  xy  dz 

19.  (cos  (x2  + 2y2  + z2))  (2x  dx  + 4y  dy  + 2z  dz) 

20.  Path  Dependence.  Construct  three  simple  examples 
in  each  of  which  two  equations  (6')  are  satisfied,  but 
the  third  is  not. 


10.3  Calculus  Review:  Double  Integrals. 

Optional 

This  section  is  optional.  Students  familiar  with  double  integrals  from  calculus  should 
skip  this  review  and  go  on  to  Sec.  10.4.  This  section  is  included  in  the  book  to  make  it 
reasonably  self-contained. 

In  a definite  integral  (1),  Sec.  10.1,  we  integrate  a function  /( x)  over  an  interval 
(a  segment)  of  the  x-axis.  In  a double  integral  we  integrate  a function  f(x,  y ),  called  the 
integrand,  over  a closed  bounded  region2  R in  the  xy-plane,  whose  boundary  curve  has  a 
unique  tangent  at  almost  every  point,  but  may  perhaps  have  finitely  many  cusps  (such  as 
the  vertices  of  a triangle  or  rectangle). 

The  definition  of  the  double  integral  is  quite  similar  to  that  of  the  definite  integral.  We 
subdivide  the  region  R by  drawing  parallels  to  the  x-  and  y-axes  (Fig.  227).  We  number  the 
rectangles  that  are  entirely  within  R from  1 to  n.  In  each  such  rectangle  we  choose  a point, 
say,  (xk,  yk)  in  the  kth  rectangle,  whose  area  we  denote  by  A Ak.  Then  we  form  the  sum 

n 

~ f(.xki  yk)  ^A-k- 
k = 1 


y 


Fig.  227.  Subdivision  of  a region  R 


2 A region  R is  a domain  (Sec.  9.6)  plus,  perhaps,  some  or  all  of  its  boundary  points.  R is  closed  if  its  boundary 
(all  its  boundary  points)  are  regarded  as  belonging  to  R\  and  R is  bounded  if  it  can  be  enclosed  in  a circle  of 
sufficiently  large  radius.  A boundary  point  P of  R is  a point  (of  R or  not)  such  that  every  disk  with  center  P 
contains  points  of  R and  also  points  not  of  R. 


SEC.  10.3  Calculus  Review:  Double  Integrals.  Optional 


427 


This  we  do  for  larger  and  larger  positive  integers  n in  a completely  independent  manner, 
but  so  that  the  length  of  the  maximum  diagonal  of  the  rectangles  approaches  zero  as  n 
approaches  infinity.  In  this  fashion  we  obtain  a sequence  of  real  numbers  Jni,  Jri2, 
Assuming  that  /'Oc,  y)  is  continuous  in  R and  R is  bounded  by  finitely  many  smooth  curves 
(see  Sec.  10.1),  one  can  show  (see  Ref.  [GenRef4]  in  App.  1)  that  this  sequence  converges 
and  its  limit  is  independent  of  the  choice  of  subdivisions  and  corresponding  points 
(xp-,  v'fc).  This  limit  is  called  the  double  integral  of  f{x,y)  over  the  region  R,  and  is 
denoted  by 


fix,  y ) dx  dy  or  f(x,  y)  dA. 

JRJ  R 

Double  integrals  have  properties  quite  similar  to  those  of  definite  integrals.  Indeed,  for 
any  functions  / and  g of  (x,  y),  defined  and  continuous  in  a region  R, 


kf  dx  dy  = k 


f dx  dy 


( k constant) 


(1) 


if  + g)  dx  dy  = 
f dx  dy  = 


fdxdy  + 
f dx  dy  + 


JRi 


g dx  dy 
fdx  dy 


(Fig.  228). 


Furthermore,  if  R is  simply  connected  (see  Sec.  10.2),  then  there  exists  at  least  one  point 
(x0,  yo)  in  R such  that  we  have 


(2) 


fix,  y)  dx  dy  = f(x0,  y0)A, 


where  A is  the  area  of  R.  This  is  called  the  mean  value  theorem  for  double  integrals. 


Fig.  228.  Formula  (1) 


Evaluation  of  Double  Integrals 
by  Two  Successive  Integrations 

Double  integrals  over  a region  R may  be  evaluated  by  two  successive  integrations.  We 
may  integrate  first  over  y and  then  over  x.  Then  the  formula  is 


fix,  y)  dx  dy 


/•bp  h(x) 

fix,  y)  dy 

a i g(x) 


(3) 


dx 


(Fig.  229). 
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Here  y = g(x)  and  y = h(x)  represent  the  boundary  curve  of  R (see  Fig.  229)  and,  keeping 
x constant,  we  integrate /(x,  y)  over  y from  g(x)  to  h(x).  The  result  is  a function  of  x,  and 
we  integrate  it  from  x = a to  x = b (Fig.  229). 

Similarly,  for  integrating  first  over  x and  then  over  y the  formula  is 


(4) 


r 

fdr 

<2(2/)  -| 

f(x , y)  dx  dy  = 

fix,  y)  dx 

jR* 

. 

c L 

Vw) 

(Fig.  230). 


Fig.  229.  Evaluation  of  a double  integral  Fig.  230.  Evaluation  of  a double  integral 


The  boundary  curve  of  R is  now  represented  by  x = p(y ) and  x = q(y).  Treating  y as  a 
constant,  we  first  integrate  fix,  y)  over  x from  p(y)  to  q(y)  (see  Fig.  230)  and  then  the 
resulting  function  of  y from  y = c to  y = d. 

In  (3)  we  assumed  that  R can  be  given  by  inequalities  a x g b and  g(x)  = y = h(x). 
Similarly  in  (4)  by  c Si  y Si  d and  p(y)  ^ x ts  q(y).  If  a region  R has  no  such  representation, 
then,  in  any  practical  case,  it  will  at  least  be  possible  to  subdivide  R into  finitely  many 
portions  each  of  which  can  be  given  by  those  inequalities.  Then  we  integrate /(x,  y)  over 
each  portion  and  take  the  sum  of  the  results.  This  will  give  the  value  of  the  integral  of 
fix,  y)  over  the  entire  region  R. 


Applications  of  Double  Integrals 

Double  integrals  have  various  physical  and  geometric  applications.  For  instance,  the  area 
A of  a region  R in  the  xy-plane  is  given  by  the  double  integral 


A = 


dx  dy. 


The  volume  V beneath  the  surface  z = fix,  y)  (>  0)  and  above  a region  R in  the  xy-plane 
is  (Fig.  231) 


V = 


f(x,  y ) dx  dy 


JR 


because  the  term /(x^,  y/JAA^  in  Jn  at  the  beginning  of  this  section  represents  the  volume 
of  a rectangular  box  with  base  of  area  A A^  and  altitude /(x^,  yp-). 
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Fig.  231.  Double  integral  as  volume 


As  another  application,  let/(x,  y)  be  the  density  (=  mass  per  unit  area)  of  a distribution 
of  mass  in  the  xy-plane.  Then  the  total  mass  M in  R is 


M = 


fix,  y)  dx  dy; 


the  center  of  gravity  of  the  mass  in  R has  the  coordinates  x,  y,  where 


1 

Mj 


xf(x,  y)  dx  dy  and  y = 


Mj 


yf(x,  y)  dx  dy; 


the  moments  of  inertia  Ix  and  Iy  of  the  mass  in  R about  the  x-  and  y-axes,  respectively,  are 


U.  = 


y fix,  y)  dx  dy,  L = 


xjix,  y)  dx  dy; 


JRJ  JRJ 

and  the  polar  moment  of  inertia  70  about  the  origin  of  the  mass  in  R is 


h)  Ix  "f"  Iy 


(x2  + y2)f(x,  y)  dx  dy. 


An  example  is  given  below. 


Change  of  Variables  in  Double  Integrals.  Jacobian 

Practical  problems  often  require  a change  of  the  variables  of  integration  in  double  integrals. 
Recall  from  calculus  that  for  a definite  integral  the  formula  for  the  change  from  x to  u is 


(5) 


b 

fix)  dx 


f(x(u)) 


dx 

du 


du. 


Here  we  assume  that  x = x(u)  is  continuous  and  has  a continuous  derivative  in  some 
interval  a u g /3  such  that  x(a)  = a,  x(J3)  = b [or  x(a)  = b,  x((i)  = a ] and  x(u)  varies 
between  a and  b when  u varies  between  a and  /3. 

The  formula  for  a change  of  variables  in  double  integrals  from  x,  y to  u,  v is 


fix,  y)  dx  dy 


JR' 


fixiu,v),yiu,  v)) 


d(x,  y) 
d(u,  v ) 


(6) 


du  du; 
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that  is,  the  integrand  is  expressed  in  terms  of  u and  v,  and  dx  dy  is  replaced  by  du  dv  times 
the  absolute  value  of  the  Jacobian3 


dx 

dx 

d(x,  y) 

du 

dv 

dx  dy 

dx  dy 

d(u,  v) 

dy 

du 

dy 

dv 

du  dv 

dv  du 

Here  we  assume  the  following.  The  functions 

x = x(u , v),  y = y(u,  v ) 

effecting  the  change  are  continuous  and  have  continuous  partial  derivatives  in  some  region 
R*  in  the  wu-plane  such  that  for  every  (w,  v)  in  R*  the  corresponding  point  (x,  y)  lies  in 
R and,  conversely,  to  every  (x,  y)  in  R there  corresponds  one  and  only  one  (u,  v)  in  R*\ 
furthermore,  the  Jacobian  J is  either  positive  throughout  R*  or  negative  throughout  R*. 
For  a proof,  see  Ref.  [GenRef4]  in  App.  1 . 


Change  of  Variables  in  a Double  Integral 

Evaluate  the  following  double  integral  over  the  square  R in  Fig.  232. 


(x2  + y2)  dx  dy 


Solution.  The  shape  of  R suggests  the  transformation  x + y = u,  x — y = v.  Then  x = \{u  + v), 
y = |(w  — v).  The  Jacobian  is 


J = 


d(x,  y) 


1 l 

2 2 


1 _ I 

2 2 


d(u,  V ) 

R corresponds  to  the  square  0 = m = 2,  0 = u = 2.  Therefore, 

r 2 r2 


(xz  + yz)dxdy=  I \ -(i,2  + v2)-dudv  = -. 

1 1 2 2 3 


3Named  after  the  German  mathematician  CARL  GUSTAV  JACOB  JACOBI  (1804-1851),  known  for  his 
contributions  to  elliptic  functions,  partial  differential  equations,  and  mechanics. 
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EXAMPLE  2 


Fig.  233. 

Example  2 


Of  particular  practical  interest  are  polar  coordinates  r and  0 , which  can  be  introduced 
by  setting  x = r cos  0,  y = r sin  0.  Then 


dp,  y) 
d(r,  0) 


cos  0 —r  sin  0 

sin  0 r cos  0 


r 


and 


(8) 


fix,  y)  dx  dy 


JR 


JR' 


f{r  cos  0,  r sin  0)  r dr  d0 


where  R*  is  the  region  in  the  rd-plane  corresponding  to  R in  the  xy-plane. 


Double  Integrals  in  Polar  Coordinates.  Center  of  Gravity.  Moments  of  Inertia 

Let /(a,  y)  = 1 be  the  mass  density  in  the  region  in  Fig.  233.  Find  the  total  mass,  the  center  of  gravity,  and  the 
moments  of  inertia  Ix,  Iy,  /q. 

Solution.  We  use  the  polar  coordinates  just  defined  and  formula  (8).  This  gives  the  total  mass 


M = 


TT/2  r 1 


rdrdd 


77 

4 ' 


The  center  of  gravity  has  the  coordinates 


TT/2  r 1 


r cos  6 r dr  dd  = 


77/2. 


— cos  6 dd  — — = 0.4244 
3 377 


y = — for  reasons  of  symmetry. 
37 7 


The  moments  of  inertia  are 


Ix 


Iy 


77/2  rl  r 77/2  i 

yz  dx  dy  = | | r2  sin2  6 r dr  dd  = J —sin 2 d dd 


o •'o 

77/2  j 


1 ( IT 


-(1  - cos20)c?0  = - ( 0 1 = — 


8 \ 2 


16 


77  . 77 

— for  reasons  of  symmetry,  Iq  = Ix  + Iy  = — 


0.1963 

0.3927. 


Why  are  a'  and  y less  than 


This  is  the  end  of  our  review  on  double  integrals.  These  integrals  will  be  needed  in  this 
chapter,  beginning  in  the  next  section. 
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1.  Mean  value  theorem.  Illustrate  (2)  with  an  example. 


2-8 


DOUBLE  INTEGRALS 


Describe  the  region  of  integration  and  evaluate. 

r2  r 2X 

2.  (x  + y )2  dy  dx 
■'o  ■'x 

3.  f f (xz  + y2)  dx  dy 
^ -v 

4.  Prob.  3,  order  reversed. 

5.  (1  — 2xy)  dy  dx 
■'o  V 

fV 

6.  sinh  (x  + y)  dx  dy 
■'o 

7.  Prob.  6,  order  reversed. 

r 77/4  r COS  JJ 

8.  xz  sin  y dx  dy 

■'o  ■'o 


9-11 


VOLUME 


Find  the  volume  of  the  given  region  in  space. 

9.  The  region  beneath  z = 4x2  + 9v2  and  above  the 
rectangle  with  vertices  (0,  0),  (3,  0),  (3,  2),  (0,  2)  in  the 
xv-plane. 


10.  The  first  octant  region  bounded  by  the  coordinate  planes 
and  the  surfaces  y = 1 — x2,  z = 1 — x2.  Sketch  it. 

11.  The  region  above  the  xy-plane  and  below  the  parabo- 
loid z = 1 — (x2  + y2). 


17-20 


MOMENTS  OF  INERTIA 


Find  Ix,  Iy,  I0  of  a mass  of  density /(x,  y)  = 1 in  the  region 
R in  the  figures,  which  the  engineer  is  likely  to  need,  along 
with  other  profiles  listed  in  engineering  handbooks. 


17.  R as  in  Prob.  13. 


18.  R as  in  Prob.  12. 
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10.4  Greens  Theorem  in  the  Plane 

Double  integrals  over  a plane  region  may  be  transformed  into  line  integrals  over  the 
boundary  of  the  region  and  conversely.  This  is  of  practical  interest  because  it  may  simplify 
the  evaluation  of  an  integral.  It  also  helps  in  theoretical  work  when  we  want  to  switch  from 
one  kind  of  integral  to  the  other.  The  transformation  can  be  done  by  the  following  theorem. 


THEOREM  1 


Green’s  Theorem  in  the  Plane4 

(Transformation  between  Double  Integrals  and  Line  Integrals) 

Let  R be  a closed  bounded  region  (see  Sec.  10.3)  in  the  xy-plane  whose  boundary 
C consists  of  finitely  many  smooth  cumes  (see  Sec.  10.1).  Let  Ffx,  y)  and  F2(x,  y) 
be  functions  that  are  continuous  and  have  continuous  partial  derivatives  dF\/ By 
and  dF2/dx  everywhere  in  some  domain  containing  R.  Then 


(1) 


d F2  dF  i\ 

— — ) dx  dy  = <>(Fi  dx  + F2  dy). 

dx  dy  J J 


F[ere  we  integrate  along  the  entire  boundary  C of  R in  such  a sense  that  R is  on 
the  left  as  we  advance  in  the  direction  of  integration  (see  Fig.  234). 


Fig.  234.  Region  R whose  boundary  C consists  of  two  parts: 

C,  is  traversed  counterclockwise,  while  C2  is  traversed  clockwise 
in  such  a way  that  R is  on  the  left  for  both  curves 

Setting  F = [Fi,  F2]  = Fii  + F2 j and  using  (1)  in  Sec.  9.9,  we  obtain  (1)  in  vectorial 
form. 


d') 


(curl  F)  • k dx  dy  = oF  • dr. 

c 


The  proof  follows  after  the  first  example.  For  $ see  Sec.  10.1. 


4GEORGE  GREEN  (1793-1841),  English  mathematician  who  was  self-educated,  started  out  as  a baker,  and 
at  his  death  was  fellow  of  Caius  College,  Cambridge.  His  work  concerned  potential  theory  in  connection  with 
electricity  and  magnetism,  vibrations,  waves,  and  elasticity  theory.  It  remained  almost  unknown,  even  in  England, 
until  after  his  death. 

A “domain  containing  R”  in  the  theorem  guarantees  that  the  assumptions  about  Fi  and  F2  at  boundary  points 
of  R are  the  same  as  at  other  points  of  R. 
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Verification  of  Green’s  Theorem  in  the  Plane 

Green’s  theorem  in  the  plane  will  be  quite  important  in  our  further  work.  Before  proving  it,  let  us  get  used  to 
it  by  verifying  it  for  Fi  = y2  — 7_y.  F2  = 2xy  + 2x  and  C the  circle  x2  + y2  = 1 . 

Solution.  In  (1)  on  the  left  we  get 


[(2 y + 2)  — (2y  — 7)]  dx  dy  = 9 \dxdy  = 977 


since  the  circular  disk  R has  area  77. 

We  now  show  that  the  line  integral  in  (1)  on  the  right  gives  the  same  value,  977.  We  must  orient  C 
counterclockwise,  say,  r(r)  = [cos  /,  sin  f].  Then  r'(f)  = [—sin  t,  cos  7],  and  on  C, 


F\  = y2  — ly  = sin2  1 — 7 sin  t,  F2  = 2xy  + 2x  = 2 cos  t sin  t + 2 cos  t. 


Hence  the  line  integral  in  (1)  becomes,  verifying  Green’s  theorem. 


ty(F\X  + F2y')  dt  = [(sin2  t — 1 sin  f)(— sin  t ) + 2(cos  ! sin  l + cos  ()(cos  f)]  dt 

Jc  A) 


(—sin3  / + 7 sin2  t + 2 cos2  t sin  t + 2 cos2  f)  dt 


= 0 + 777  — 0 + 277  = 977. 


We  prove  Green’s  theorem  in  the  plane,  first  for  a special  region  R that  can  be  represented 
in  both  forms 


a x Si  b. 

u(x)  g y g u(x) 

(Fig.  235) 

c Si  j g d. 

p(y)  g x g q(y) 

(Fig.  236) 

Fig.  235.  Example  of  a special  region  Fig.  236.  Example  of  a special  region 


Using  (3)  in  the  last  section,  we  obtain  for  the  second  term  on  the  left  side  of  (1)  taken 
without  the  minus  sign 


(2) 


dFy 

fb 

r(x)  dFi  i 

dx  dy  = 

dy 

dy  J 

a 

. 

dy 

u(x)  J 

(see  Fig.  235). 
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(The  first  term  will  be  considered  later.)  We  integrate  the  inner  integral: 


v(x) 


Ju(x ) 


dFi 

dy 


dy  = F±  (x,  y) 


y=v(x ) 


= F i [x,  u(x)]  — F1  [x,  m(x)]. 


y=uix) 


By  inserting  this  into  (2)  we  find  (changing  a direction  of  integration) 


dFi 

dy 


rb 


dx  dy  = 


rb 


F i [x,  u(x)]  dx 


F i [x,  m(x)]  dx 


rb 


Fx  [x,  u(x)]  dx  — 


Jb 


F i [x,  u(x)]  dx. 


Since  y = v(x)  represents  the  curve  C**  (Fig.  235)  and  y = u(x)  represents  C*,  the  last 
two  integrals  may  be  written  as  line  integrals  over  C**  and  C*  (oriented  as  in  Fig.  235); 
therefore. 


(3) 


dFi 

dy 


dx  dy  = 


F i(x,  y)  dx 


Fi(x,  y)  dx 


Jc* 


= —oF1(x,y)dx. 
c 


This  proves  (1)  in  Green’s  theorem  if  F2  = 0. 

The  result  remains  valid  if  C has  portions  parallel  to  the  v-axis  (such  as  C and  C in 
Fig.  237).  Indeed,  the  integrals  over  these  portions  are  zero  because  in  (3)  on  the  right  we 
integrate  with  respect  to  x.  Hence  we  may  add  these  integrals  to  the  integrals  over  C*  and 
C**  to  obtain  the  integral  over  the  whole  boundary  C in  (3). 

We  now  treat  the  first  term  in  (1)  on  the  left  in  the  same  way.  Instead  of  (3)  in  the  last 
section  we  use  (4),  and  the  second  representation  of  the  special  region  (see  Fig.  236). 
Then  (again  changing  a direction  of  integration) 


dF2 

fd 

(qWdF2  I 

dx  dy  = 

dx 

C 

. 

dx 

dx 

p(y)  J 

dy 


rd 


Fz(q(y),  y ) dy  + 


F2(.p(y),  y ) dy 


Jd 


°F2(x,  y)  dy. 
c 


Together  with  (3)  this  gives  (1)  and  proves  Green’s  theorem  for  special  regions. 

We  now  prove  the  theorem  for  a region  R that  itself  is  not  a special  region  but  can  be 
subdivided  into  finitely  many  special  regions  as  shown  in  Fig.  238.  In  this  case  we  apply 
the  theorem  to  each  subregion  and  then  add  the  results;  the  left-hand  members  add  up  to 
the  integral  over  R while  the  right-hand  members  add  up  to  the  line  integral  over  C plus 
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EXAMPLE  3 


6- 


X 


Fig.  237.  Proof  of  Green’s  theorem 


Fig.  238.  Proof  of  Green’s  theorem 


integrals  over  the  curves  introduced  for  subdividing  R.  The  simple  key  observation  now 
is  that  each  of  the  latter  integrals  occurs  twice,  taken  once  in  each  direction.  Hence  they 
cancel  each  other,  leaving  us  with  the  line  integral  over  C. 

The  proof  thus  far  covers  all  regions  that  are  of  interest  in  practical  problems.  To  prove 
the  theorem  for  a most  general  region  R satisfying  the  conditions  in  the  theorem,  we  must 
approximate  R by  a region  of  the  type  just  considered  and  then  use  a limiting  process. 
For  details  of  this  see  Ref.  [GenRef4]  in  App.  1 . 


Some  Applications  of  Green’s  Theorem 

Area  of  a Plane  Region  as  a Line  Integral  Over  the  Boundary 

In  (1)  we  first  choose  Fi  = 0,  — x and  then  F±  = —y,  F 2 = 0.  This  gives 


dxdy  = <p  x dy  and  | | dx  dy  = 


respectively.  The  double  integral  is  the  area  A of  R.  By  addition  we  have 


(4) 


A = — <p  (x  dy  — y dx) 


\ydx 


where  we  integrate  as  indicated  in  Green’s  theorem.  This  interesting  formula  expresses  the  area  of  R in  terms 
of  a line  integral  over  the  boundary.  It  is  used,  for  instance,  in  the  theory  of  certain  planimeters  (mechanical 
instruments  for  measuring  area).  See  also  Prob.  11. 

For  an  ellipse  x2/a 2 + y2/b2  = 1 or  x = a cos  t,  y = b sin  t we  get  x = — a sin  t,y'  = b cos  t\  thus  from 
(4)  we  obtain  the  familiar  formula  for  the  area  of  the  region  bounded  by  an  ellipse, 


1 /•  277  1 f217 

A = — I (xyr  — yx  ) dt  = — [ab  cos2  t — (~ab  sin2  /)]  dt  = irab. 

2 1 2 L 


Area  of  a Plane  Region  in  Polar  Coordinates 

Let  r and  6 be  polar  coordinates  defined  by  j k = r cos  6,y  — r sin  6.  Then 


dx  = cos  6 dr  — r sin  6 dO,  dy  = sin  6 dr  + r cos  6 dd. 
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EXAMPLE  4 


and  (4)  becomes  a formula  that  is  well  known  from  calculus,  namely. 


(5) 


[>  r2de. 
c 


As  an  application  of  (5),  we  consider  the  cardioid  r = a{  1 — cos  6),  where  0^0^  2rr  (Fig.  239).  We  find 


A = 


(1  — cos  Oydd  = 


37 T 


Transformation  of  a Double  Integral  of  the  Laplacian  of  a Function 
into  a Line  Integral  of  Its  Normal  Derivative 

The  Laplacian  plays  an  important  role  in  physics  and  engineering.  A first  impression  of  this  was  obtained  in 
Sec.  9.7,  and  we  shall  discuss  this  further  in  Chap.  12.  At  present,  let  us  use  Green’s  theorem  for  deriving  a 
basic  integral  formula  involving  the  Laplacian. 

We  take  a function  w(x,  y)  that  is  continuous  and  has  continuous  first  and  second  partial  derivatives  in  a 
domain  of  the  xy-plane  containing  a region  R of  the  type  indicated  in  Green’s  theorem.  We  set  = —dw/dy 
and  F2  = dw/dx.  Then  dF^/dy  and  dF^dx  are  continuous  in  R,  and  in  (1)  on  the  left  we  obtain 


(6) 


dF2  BFi  d2w  d2w 

dx  dy  dx2  dy2 


the  Laplacian  of  w (see  Sec.  9.7).  Furthermore,  using  those  expressions  for  Fi  and  F2,  we  get  in  (1)  on  the  right 


(7) 


\{F1dx  + F2dy)  = 
c 


dw  dx 
dy  ds 


dw  dy\  , 

I ds 

dx  ds) 


where  s is  the  arc  length  of  C,  and  C is  oriented  as  shown  in  Fig.  240.  The  integrand  of  the  last  integral  may 
be  written  as  the  dot  product 


(8) 


(grad  w)  • n 


’ dw  dw' 

dy  dx' 

dx  ’ dy 

ds  ds 

dw  dy 
dx  ds 


dw  dx 
dy  ds 


The  vector  n is  a unit  normal  vector  to  C,  because  the  vector  r (5)  = dr/ds  = [dx/ds,  dy/ ds]  is  the  unit  tangent 
vector  of  C,  and  r'  • n = 0,  so  that  n is  perpendicular  to  r . Also,  n is  directed  to  the  exterior  of  C because  in 
Fig.  240  the  positive  v-component  dx/ds  of  r'  is  the  negative  y-component  of  n,  and  similarly  at  other  points.  From 
this  and  (4)  in  Sec.  9.7  we  see  that  the  left  side  of  (8)  is  the  derivative  of  w in  the  direction  of  the  outward  normal 
of  C.  This  derivative  is  called  the  normal  derivative  of  w and  is  denoted  by  dw/dn;  that  is,  dw/dn  = (grad  w)  • n. 
Because  of  (6),  (7),  and  (8),  Green’s  theorem  gives  the  desired  formula  relating  the  Laplacian  to  the  normal  derivative. 


(9) 


V2  w dx  dy 


For  instance,  w = x2  — y2  satisfies  Laplace’s  equation  V2w  = 0.  Hence  its  normal  derivative  integrated  over  a closed 
curve  must  give  0.  Can  you  verify  this  directly  by  integration,  say,  for  the  square  0 = v 1,  0 = y = 1? 


Fig.  239.  Cardioid 


Fig.  240  Example  4 
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Green’s  theorem  in  the  plane  can  be  used  in  both  directions,  and  thus  may  aid  in  the 
evaluation  of  a given  integral  by  transforming  the  given  integral  into  another  integral  that 
is  easier  to  solve.  This  is  illustrated  further  in  the  problem  set.  Moreover,  and  perhaps 
more  fundamentally.  Green’s  theorem  will  be  the  essential  tool  in  the  proof  of  a very 
important  integral  theorem,  namely,  Stokes’s  theorem  in  Sec.  10.9. 


PR  QBLE  M SETTO  4 


1-10 


LINE  INTEGRALS:  EVALUATION 
BY  GREEN’S  THEOREM 


Evaluate  F(r)  • dr  counterclockwise  around  the  boundary 

c 

C of  the  region  R by  Green’s  theorem,  where 

1.  F = [y,  — x],  C the  circle  x2  + y2  = 1/4 

2.  F = [6y2,  2x  — 2 y ],  R the  square  with  vertices 

± (2,  2),  ±(2,  -2) 


3.  F = [x2ey,  y2ex],  R the  rectangle  with  vertices  (0,  0), 
(2,  0),  (2,  3),  (0,  3) 

4.  F = [x  cosh  2y,  2x2  sinh  2y],  R:  x2  £ y £ x 

5.  F = [x2  + y2,  x2  - y\  R:  1 £ y £ 2 — x2 

6.  F = [cosh  y,  —sinh  x],  R:  1 £ x £ 3,  x £ y £ 3x 

7.  F = grad  (x3  cos2  (xy)),  R as  in  Prob.  5 

8.  F = [— e_xcosy,  — e_xsiny],  R the  semidisk 
x2  + y2  £ 16,  x £ 0 

9.  F = [ey,x,  ev\nx  + 2x],  R:  1 + x4  S y S 2 

10.  F = [x2y2,  -x/y2],  ftlSx2  + y2S4,xS0, 
y a x.  Sketch  R. 

11.  CAS  EXPERIMENT.  Apply  (4)  to  figures  of  your 
choice  whose  area  can  also  be  obtained  by  another 
method  and  compare  the  results. 


12.  PROJECT.  Other  Forms  of  Green’s  Theorem  in 
the  Plane.  Let  R and  C be  as  in  Green’s  theorem,  r' 
a unit  tangent  vector,  and  n the  outer  unit  normal  vector 
of  C (Fig.  240  in  Example  4).  Show  that  (1)  may  be 
written 


where  k is  a unit  vector  perpendicular  to  the  xy-plane. 
Verify  (10)  and  (11)  for  F = [7x,  — 3y]  and  C the  circle 
x2  + y2  = 4 as  well  as  for  an  example  of  your  own 
choice. 


13-17 


INTEGRAL 

OF  THE  NORMAL  DERIVATIVE 


Using  (9),  find  the  value  of  '' w ds  taken  counterclockwise 

c dn 

over  the  boundary  C of  the  region  R. 

13.  w = cosh  x,  R the  triangle  with  vertices  (0,  0),  (4,  2), 

(0,  2). 

14.  w = x2y  + xy2,  R:  x2  + y2  £ 1,  x £ 0,  y £ 0 

15.  w = ex  cos  y + xy3,  R:  1 £ y £ 10  — x2,  x £ 0 

16.  W = x2  + y2,  C:  x2  + y2  = 4.  Confirm  the  answer 
by  direct  integration. 

17.  w = x3  - y3,  0 S y S x2,  |x|  S 2 

18.  Laplace’s  equation.  Show  that  for  a solution  w(x,  y) 
of  Laplace’s  equation  V2w  = 0 in  a region  R with 
boundary  curve  C and  outer  unit  normal  vector  n, 


(12) 


dx  dy 


LvdWds. 

X dn 


(10) 


div  F dx  dy 


Q>  F • n ds 

'c 


or 


(ID 


(curl  F)  • k dx  dy  = 


PF’r'ds 

c 


19.  Show  that  w = exsiny  satisfies  Laplace’s  equation 
V2vv  = 0 and,  using  (12),  integrate  w(dw/dn ) counter- 
clockwise around  the  boundary  curve  C of  the  rectangle 
0SxS2,  0SyS5. 

20.  Same  task  as  in  Prob.  19  when  w = x2  + y2  and  C 
the  boundary  curve  of  the  triangle  with  vertices  (0,  0), 
(1,  0),  (0,  1). 
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10.5  Surfaces  for  Surface  Integrals 

Whereas,  with  line  integrals,  we  integrate  over  curves  in  space  (Secs.  10.1,  10.2),  with 
surface  integrals  we  integrate  over  surfaces  in  space.  Each  curve  in  space  is  represented 
by  a parametric  equation  (Secs.  9.5,  10.1).  This  suggests  that  we  should  also  find 
parametric  representations  for  the  surfaces  in  space.  This  is  indeed  one  of  the  goals  of 
this  section.  The  surfaces  considered  are  cylinders,  spheres,  cones,  and  others.  The 
second  goal  is  to  learn  about  surface  normals.  Both  goals  prepare  us  for  Sec.  10.6  on 
surface  integrals.  Note  that  for  simplicity,  we  shall  say  “surface”  also  for  a portion  of 
a surface. 


Representation  of  Surfaces 

Representations  of  a surface  S in  xyz-space  are 

(1)  z=f(x,y ) or  g(x,  y,  z)  = 0. 

For  example,  z = +V a2  — x2  — y2  or  x2  + y2  + z2  — a2  = 0 (z  § 0)  represents  a 
hemisphere  of  radius  a and  center  0. 

Now  for  curves  C in  line  integrals,  it  was  more  practical  and  gave  greater  flexibility  to 
use  a parametric  representation  r = r(f),  where  a Si  t g b.  This  is  a mapping  of  the 
interval  a Si  t Si  b,  located  on  the  r-axis,  onto  the  curve  C (actually  a portion  of  it)  in 
xyz-space.  It  maps  every  t in  that  interval  onto  the  point  of  C with  position  vector  r(r). 
See  Fig.  241A. 

Similarly,  for  surfaces  S in  surface  integrals,  it  will  often  be  more  practical  to  use  a 
parametric  representation.  Surfaces  are  two-dimensional.  Hence  we  need  two  parameters, 


(i-axis) 


(A)  Curve 

Fig.  241.  Parametric  representations  of  a curve  and  a surface 
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EXAMPLE  2 


which  we  call  u and  v.  Thus  a parametric  representation  of  a surface  S in  space  is  of 
the  form 

(2)  r(n,  v)  = [x(u,  v),  y(u,  v ),  z(u,  u)]  = x(u , u)i  + y(u,  u)j  + z(u,  u)k 

where  ( u , v ) varies  in  some  region  R of  the  wu-plane.  This  mapping  (2)  maps  every  point 
(u,  v)  in  R onto  the  point  of  S with  position  vector  r (u,  v).  See  Fig.  241B. 

Parametric  Representation  of  a Cylinder 

The  circular  cylinder  x2  + y2  = a2,  — 1 S z S 1,  has  radius  a , height  2,  and  the  7-axis  as  axis.  A parametric 
representation  is 


r (u,  v ) = [a  cos  u,  a sin  u,  v]  = a cos  ui  + a sin  u j + uk  (Fig.  242). 

The  components  of  r are  x = a cos  u,y  = a sin  it,  z = V.  The  parameters  u,  v vary  in  the  rectangle  R'.  0 = u = 
In,  — 1 S V £ lin  the  ac-plane.  The  curves  u = const  are  vertical  straight  lines.  The  curves  0 = const  are 
parallel  circles.  The  point  P in  Fig.  242  corresponds  to  u = tt/3  = 60°,  V = 0.7. 

2 


(U=  1) 


(u  = 0) 


y 

(u  = -l) 


Fig.  242.  Parametric  representation 
of  a cylinder 

Parametric  Representation  of  a Sphere 

A sphere  x2  + y2  + z2  — a2  can  be  represented  in  the  form 

(3)  r (u,  v)  = a cos  v cos  u\  + a cos  v sin  uj  + a sin  vk 

where  the  parameters  u,  v vary  in  the  rectangle  R in  the  wu-plane  given  by  the  inequalities  0 ^ u ^ 277, 

— 77/2  = v = 77/2.  The  components  of  r are 

x = a cos  v cos  u,  y = a cos  v sin  u,  z = a sin  u. 

The  curves  u = const  and  v = const  are  the  “meridians”  and  “parallels”  on  S (see  Fig.  243).  This  representation 

is  used  in  geography  for  measuring  the  latitude  and  longitude  of  points  on  the  globe. 

Another  parametric  representation  of  the  sphere  also  used  in  mathematics  is 

(3*)  r (u,  v)  = a cos  u sin  v i + a sin  u sin  v j + a cos  v k 


z 


where  0 ^ u ^ 27 7,  0 ^ v ^ 77. 


SEC.  10.5  Surfaces  for  Surface  Integrals 


441 


EXAMPLE  3 


Parametric  Representation  of  a Cone 

A circular  cone  z — \/ x2  + y2,  0 ^ t ^ H can  be  represented  by 

r («,  v ) = [u  cos  v , u sin  v,  u\  = u cos  ui  + u sin  v j + u k, 

in  components  x = u cos  v,  y = u sin  v,  z — u.  The  parameters  vary  in  the  rectangle  R:  0 ^ u = H,  0 = v = 277. 
Check  that  x2  + y2  = z2,  as  it  should  be.  What  are  the  curves  u = const  and  v = const? 


Tangent  Plane  and  Surface  Normal 

Recall  from  Sec.  9.7  that  the  tangent  vectors  of  all  the  curves  on  a surface  S through  a point 
P of  S form  a plane,  called  the  tangent  plane  of  S at  P (Fig.  244).  Exceptions  are  points  where 
S has  an  edge  or  a cusp  (like  a cone),  so  that  S cannot  have  a tangent  plane  at  such  a point. 
Furthermore,  a vector  perpendicular  to  the  tangent  plane  is  called  a normal  vector  of  S at  P. 

Now  since  S can  be  given  by  r = r(n,  v)  in  (2),  the  new  idea  is  that  we  get  a curve  C 
on  S by  taking  a pair  of  differentiable  functions 


u = n(t),  v = v(t) 

whose  derivatives  u'  = du/dt  and  v'  = dv/dt  are  continuous.  Then  C has  the  position 
vector  r (t)  = r(w(f),  v(t)).  By  differentiation  and  the  use  of  the  chain  rule  (Sec.  9.6)  we 
obtain  a tangent  vector  of  C on  S 


r'(r) 


dr  dr  , dr  , 

— = LI  H V . 

dt  du  dv 


Hence  the  partial  derivatives  ru  and  r,,  at  P are  tangential  to  S at  P.  We  assume  that  they 
are  linearly  independent,  which  geometrically  means  that  the  curves  u = const  and 
v = const  on  S intersect  at  P at  a nonzero  angle.  Then  ru  and  r,,  span  the  tangent  plane 
of  S at  P.  Hence  their  cross  product  gives  a normal  vector  N of  S at  P. 

(4)  N = ru  x r„  ^ 0. 


The  corresponding  unit  normal  vector  n of  S at  P is  (Fig.  244) 


(5) 


n = 7 — rN  = t 

N r„  x r„ 


Fig.  244.  Tangent  plane  and  normal  vector 
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Also,  if  S is  represented  by  g(x,  y,  z)  = 0,  then,  by  Theorem  2 in  Sec.  9.7, 

(5*)  n = -grad  g. 

I grad  g I 

A surface  S is  called  a smooth  surface  if  its  surface  normal  depends  continuously  on 
the  points  of  S. 

S is  called  piecewise  smooth  if  it  consists  of  finitely  many  smooth  portions. 

For  instance,  a sphere  is  smooth,  and  the  surface  of  a cube  is  piecewise  smooth 
(explain!).  We  can  now  summarize  our  discussion  as  follows. 


THEOREM  1 


Tangent  Plane  and  Surface  Normal 

If  a surface  S is  given  by  (2)  with  continuous  ru  = dr/du  and  rv  = dr/dv  satisfying 
(4)  at  every  point  of  S,  then  S has,  at  every  point  P,  a unique  tangent  plane  passing 
through  P and  spanned  by  ru  and  rv,  and  a unique  normal  whose  direction  depends 
continuously  on  the  points  ofS.  A normal  vector  is  given  by  (4)  and  the  corresponding 
unit  normal  vector  by  (5).  (See  Fig.  244.) 


EXAMPLE  4 


Unit  Normal  Vector  of  a Sphere 

From  (5*)  we  find  that  the  sphere  g(x,  y,  z)  = x2  + y2  + z2  — a2  = 0 has  the  unit  normal  vector 


n(*,  y,  z) 


x y z 
a’  a'  a 


x . 

a 


z 

a 


k. 


We  see  that  n has  the  direction  of  the  position  vector  [x,  y,  z]  of  the  corresponding  point.  Is  it  obvious  that  this 
must  be  the  case? 


EXAMPLE  5 Unit  Normal  Vector  of  a Cone 

At  the  apex  of  the  cone  g(x,  y,  z)  — ~~z  + Vx2  + y2  = 0 in  Example  3,  the  unit  normal  vector  n becomes 
undetermined  because  from  (5*)  we  get 


x y -1  ' 

. Vl(x2  + y2)  ’ V2(x2  + y2)’  VZ. 


i / x y 

vAv^T/1  + v?T7J  “ k 


We  are  now  ready  to  discuss  surface  integrals  and  their  applications,  beginning  in  the  next 
section. 


PROBCE  M~SE  T~1Q~S 


PARAMETRIC  SURFACE  REPRESENTATION 

Familiarize  yourself  with  parametric  representations  of 
important  surfaces  by  deriving  a representation  (1),  by 
finding  the  parameter  curves  (curves  u — const  and 
v = const)  of  the  surface  and  a normal  vector  N = ru  X r„ 
of  the  surface.  Show  the  details  of  your  work. 

1.  xy-plane  r (u,  v ) = (u,  v ) (thus  u i + uj;  similarly  in 
Probs.  2-8). 

2.  xy-plane  in  polar  coordinates  r(u,  v ) = [u  cos  u,  u sin  v] 
(thus  u = r,  v = 6) 


3.  Cone  r(n,  v)  = [u  cos  v,  u sin  v,  cu] 

4.  Elliptic  cylinder  r(n,  v)  = [a  cos  v,  b sin  v,  u\ 

5.  Paraboloid  of  revolution  r(r<,  v ) = [u  cos  v,  u sin  v, 
u2] 

6.  Helicoid  r{u,  v)  = [«  cos  p,  u sin  v,  v].  Explain  the 
name. 

7.  Ellipsoid  r(u,  v)  = [a  cos  v cos  u,  b cos  v sin  u, 
c sin  u] 

8.  Hyperbolic  paraboloid  r (u,  v)  = [au  cosh  v, 
bu  sinh  v,  u2] 
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9.  CAS  EXPERIMENT.  Graphing  Surfaces,  Depen- 
dence on  a , b,  c.  Graph  the  surfaces  in  Probs.  3-8.  In 
Prob.  6 generalize  the  surface  by  introducing  parame- 
ters a,  b.  Then  find  out  in  Probs.  4 and  6-8  how  the 
shape  of  the  surfaces  depends  on  a,  b,  c. 

10.  Orthogonal  parameter  curves  u = const  and 
v = const  on  r (u,  v)  occur  if  and  only  if  ru  • r„  = 0. 
Give  examples.  Prove  it. 

11.  Satisfying  (4).  Represent  the  paraboloid  in  Prob.  5 so 
that  N(0,  0)  + 0 and  show  N. 

12.  Condition  (4).  Find  the  points  in  Probs.  1-8  at  which 
(4)  N + 0 does  not  hold.  Indicate  whether  this  results 
from  the  shape  of  the  surface  or  from  the  choice  of  the 
representation. 

13.  Representation  z = f(x,  y).  Show  that  z = f(x,  y)  or 
g = z ~ f(x,  y)  = 0 can  be  written  {fu  = df/du,  etc.) 

^ r (u,  v)  = [u,  v,  f(u,  u)]  and 

N = gradg  = [-/„,  1], 


14-19 


DERIVE  A PARAMETRIC 
REPRESENTATION 


Find  a normal  vector.  The  answer  gives  one  representation; 
there  are  many.  Sketch  the  surface  and  parameter  curves. 

14.  Plane  Ax  + 3y  + 2z  = 12 

15.  Cylinder  of  revolution  (x  — 2)2  + {y  + l)2  = 25 

16.  Ellipsoid  x2  + y2  + g z2  = 1 

17.  Sphere  *2  + (y  + 2.8)2  + (z  ~ 3.2 )2  = 2.25 

18.  Elliptic  cone  z = Vr2  + 4y2 

19.  Hyperbolic  cylinder  x2  — y2  = 1 

20.  PROJECT.  Tangent  Planes  T(P)  will  be  less 
important  in  our  work,  but  you  should  know  how  to 
represent  them. 

(a)  If  S:  r (u,  v),  then  T(P)\  (r*  — r ru  r„)  = 0 
(a  scalar  triple  product)  or 

r*(p,  q)  = r (P)  + pru(P)  + qrv(P). 

(b)  If  S:  g(x,  y,  z ) = 0,  then 

T(P):  (r*  - r(P))  • Vg  = 0. 

(c)  If  S',  z = f{x,  y),  then 

T(P):  z*~z  = ( x * - x)fx(P)  + (y*  - y)fy(P). 
Interpret  (a)— (c)  geometrically.  Give  two  examples  for 
(a),  two  for  (b),  and  two  for  (c). 


10.6  Surface  Integrals 


To  define  a surface  integral,  we  take  a surface  S,  given  by  a parametric  representation  as 
just  discussed, 


(1)  r(u,  v)  = [x(u,  v),y(u,  v),  ziu,  u)]  = x(u,  u)i  + y{u,  u)j  + z(u,  u)k 


where  ( u , v)  varies  over  a region  R in  the  wu-plane.  We  assume  S to  be  piecewise  smooth 
(Sec.  10.5),  so  that  S has  a normal  vector 

(2)  N = ru  X r„  and  unit  normal  vector  n = ——  N 

|N| 

at  every  point  (except  perhaps  for  some  edges  or  cusps,  as  for  a cube  or  cone).  For  a given 
vector  function  F we  can  now  define  the  surface  integral  over  S by 


(3) 


F • n dA  = 


F (r  (w,  v ))  • N (w,  v ) du  dv. 


Here  N = |N|n  by  (2),  and  |N|  = \ru  X r„  is  the  area  of  the  parallelogram  with  sides 
ru  and  r„,  by  the  definition  of  cross  product.  Hence 

(3*)  n dA  = n|N|  du  dv  = N du  dv. 

And  we  see  that  dA  = N|<:/h  dv  is  the  element  of  area  of  S. 
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Also  F • n is  the  normal  component  of  F.  This  integral  arises  naturally  in  flow  problems, 
where  it  gives  the  flux  across  S when  F = pv.  Recall,  from  Sec.  9.8,  that  the  flux  across 
S is  the  mass  of  fluid  crossing  S per  unit  time.  Furthermore,  p is  the  density  of  the  fluid 
and  v the  velocity  vector  of  the  flow,  as  illustrated  by  Example  1 below.  We  may  thus 
call  the  surface  integral  (3)  the  flux  integral. 

We  can  write  (3)  in  components,  using  F = [F1;  F2,  F3],  N = [/Vj . /V2,  N3], 

and  n = [cos  a,  cos  [3,  cos  y],  Here,  a,  (3,  y are  the  angles  between  n and  the  coordinate 
axes;  indeed,  for  the  angle  between  n and  i,  formula  (4)  in  Sec.  9.2  gives  cos  a = 
n • i/|n|  |i|  = n • i,  and  so  on.  We  thus  obtain  from  (3) 


(4) 


F • n dA 


(F i cos  a + F2  cos  j3  + F3  cos  y)  dA 


R 


(F\N-[  + F2N2  + F3N3)  du  dv. 


In  (4)  we  can  write  cos  a dA  = dy  dz,  cos  f3  dA  = dz  dx,  cos  y dA  = dx  dy.  Then  (4) 
becomes  the  following  integral  for  the  flux: 


(5) 


F • n dA 


(/■’i  dy  dz  + F2  dz  dx  + F3  dx  dy). 


We  can  use  this  formula  to  evaluate  surface  integrals  by  converting  them  to  double  integrals 
over  regions  in  the  coordinate  planes  of  the  xyz-coordinate  system.  But  we  must  carefully 
take  into  account  the  orientation  of  S (the  choice  of  n).  We  explain  this  for  the  integrals 
of  the  f 3-terms, 


(5') 


F3  cos  y dA  = 


F3  dx  dy. 


s 


s 


If  the  surface  S is  given  by  z = h(x,  y)  with  (x,  y)  varying  in  a region  R in  the  xy-plane, 
and  if  S is  oriented  so  that  cos  y > 0,  then  (5  ) gives 


(5") 


F3 cos  y dA 


+ 


F3(x,  v,  h (x,  y))  dx  dy. 


R 


But  if  cos  y < 0,  the  integral  on  the  right  of  (5”)  gets  a minus  sign  in  front.  This  follows 
if  we  note  that  the  element  of  area  dx  dy  in  the  xy-plane  is  the  projection  |cos  y\  dA 
of  the  element  of  area  dA  of  S',  and  we  have  cos  y = +|cos  y\  when  cos  y > 0,  but 
cos  y = — | cos  y \ when  cos  y < 0.  Similarly  for  the  other  two  terms  in  (5).  At  the  same 
time,  this  justifies  the  notations  in  (5). 

Other  forms  of  surface  integrals  will  be  discussed  later  in  this  section. 


Flux  Through  a Surface 

Compute  the  flux  of  water  through  the  parabolic  cylinder  S:  y = x2,  0 = x ~ 2,  ()  = z = 3 (Fig.  245)  if  the 
velocity  vector  is  v = F = [3z  , 6,  6xz  I , speed  being  measured  in  meters/sec.  (Generally,  F = pv,  but  water 

has  the  density  p = 1 g/cirr  = 1 ton/m  .) 
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EXAMPLE  2 


Fig.  245.  Surface  5 in  Example  1 

Solution.  Writing  x = u and  z = V,  we  have  y = x2  = u2.  Hence  a representation  of  S is 

S:  r = [«,  »2,  u]  (0  s « £ 2,  0 s t;  S 3). 

By  differentiation  and  by  the  definition  of  the  cross  product, 

N = rw  x ry  = [1,  2 u,  0]  x [0,  0,  1]  = [2 u,  -1,  0]. 

On  S,  writing  simply  F (S)  for  F[r(w,  y)],  we  have  F(S)  = \3v2,  6,  6uv].  Hence  F(S)  • N = 6 uv2  — 6.  By 
integration  we  thus  get  from  (3)  the  flux 


F • n dA  = 


(6 uv  — 6)  dudv  = (3 u v — 6 u) 


o Jo 

3 

= I (\2v2  - 12)  dv  = (4v3  - \2v) 


dv 


= 108  - 36  = 72  [m3/ sec] 


or  72,000  liters/sec.  Note  that  the  y-component  of  F is  positive  (equal  to  6),  so  that  in  Fig.  245  the  flow  goes 
from  left  to  right. 

Let  us  confirm  this  result  by  (5).  Since 

N = |N|n  = |N  | [cos  a,  cos  (3,  cosy]  = [2  u,  — 1,  0]  = [2x,  —1,  0] 

we  see  that  cos  a > 0,  cos  f3  < 0,  and  cos  y = 0.  Hence  the  second  term  of  (5)  on  the  right  gets  a minus  sign, 
and  the  last  term  is  absent.  This  gives,  in  agreement  with  the  previous  result, 


F • n dA  = 


3z  dydz  ~ 


6 dzdx=  4(3 z*)dz~  6 • 3 dx  = 4 • 3 — 6 • 3 • 2 = 72. 


Surface  Integral 

Evaluate  (3)  when  F = [x2,  0,  3y2]  and  S is  the  portion  of  the  plane  x + y + z — lin  the  first  octant  (Fig.  246). 

Solution.  Writing  x = u and  y = v,  we  have  z = 1 — x — y = 1 — u — v.  Hence  we  can  represent  the 
plane  x + y + z=  lin  the  form  r(w,  v)  = [u,  v,  1 — u — v].  We  obtain  the  first-octant  portion  S of  this  plane 
by  restricting  x = u and  y = v to  the  projection  R of  S in  the  xy-plane.  R is  the  triangle  bounded  by  the  two 
coordinate  axes  and  the  straight  line  x + y = 1,  obtained  from  x + y + z=  l by  setting  z — 0.  Thus 
0 ^ x ^ 1 - y,  0 ^ y ^ 1. 


Fig.  246.  Portion  of  a plane  in  Example  2 
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THEOREM  1 


EXAMPLE  3 


By  inspection  or  by  differentiation. 

N = rM  x r„  = [1,0,  -1]  x [0,  1,  -1]  = [1.  1,  1], 
Hence  F(S)  • N = [uz,  0,  3v2}  • [1,  1,  1]  = u2  + 3v2.  By  (3), 


F • n clA  = I I (m  + 3ir)  du  dv  = \ | (u  + 3v  ) du  dv 

R 


o o 

lr  , 


-(1  - vf  + 3u2(l  - v ) 


dv  = 


Orientation  of  Surfaces 

From  (3)  or  (4)  we  see  that  the  value  of  the  integral  depends  on  the  choice  of  the  unit 
normal  vector  n.  (Instead  of  n we  could  choose  — n.)  We  express  this  by  saying  that  such 
an  integral  is  an  integral  over  an  oriented  surface  S,  that  is,  over  a surface  S on  which 
we  have  chosen  one  of  the  two  possible  unit  normal  vectors  in  a continuous  fashion.  (For 
a piecewise  smooth  surface,  this  needs  some  further  discussion,  which  we  give  below.) 
If  we  change  the  orientation  of  S,  this  means  that  we  replace  n with  — n.  Then  each 
component  of  n in  (4)  is  multiplied  by  — 1 , so  that  we  have 


Change  of  Orientation  in  a Surface  Integral 

The  replacement  of  n by  — n ( hence  of  N by  — N)  corresponds  to  the  multiplication 
of  the  integral  in  (3)  or  (4)  by  — 1. 


In  practice,  how  do  we  make  such  a change  of  N happen,  if  S is  given  in  the  form  (1)? 
The  easiest  way  is  to  interchange  u and  v,  because  then  ru  becomes  r„  and  conversely, 
so  that  N = ru  X r„  becomes  r,,  x ru  = —ru  X rv  = — N,  as  wanted.  Let  us  illustrate 
this. 


Change  of  Orientation  in  a Surface  Integral 

In  Example  1 we  now  represent  S by  r = [v,  v2,  u],  0 = it  = 2,  0 = m = 3.  Then 
N = r„  x r„  = [0,  0,  1]  x [1,  2v,  0]  = [ -2v , 1,  0], 

For  F = [3z2,  6,  6 xz]  we  now  get  F (S)  = [3 u2,  6,  6 uv].  Hence  F(S)  • N = —6 u2v  + 6 and  integration  gives  the 
old  result  times  — 1 . 

| | F(S)  • N dv  du  = f f (— 6u2v  + 6)  dv  du  = f (— 12«2  + 12)  du  = —72. 

V u>  'v  u> 


Orientation  of  Smooth  Surfaces 

A smooth  surface  S (see  Sec.  10.5)  is  called  orientable  if  the  positive  normal  direction, 
when  given  at  an  arbitrary  point  P0  of  .S',  can  be  continued  in  a unique  and  continuous 
way  to  the  entire  surface.  In  many  practical  applications,  the  surfaces  are  smooth  and  thus 
orientable. 
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(b)  Piecewise  smooth  surface 

Fig.  247.  Orientation  of  a surface 


Orientation  of  Piecewise  Smooth  Surfaces 

Here  the  following  idea  will  do  it.  For  a smooth  orientable  surface  S with  boundary  curve 
C we  may  associate  with  each  of  the  two  possible  orientations  of  S an  orientation  of  C, 
as  shown  in  Fig.  247a.  Then  a piecewise  smooth  surface  is  called  orientable  if  we  can 
orient  each  smooth  piece  of  S so  that  along  each  curve  C*  which  is  a common  boundary 
of  two  pieces  Si  and  S2  the  positive  direction  of  C*  relative  to  Sj  is  opposite  to  the 
direction  of  C*  relative  to  S2 ■ See  Fig.  247b  for  two  adjacent  pieces;  note  the  arrows 
along  C*. 

Theory:  Nonorientable  Surfaces 

A sufficiently  small  piece  of  a smooth  surface  is  always  orientable.  This  may  not  hold  for 
entire  surfaces.  A well-known  example  is  the  Mobius  strip,5  shown  in  Fig.  248.  To  make 
a model,  take  the  rectangular  paper  in  Fig.  248,  make  a half-twist,  and  join  the  short  sides 
together  so  that  A goes  onto  A,  and  B onto  B.  At  P0  take  a normal  vector  pointing,  say, 
to  the  left.  Displace  it  along  C to  the  right  (in  the  lower  part  of  the  figure)  around  the  strip 
until  you  return  to  P0  and  see  that  you  get  a normal  vector  pointing  to  the  right,  opposite 
to  the  given  one.  See  also  Prob.  17. 


5AUGUST  FERDINAND  MOBIUS  (1790-1868),  German  mathematician,  student  of  Gauss,  known  for  his 
work  in  surface  theory,  geometry,  and  complex  analysis  (see  Sec.  17.2). 
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EXAMPLE  4 


EXAMPLE  5 


Surface  Integrals  Without  Regard  to  Orientation 

Another  type  of  surface  integral  is 


(6) 


G (r)  dA 


G(r(n,  u))|N(n,  u)|  du  dv. 
J . 

R 


Here  dA  = N | du  dv  = \ru  X r,,  | du  dv  is  the  element  of  area  of  the  surface  S represented 
by  (1)  and  we  disregard  the  orientation. 

We  shall  need  later  (in  Sec.  10.9)  the  mean  value  theorem  for  surface  integrals,  which 
states  that  if  R in  (6)  is  simply  connected  (see  Sec.  10.2)  and  G(r)  is  continuous  in  a 
domain  containing  R , then  there  is  a point  (m0,  Uq)  in  R such  that 


(7) 


G(r)dA  = G(r(«0,  Uo)M  (A  = Area  of  S). 


As  for  applications,  if  G(r)  is  the  mass  density  of  S , then  (6)  is  the  total  mass  of  S.  If 
G = 1,  then  (6)  gives  the  area  A(S)  of  S, 


(8) 


A(S) 


dA  = 
J . 


\ru  x rj  du  dv. 
J . 


S R 

Examples  4 and  5 show  how  to  apply  (8)  to  a sphere  and  a torus.  The  final  example, 
Example  6,  explains  how  to  calculate  moments  of  inertia  for  a surface. 

Area  of  a Sphere 

For  a sphere  r(u,  v)  = [a  cos  v cos  u,  a cos  v sin  u,  a sin  v],  0 S u S 277,  — 7t/2  Sd  £ 7t/2  [see  (3) 

in  Sec.  10.5].  we  obtain  by  direct  calculation  (verify!) 

o n op.  p 

ru  x ru  = cos  v cos  u,  a cos  v sin  u,  a cos  v sin  u]. 

Using  cos2  u + sin2  u = 1 and  then  cos2  v + sin2  v = 1,  we  obtain 

\ru  x rj  = a2(cos4  v cos2  u + cos4  v sin2  u + cos2  v sin2  v)1^2  = «2|cos  u|. 

With  this,  (8)  gives  the  familiar  formula  (note  that  |cos  y|  = cos  v when  — 77/2  ^ v = 77/2) 


A(S)  = a2 


rn/2 

v\du  dv  = 277 ci2  cos  v dv  = 47Ta2. 

-77-/2 


Torus  Surface  (Doughnut  Surface):  Representation  and  Area 

A torus  surface  S is  obtained  by  rotating  a circle  C about  a straight  line  L in  space  so  that  C does  not  intersect 
or  touch  L but  its  plane  always  passes  through  L.  If  L is  the  z-axis  and  C has  radius  b and  its  center  has  distance 
a (>  b ) from  L,  as  in  Fig.  249,  then  S can  be  represented  by 

r (w,  v)  = (a  + b cos  v)  cos  u i + (a  + b cos  v)  sin  u j + b sin  u k 

where  0 ^ u ^ 277,  0 ^ v ^ 27 7.  Thus 

ru  = — ( a + b cos  v)  sin  u i + (a  + b cos  v)  cos  u j 

rv  = —b  sin  v cos  u i — b sin  v sin  u j + b cos  v k 

vu  x ru  = b(a  + b cos  u)(cos  u cos  i;  i + sin  u cos  v j + sin  v k). 
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EXAMPLE  6 


Hence  |ru  X r„|  = b(a  + b cos  v),  and  (8)  gives  the  total  area  of  the  torus, 

i2-7T  r 2TT 

I b(a  + b cos  v ) du  dv  = 4rr2ab. 
o J o 


Fig.  249.  Torus  in  Example  5 

Moment  of  Inertia  of  a Surface 

Find  the  moment  of  inertia  I of  a spherical  lamina  S:  = x2  + y2  + z2  = a2  of  constant  mass  density  and  total 
mass  M about  the  z-axis. 

Solution.  If  a mass  is  distributed  over  a surface  S and  fx(x , y,  z)  is  the  density  of  the  mass  (=  mass  per  unit 
area),  then  the  moment  of  inertia  / of  the  mass  with  respect  to  a given  axis  L is  defined  by  the  surface  integral 


(10) 


/ = j j /jlD  dA 
s 


where  D(x,  y,  z ) is  the  distance  of  the  point  ( x , y,  z)  from  L.  Since,  in  the  present  example,  /x,  is  constant  and  S 
has  the  area  A = 4irci2,  we  have  /x,  = M/A  — M/{4ttci2). 

2 2 2 2 2 

For  S we  use  the  same  representation  as  in  Example  4.  Then  D = x + y = a cos  v.  Also,  as  in  that  example, 
dA  = a2  cos  vdu  dv.  This  gives  the  following  result.  [In  the  integration,  use  cos3  v = cos  v (1  — sin2  V ).  \ 


I = juzr  dA  = 


M 


4-770  J_ 


77/2  r 277 


a4  cos3  v du  dv  = ■ 


Ma 


2 fir/2 


ir/2  J0 


cos3  v dv  = - 


2 Ma2 


-17/2 


Representations  z = f(x,  _y).  If  a surface  S is  given  by  z = fix,  y),  then  setting  u = x, 
v = y,  r = [u,  v,f]  gives 

|N|  = \ru  x rj  = |[1,0,/J  x [0,  l,/j|  = \[-fu,  1] | = Vl  + ft  + f2v 

and,  since  fu  = fx,fv  = fy , formula  (6)  becomes 


(ID 


G (r)  dA 


G(x,  y,f{x,  y)) 

R 
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Here  R*  is  the  projection  of  S into  the  xy-plane  (Fig.  250)  and  the  normal  vector  N on  S 
points  up.  If  it  points  down,  the  integral  on  the  right  is  preceded  by  a minus  sign. 

From  (1 1)  with  G = 1 we  obtain  for  the  area  A (S)  of  .S':  z = f(x,  y)  the  formula 


(12) 


MS)  = 


+ 


(V\ 

\dy  J dxdy 


where  R*  is  the  projection  of  S into  the  xy-plane,  as  before. 


1-10 


FLUX  INTEGRALS  (3)  ( F • n dA 


Evaluate  the  integral  for  the  given  data.  Describe  the  kind 
of  surface.  Show  the  details  of  your  work. 

1.  F = [-x2,  y2,  0],  5:  r = [u,  v,  3u  — 2v], 

OSuS  1.5,  -2  S v £ 2 

2.  F = [ey,  ex,  1],  S:x  + y + z = 
z = 0 

3.  F = [0,  x,  0],  S:  x2+y2  + z2 
y 5 0,  z SO 

4.  F = [ ey , -ez,  ex],  S:  x2  + y2  = 
y a 0,  0 S z S 2 

5.  F = [x,  y,  z],  S\  r = [u  cos  v,  u 

0 S u S 4,  —7 t StSir 

6.  F = [coshy,  0,  sinhx],  S:  z.  = j 
0 S x S 1 

7.  F = [0,  sin  y,  cos  z],  S the  cyli 
0 S y S 7t/4  and  0 S z S y 

8.  F = [tanxy,  x,  y],  S:  y2  + z2  = 
y 5 0,  z SO 

9.  F = [0,  sinh  z,  coshx],  S:  x2  + z2  = 4, 

0 S x S 1/V2,  0 S y S 5,  cSO 

10.  F = [y2,  x2,  z4].  S:  z = Wx2  + y2,  0 S z S 8, 

y a 0 

11.  CAS  EXPERIMENT.  Flux  Integral.  Write  a pro- 
gram for  evaluating  surface  integrals  (3)  that  prints 
intermediate  results  (F,  F • N,  the  integral  over  one  of 


the  two  variables).  Can  you  obtain  experimentally  some 
rules  on  functions  and  surfaces  giving  integrals  that  can 
be  evaluated  by  the  usual  methods  of  calculus?  Make 
a list  of  positive  and  negative  results. 


12-16 


SURFACE  INTEGRALS  (6) 


G (r)  dA 


X 

IIV 

o 

V 

IIV 

o 

S 

Evaluate  these  integrals  for  the  following  data.  Indicate  the 

x a o, 

kind  of  surface.  Show  the  details. 

12.  G = cos  x + sin  x,  S the  portion  of  x + y + z = 1 

x a o. 

in  the  first  octant 
13.  G = x + y + z,  z 

= x + 2y,  0 S x S it, 

’,  W2], 

0 S y S x 

14.  G = ax  + by  + cz. 

S:  x2  + y2  + z2  = 1,  y = 0, 

H 

VII 

VII 

o 

<N  " 

z = 0 

15.  G = (1  + 9xz)3/2, 

S:  r = [u,  v,  u3],  0 S u S 1, 

' x = y2,  where 

— 2 Si)S2 
16.  G = arctan  (y/x), 

S:  z = x2  + y2,  1 S z S 9, 

2 S x S 5, 

x a 0,  y a 0 

17.  Fun  with  Mobius. 

Make  Mobius  strips  from  long  slim 

rectangles  R of  grid  paper  (graph  paper)  by  pasting  the 
short  sides  together  after  giving  the  paper  a half-twist. 
In  each  case  count  the  number  of  parts  obtained  by 
cutting  along  lines  parallel  to  the  edge,  (a)  Make  R three 
squares  wide  and  cut  until  you  reach  the  beginning, 
(b)  Make  R four  squares  wide.  Begin  cutting  one  square 
away  from  the  edge  until  you  reach  the  beginning.  Then 
cut  the  portion  that  is  still  two  squares  wide,  (c)  Make 
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R five  squares  wide  and  cut  similarly,  (d)  Make  R six 
squares  wide  and  cut.  Formulate  a conjecture  about  the 
number  of  parts  obtained. 

18.  Gauss  “Double  Ring”  (See  Mobius,  Works  2,  518- 
559).  Make  a paper  cross  (Fig.  251)  into  a “double  ring” 
by  joining  opposite  arms  along  their  outer  edges  (without 
twist),  one  ring  below  the  plane  of  the  cross  and  the  other 
above.  Show  experimentally  that  one  can  choose  any  four 
boundary  points  A,  B,  C,  D and  join  A and  C as  well  as 
B and  D by  two  nonintersecting  curves.  What  happens  if 
you  cut  along  the  two  curves?  If  you  make  a half-twist 
in  each  of  the  two  rings  and  then  cut?  (Cf.  E.  Kreyszig, 
Proc.  CSHPM  13  (2000),  23^13.) 


b 


Fig.  251.  Problem  18.  Gauss  “Double  Ring” 


APPLICATIONS 

19.  Center  of  gravity.  Justify  the  following  formulas  for 
the  mass  M and  the  center  of  gravity  ( x , y,  z)  of  a lamina 
S of  density  (mass  per  unit  area)  cr(x,  y,  z)  in  space: 


M = 


y = 


M 


ycrdA, 


xcrdA, 

J zcrdA. 


s s 

20.  Moments  of  inertia.  Justify  the  following  formulas 
for  the  moments  of  inertia  of  the  lamina  in  Prob.  19 
about  the  x-,  y-,  and  z-axes,  respectively: 


Ix  = (y  + z )crdA,  Iy  = (x*  + z>  dA 


4=||  (x2  + y2)o-  dA. 


21.  Find  a formula  for  the  moment  of  inertia  of  the  lamina 
in  Prob.  20  about  the  line  y = x,  z = 0. 


22-23 


Find  the  moment  of  inertia  of  a lamina  S of  density 


1 about  an  axis  B,  where 


22.  5:  x2  + y2  — 1,  0 S z S h,  B:  the  line  z = h/2  in 
the  xz-plane 

23.  5:  x2  + y2  = zZ,  0 S z S h,  B:  the  z-axis 

24.  Steiner’s  theorem.6  If  IB  is  the  moment  of  inertia  of 
a mass  distribution  of  total  mass  M with  respect  to  a line 
B through  the  center  of  gravity,  show  that  its  moment 
of  inertia  Ik  with  respect  to  a line  K,  which  is  parallel 
to  B and  has  the  distance  k from  it  is 


Ik  = Ib  + k2M. 


25.  Using  Steiner’s  theorem,  find  the  moment  of  inertia  of 
a mass  of  density  1 on  the  sphere  S:  x2  + y2  + z2  — 1 
about  the  line  K.x=  l,y  = 0 from  the  moment  of 
inertia  of  the  mass  about  a suitable  line  B,  which  you 
must  first  calculate. 

26.  TEAM  PROJECT.  First  Fundamental  Form  of  S. 

Given  a surface  S:  r (u,  v),  the  differential  form 

(13)  ds2  = Edu2  + 2 Fdu  dv  + G dv2 


with  coefficients  (in  standard  notation,  unrelated  to  F, 
G elsewhere  in  this  chapter) 

(14)  E = ru  • tu,  F = ru  • r„,  G = r„  • r„ 


is  called  the  first  fundamental  form  of  S.  This  form 
is  basic  because  it  permits  us  to  calculate  lengths, 
angles,  and  areas  on  S.  To  show  this  prove  (a)-(c): 

(a)  For  a curve  C:  u = u{t),v  = v (t),  a S t S b,  on 
5,  fonnulas  (10),  Sec.  9.5,  and  (14)  give  the  length 


/ = 


(15) 


Vr'  (f)  • r'  (?)  dt 
{ VEu'2  + 2 Fu'v'  + Gv'2dt. 


(b)  The  angle  y between  two  intersecting  curves 
Ci:  u = g(t),  v = h(t)  and  C2'.  u — p(t),v  = q(t)  on 
S:  r(u,  v)  is  obtained  from 

a*b 

(16)  cos  y = 

|a||b| 

where  a = r ug'  + r vh'  andb  = r up  + r vq  are  tan- 
gent vectors  of  C 1 and  C2. 


6JAC0B  STEINER  (1796-1863),  Swiss  geometer,  bom  in  a small  village,  learned  to  write  only  at  age  14, 
became  a pupil  of  Pestalozzi  at  1 8,  later  studied  at  Heidelberg  and  Berlin  and,  finally,  because  of  his  outstanding 
research,  was  appointed  professor  at  Berlin  University. 
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(c)  The  square  of  the  length  of  the  normal  vector  N 
can  be  written 

(17)  |N|2  = \ru  x r„|2  = EG  - F2, 

so  that  formula  (8)  for  the  area  A ( S ) of  S becomes 


R 

(d)  For  polar  coordinates  u (=  r)  and  v (=  6)  defined 
by  x = u cos  v,  y = u sin  v we  have  E = 1,  F — 0, 

G = u2,  so  that 

ds 2 = du2  + u2  dv2  = dr2  + r2  dd2. 

Calculate  from  this  and  (18)  the  area  of  a disk  of 
radius  a. 

10.  Triple  Integrals. 

Divergence  Theorem  of  Gauss 

In  this  section  we  discuss  another  “big”  integral  theorem,  the  divergence  theorem,  which 
transforms  surface  integrals  into  triple  integrals.  So  let  us  begin  with  a review  of  the 
latter. 

A triple  integral  is  an  integral  of  a function  fix,  y,  z)  taken  over  a closed  bounded, 
three-dimensional  region  T in  space.  (Note  that  “closed”  and  “bounded”  are  defined  in 
the  same  way  as  in  footnote  2 of  Sec.  10.3,  with  “sphere”  substituted  for  “circle”).  We 
subdivide  T by  planes  parallel  to  the  coordinate  planes.  Then  we  consider  those  boxes  of 
the  subdivision  that  lie  entirely  inside  T,  and  number  them  from  1 to  n.  Here  each  box 
consists  of  a rectangular  parallelepiped.  In  each  such  box  we  choose  an  arbitrary  point, 
say,  (xfc,  yk,  Zk ) in  box  k.  The  volume  of  box  k we  denote  by  A Vp-.  We  now  form  the  sum 

n 

Jn  = 2 /(*fc>  Tfc>  2 k)  A Vk- 
k= 1 

This  we  do  for  larger  and  larger  positive  integers  n arbitrarily  but  so  that  the  maximum 
length  of  all  the  edges  of  those  n boxes  approaches  zero  as  n approaches  infinity.  This 
gives  a sequence  of  real  numbers  Jnv  Jn.2,  • • • . We  assume  that  /iv,  y,  z)  is  continuous  in  a 
domain  containing  T,  and  T is  bounded  by  finitely  many  smooth  surfaces  (see  Sec.  10.5). 
Then  it  can  be  shown  (see  Ref.  [GenRef4]  in  App.  1)  that  the  sequence  converges  to 
a limit  that  is  independent  of  the  choice  of  subdivisions  and  corresponding  points 


(18) 


A(S)  = \\dA  = |N|  dudv 


= ||  VEG  - F2  du  dv. 


(e)  Find  the  first  fundamental  form  of  the  torus  in 
Example  5.  Use  it  to  calculate  the  area  A of  the  torus. 
Show  that  A can  also  be  obtained  by  the  theorem  of 
Pappus,7  which  states  that  the  area  of  a surface  of 
revolution  equals  the  product  of  the  length  of  a 
meridian  C and  the  length  of  the  path  of  the  center  of 
gravity  of  C when  C is  rotated  through  the  angle  27 r. 

(f ) Calculate  the  first  fundamental  form  for  the  usual 
representations  of  important  surfaces  of  your  own 
choice  (cylinder,  cone,  etc.)  and  apply  them  to  the 
calculation  of  lengths  and  areas  on  these  surfaces. 


7PAPPUS  OF  ALEXANDRIA  (about  A.D.  300),  Greek  mathematician.  The  theorem  is  also  called  Guidin’ s 
theorem.  HABAKUK  GULDIN  (1577-1643)  was  born  in  St.  Gallen,  Switzerland,  and  later  became  professor 
in  Graz  and  Vienna. 
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(xfa  Vfe,  Zfe).  This  limit  is  called  the  triple  integral  af  fix,  y,  z)  over  the  region  T and  is 
denoted  by 


fix,  y,  z ) dx  dy  dz  or  by 
J J . 

T 


fix,  y,  z)  dv. 
J J . 

T 


Triple  integrals  can  be  evaluated  by  three  successive  integrations.  This  is  similar  to  the 
evaluation  of  double  integrals  by  two  successive  integrations,  as  discussed  in  Sec.  10.3. 
Example  1 below  explains  this. 


Divergence  Theorem  of  Gauss 

Triple  integrals  can  be  transformed  into  surface  integrals  over  the  boundary  surface  of  a 
region  in  space  and  conversely.  Such  a transformation  is  of  practical  interest  because  one 
of  the  two  kinds  of  integral  is  often  simpler  than  the  other.  It  also  helps  in  establishing 
fundamental  equations  in  fluid  flow,  heat  conduction,  etc.,  as  we  shall  see.  The  transformation 
is  done  by  the  divergence  theorem,  which  involves  the  divergence  of  a vector  function 
F = [Fi,  F2,  F3]  = F{\  + F2\  + F3k,  namely. 


(1) 


div  F 


i)F\  d F2  d F3 

+ 4- 

dx  dy  dz 


(Sec.  9.8). 


THEOREM  1 


Divergence  Theorem  of  Gauss 

(Transformation  Between  Triple  and  Surface  Integrals) 

Let  T be  a closed  bounded  region  in  space  whose  boundary  is  a piecewise  smooth 
orientable  surface  S.  Let  F(x,  y,  z)  be  a vector  function  that  is  continuous  and  has 
continuous  first  partial  derivatives  in  some  domain  containing  T.  Then 


(2) 


div  F d V 
J J . 

T 


F • n dA. 

J . 
s 


In  components  of  F = [Fi,  F2,  F3]  and  of  the  outer  unit  normal  vector 

n = [cos  a,  cos  [3,  cos  y]  of  S (as  in  Fig.  253),  formula  (2)  becomes 


(dJ± 

\ dx 


dF2  dF3 
dy  dz 


dx  dy  dz 


(2*) 


(Fj  cos  a + F2  cos  [3  + F3  cos  y)  dA 


J 

s 


(f-\  dy  dz  + F2  dz  dx  + F3  dx  dy). 
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Fig.  252.  Surface  S 
in  Example  1 


PROOF 


“Closed  bounded  region”  is  explained  above,  “piecewise  smooth  orientable”  in  Sec.  10.5, 
and  “domain  containing  T”  in  footnote  4,  Sec.  10.4,  for  the  two-dimensional  case. 
Before  we  prove  the  theorem,  let  us  show  a standard  application. 


Evaluation  of  a Surface  Integral  by  the  Divergence  Theorem 

Before  we  prove  the  theorem,  let  us  show  a typical  application.  Evaluate 

I = ||  (x3  dy  dz  + x2y  dzdx  + x2zdx  dy) 
s 

where  S is  the  closed  surface  in  Fig.  252  consisting  of  the  cylinder  x2  + y2  = a2  (0  ^ z = b)  and  the  circular 
disks  z — 0 and  z — b {x2  + y2  ^ a2). 

Solution.  F1  = x 3,  F2  — x*y.  F%  = A.  Hence  div  F = 3x2  + x2  + x2  — 5x2.  The  form  of  the  surface 
suggests  that  we  introduce  polar  coordinates  r,  0 defined  by  x = r cos  6,y  = r sin  0 (thus  cylindrical  coordinates 
r,  0,  z).  Then  the  volume  element  is  dx  dy  dz  = r dr  dd  dz,  and  we  obtain 

r b r 2-77  r a 

5x2  dx  dy  dz  = (5 r2  cos2  0)  r dr  dd  dz 

Jz=0Je=oJr=o 


= 5 


2-77  4 

— cos2  Odd  dz  = 5 
'z- oJe=o  ^ 


b 4 

a 77 


dz  = 


577 


orb. 


We  prove  the  divergence  theorem,  beginning  with  the  first  equation  in  (2*).  This 
equation  is  true  if  and  only  if  the  integrals  of  each  component  on  both  sides  are  equal; 
that  is, 


(3) 

(4) 

(5) 


()F  \ 

dx  dy  dz 

JJJ  dx 

T 


Fi  cos  a dA, 
J . 
s 


dF2 

dx  dy  dz 

JJJ  dy 

T 


F2  cos  dA, 
J . 
s 


d/*3 

dx  dy  dz 

JJJ  dz 

T 


F3  cos  y dA. 
J . 

S 


We  first  prove  (5)  for  a special  region  T that  is  bounded  by  a piecewise  smooth 
orientable  surface  S and  has  the  property  that  any  straight  line  parallel  to  any  one  of  the 
coordinate  axes  and  intersecting  T has  at  most  one  segment  (or  a single  point)  in  common 
with  T.  This  implies  that  T can  be  represented  in  the  form 


(6)  g (x,  y)  = z = h(x,  y) 

where  (x,  y)  varies  in  the  orthogonal  projection  R of  Tin  the  xv-plane.  Clearly,  z — g(x,  y) 
represents  the  “bottom”  S2  of  S (Fig.  253),  whereas  z = h(x,  y)  represents  the  “top”  .Sj  of 
S,  and  there  may  be  a remaining  vertical  portion  S3  of  S.  (The  portion  S3  may  degenerate 
into  a curve,  as  for  a sphere.) 
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To  prove  (5),  we  use  (6).  Since  F is  continuously  differentiable  in  some  domain  containing 
T,  we  have 


(7) 


37*3 

dx  dy  dz 

JJJ  dz 

T 


r 

J . 

R 

h(.x,  y) 


Jg(x,y) 


dF3 

dz 


dz 


dx  dy. 


Integration  of  the  inner  integral  [ ■ • • ] gives  F3[x,  y,  h (x,  y)]  — F3[x,  y,  g(x,  y)].  Hence  the 
triple  integral  in  (7)  equals 


(8) 


F3[x,  y,  h (x,  y)]  dx  dy 

R 


F?J[x,  y,  g (x,  y)]  dx  dy. 

R 


Fig.  253.  Example  of  a special  region 


But  the  same  result  is  also  obtained  by  evaluating  the  right  side  of  (5);  that  is  [see  also 
the  last  line  of  (2*)], 


F3 cos  y dA 
J . 

S 


F3  dx  dy 
J . 
s 


= + 


F3[x,  y,  h ( x , y)]  dx  dy 
J_. 

R 


F3[x,  y,  g(x,  y)]  dx  dy, 

R 


where  the  first  integral  over  R gets  a plus  sign  because  cos  y > 0 on  Si  in  Fig.  253  [as 
in  ^ ),  Sec.  10.6],  and  the  second  integral  gets  a minus  sign  because  cos  y < 0 on 
This  proves  (5). 

The  relations  (3)  and  (4)  now  follow  by  merely  relabeling  the  variables  and  using  the 
fact  that,  by  assumption,  T has  representations  similar  to  (6),  namely, 

g(y,z)  g x g h (y,  z)  and  g(z,  x)  ^ y ^ h(z,x). 


This  proves  the  first  equation  in  (2*)  for  special  regions.  It  implies  (2)  because  the  left  side 
of  (2*)  is  just  the  definition  of  the  divergence,  and  the  right  sides  of  (2)  and  of  the  first 
equation  in  (2*)  are  equal,  as  was  shown  in  the  first  line  of  (4)  in  the  last  section.  Finally, 
equality  of  the  right  sides  of  (2)  and  (2*),  last  line,  is  seen  from  (5)  in  the  last  section. 

This  establishes  the  divergence  theorem  for  special  regions. 
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For  any  region  T that  can  be  subdivided  into  finitely  many  special  regions  by  means  of 
auxiliary  surfaces,  the  theorem  follows  by  adding  the  result  for  each  part  separately.  This 
procedure  is  analogous  to  that  in  the  proof  of  Green’s  theorem  in  Sec.  10.4.  The  surface 
integrals  over  the  auxiliary  surfaces  cancel  in  pairs,  and  the  sum  of  the  remaining  surface 
integrals  is  the  surface  integral  over  the  whole  boundary  surface  S of  T;  the  triple  integrals 
over  the  parts  of  T add  up  to  the  triple  integral  over  T. 

The  divergence  theorem  is  now  proved  for  any  bounded  region  that  is  of  interest  in 
practical  problems.  The  extension  to  a most  general  region  T of  the  type  indicated  in  the 
theorem  would  require  a certain  limit  process;  this  is  similar  to  the  situation  in  the  case 
of  Green’s  theorem  in  Sec.  10.4. 

Verification  of  the  Divergence  Theorem 

Evaluate  f|  (7xi  — jk)  • n dA  over  the  sphere  S:  x2  + y2  + z2  = 4 (a)  by  (2),  (b)  directly. 

s 

Solution,  (a)  div  F = div  [lx,  0,  — z]  = div  [7.vi  — zk]  = 7 — 1=6.  Answer:  6 ■ (|)7T  ■ 23  = 6477. 

(b)  We  can  represent  S by  (3),  Sec.  10.5  (with  a = 2),  and  we  shall  use  n dA  = N du  dv  [see  (3*),  Sec.  10.6], 
Accordingly, 


S : r = [2  cos  v cos  u,  2 cos  v sin  u,  2 sin  u | 

Then  ru  = [—2  cos  v sin  u,  2 cos  v cos  u,  0] 

r„  = [—2  sin  v cos  u,  —2  sin  v sin  u,  2 cos  y] 

N = r,  X r„  = [4  cos2  v cos  u,  4 cos2  v sin  u,  4 cos  y sin  v]. 

Now  on  S we  have  x = 2 cos  v cos  u,  z = 2 sin  v,  so  that  F = [lx,  0,  — z]  becomes  on  S 

F(S)  = [14  cos  v cos  u,  0,  — 2 sin  y] 

and  F(S)  • N = (14  cos  y cos  u)  ■ 4 cos2  y cos  u + (—2  sin  y)  ■ 4 cos  y sin  y 

= 56  cos3  y cos2  u — 8 cos  y sin2  y. 

On  S we  have  to  integrate  over  u from  0 to  277.  This  gives 

77  ■ 56  cos3  y — 277  • 8 cos  y sin2  y. 

The  integral  of  cos  y sin2  v equals  (sin3  y)/3,  and  that  of  cos3  y = cos  y (1  — sin2  y)  equals  sin  y — (sin3  y)/3. 

On  S we  have  —77/2  S y 77/2,  so  that  by  substituting  these  limits  we  get 

5677(2  - |)  - 1677  • | = 6477 

as  hoped  for.  To  see  the  point  of  Gauss’s  theorem,  compare  the  amounts  of  work. 


Coordinate  Invariance  of  the  Divergence.  The  divergence  (1)  is  defined  in  terms  of 
coordinates,  but  we  can  use  the  divergence  theorem  to  show  that  div  F has  a meaning 
independent  of  coordinates. 

For  this  purpose  we  first  note  that  triple  integrals  have  properties  quite  similar  to  those 
of  double  integrals  in  Sec.  10.3.  In  particular,  the  mean  value  theorem  for  triple  integrals 
asserts  that  for  any  continuous  function /(.r,  y,  z)  in  a bounded  and  simply  connected  region 
T there  is  a point  Q .(x 0,  yo,  zq)  in  T such  that 


(9) 


fix,  y,  z)  dV  = f(x0,  yo,  z0)  V{T)  (V(T)  = volume  of  T). 
J J . 

T 
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In  this  formula  we  interchange  the  two  sides,  divide  by  V(T),  and  set  / = div  F.  Then  by 
the  divergence  theorem  we  obtain  for  the  divergence  an  integral  over  the  boundary  surface 
S(T)  of  T, 


(10) 


div  F(x0,  To,  zQ)  = 


V(T)  J 


div  F dV  = 


V(T) 


F • n dA. 


SCT) 


We  now  choose  a point  z{)  in  T and  let  T shrink  down  onto  P so  that  the 

maximum  distance  d(T)  of  the  points  of  T from  P goes  to  zero.  Then  Q:  (x0,  yo,  Zo)  must 
approach  P.  Hence  (10)  becomes 


(ID 


This  proves 


div  F (P) 


lim  

d(T)—*0  V(T)  . 


SCO 


F • n dA. 


THEOREM  2 


Invariance  of  the  Divergence 

The  divergence  of  a vector  function  F with  continuous  first  partial  derivatives  in  a 
region  T is  independent  of  the  particular  choice  of  Cartesian  coordinates.  For  any 
P in  T it  is  given  by  (11). 


Equation  (11)  is  sometimes  used  as  a definition  of  the  divergence.  Then  the  representation  (1) 
in  Cartesian  coordinates  can  be  derived  from  (11). 

Further  applications  of  the  divergence  theorem  follow  in  the  problem  set  and  in  the 
next  section.  The  examples  in  the  next  section  will  also  shed  further  light  on  the  nature 
of  the  divergence. 


PROBLEM  SET  1017 


1-8 


APPLICATION:  MASS  DISTRIBUTION 


Find  the  total  mass  of  a mass  distribution  of  density  cr  in 
a region  T in  space. 

1.  cr  = x2  + y2  + z2,  T the  box  \x\  fi  4,  |y|  fi  1, 

0 fi  z fi  2 


2.  cr  = xyz,  T the  box  0 fi  x £ a,  OSyS  b, 

0 £ z fi  c 

3.  cr  = e~x~v~z,  T:  0 £ x £ 1 - y,  OSySl, 

0 fi  z fi  2 

4.  cr  as  in  Prob.  3,  T the  tetrahedron  with  vertices  (0,  0,  0), 
(3,  0,  0),  (0,  3,  0),  (0,  0,  3) 

5.  cr  = sin  2x  cos  2 y,  T : 0 £ x £ 577, 

477- — x S y £ 577-,  0£z£6 

6.  cr  = x2y2z2,  T the  cylindrical  region  x2  + z2  £ 16, 
lyl  s 4 

7.  cr  = arctan  ( y/x ),  T:  x2  + yz  + z2  £ a2,  z = 0 

8.  cr  = x2  + v2,  T as  in  Prob.  7 


9-18 


APPLICATION 

OF  THE  DIVERGENCE  THEOREM 


Evaluate  the  surface  integral  F • n dA  by  the  divergence 

■'■'s 

theorem.  Show  the  details. 

9.  F = [x2,  0,  z2],  S the  surface  of  the  box  \x\  £ 1, 
|y|  £ 3,  0 £ z £ 2 


10.  Solve  Prob.  9 by  direct  integration. 

11.  F = [ex,  ey,  ez\,  S the  surface  of  the  cube  \x\  £ 1, 

|v|  fi  1,  Id  £ 1 

12.  F = [x3  - y3,  y3  - z3,  z3  - x3],  5 the  surface  of 

x2  + y2  + z2  fi  25,  z a 0 

13.  F = [sin  y,  cos  x,  cos  z],  S,  the  surface  of 

x2  + y2  £ 4,  |z|  £ 2 (a  cylinder  and  two  disks!) 

14.  F as  in  Prob.  13,  S the  surface  of  x2  + y2  £ 9, 
0 fi  z £ 2 
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15.  F = [ 2x 2 , \ y2,  sin  7Tz\,  S the  surface  of  the  tetrahe- 
dron with  vertices  (0,  0,  0),  (1,0,  0),  (0,  1,  0),  (0,  0,  1) 


19.  The  box  —a  £ x £ a,  —b  £ y £ b,  —c  £ z£  c 

20.  The  ball  x2  + y2  + zZ  & a2 

21.  The  cylinder  y2  + 2 £ a2,  0 S x £ h 

22.  The  paraboloid  y2  + z2  £ x,  OSrS/i 

23.  The  cone  y2  + z2  & x2,  OSrS  h 

24.  Why  is  Ix  in  Prob.  23  for  large  h larger  than  Ix  in  Prob. 


16.  F = [cosh  jc,  z,  y],  S as  in  Prob.  15 


17.  F = [x2,  y2,  z2],  S the  surface  of  the  cone  x2  + y2  £ z2, 
0 £ z £ h 


18.  F = [xy,  yz,  zx],  S the  surface  of  the  cone  x2  + y2 
£ 4 z2,  0£z£2 


22  (and  the  same  h)2  Why  is  it  smaller  for  h — 1?  Give 
physical  reason. 


19-23  APPLICATION:  MOMENT  OF  INERTIA 


Given  a mass  of  density  1 in  a region  T of  space,  find  the 
moment  of  intertia  about  the  jc-axis 


25.  Show  that  for  a solid  of  revolution,  1X  = r4  (x)  dx. 


Solve  Probs.  20-23  by  this  formula. 

(y2  + z2)  dx  dy  dz. 


The  divergence  theorem  has  many  important  applications:  \n  fluid  flow,  it  helps  characterize 
sources  and  sinks  of  fluids.  In  heat  flow,  it  leads  to  the  heat  equation.  In  potential  theory, 
it  gives  properties  of  the  solutions  of  Laplace’s  equation.  In  this  section,  we  assume  that 
the  region  T and  its  boundary  surface  S are  such  that  the  divergence  theorem  applies. 


purpose  we  consider  the  flow  of  an  incompressible  fluid  (see  Sec.  9.8)  of  constant  density  p = 1 which  is  steady, 
that  is,  does  not  vary  with  time.  Such  a flow  is  determined  by  the  field  of  its  velocity  vector  v (P)  at  any  point  P. 

Let  S be  the  boundary  surface  of  a region  T in  space,  and  let  n be  the  outer  unit  normal  vector  of  S.  Then 
v • n is  the  normal  component  of  v in  the  direction  of  n,  and  | v • n dA  \ is  the  mass  of  fluid  leaving  T (if  v • n > 0 
at  some  P)  or  entering  T (if  v • n < 0 at  P)  per  unit  time  at  some  point  P of  S through  a small  portion  AS  of 
S of  area  A A.  Hence  the  total  mass  of  fluid  that  flows  across  S from  T to  the  outside  per  unit  time  is  given  by 
the  surface  integral 


Since  the  flow  is  steady  and  the  fluid  is  incompressible,  the  amount  of  fluid  flowing  outward  must  be  continuously 
supplied.  Hence,  if  the  value  of  the  integral  (1)  is  different  from  zero,  there  must  be  sources  (positive  sources 
and  negative  sources,  called  sinks)  in  T,  that  is,  points  where  fluid  is  produced  or  disappears. 

If  we  let  T shrink  down  to  a fixed  point  P in  T,  we  obtain  from  (1)  the  source  intensity  at  P given  by  the 
right  side  of  (11)  in  the  last  section  with  F • n replaced  by  v • n,  that  is. 


T 


10.8  Further 
of  the 


Theorem 


Fluid  Flow.  Physical  Interpretation  of  the  Divergence 


From  the  divergence  theorem  we  may  obtain  an  intuitive  interpretation  of  the  divergence  of  a vector.  For  this 


v • n dA. 


s 


Division  by  the  volume  V of  T gives  the  average  flow  out  of  T: 


(1) 


s 


(2) 


SCO 
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Hence  the  divergence  of  the  velocity  vector  v of  a steady  incompressible  flow  is  the  source  intensity  of  the  flow 
at  the  corresponding  point. 

There  are  no  sources  in  T if  and  only  if  div  v is  zero  everywhere  in  T.  Then  for  any  closed  surface  S in  T we  have 

1 1 v • n dA  = 0. 
s 

Modeling  of  Heat  Flow.  Heat  or  Diffusion  Equation 

Physical  experiments  show  that  in  a body,  heat  flows  in  the  direction  of  decreasing  temperature,  and  the  rate  of 
flow  is  proportional  to  the  gradient  of  the  temperature.  This  means  that  the  velocity  v of  the  heat  flow  in  a body 
is  of  the  form 

(3)  v = -K  grad  U 

where  U (x,  y,  z,  t ) is  temperature,  t is  time,  and  K is  called  the  thermal  conductivity  of  the  body;  in  ordinary 
physical  circumstances  K is  a constant.  Using  this  information,  set  up  the  mathematical  model  of  heat  flow,  the 
so-called  heat  equation  or  diffusion  equation. 

Solution.  Let  T be  a region  in  the  body  bounded  by  a surface  S with  outer  unit  normal  vector  n such  that 
the  divergence  theorem  applies.  Then  v • n is  the  component  of  v in  the  direction  of  n,  and  the  amount  of  heat 
leaving  T per  unit  time  is 


1 1 v • n dA. 
s 

This  expression  is  obtained  similarly  to  the  corresponding  surface  integral  in  the  last  example.  Using 

div  (grad  U)  = V2U  = Uxx  + Uyy  + Uzz 
(the  Laplacian;  see  (3)  in  Sec.  9.8),  we  have  by  the  divergence  theorem  and  (3) 


(4) 


v • n dA  = — j j j div  (grad  U)  dx  dy  dz 
s'  T 

= —K  | J J V2  U dx  dy  dz. 

T 


On  the  other  hand,  the  total  amount  of  heat  H in  T is 


H = | | j (ipU  dx  dy  dz 
T 


where  the  constant  a is  the  specific  heat  of  the  material  of  the  body  and  p is  the  density  (=  mass  per  unit  volume) 
of  the  material.  Hence  the  time  rate  of  decrease  of  H is 


dH 

dt 


dU 

or  p -7—  dx  dy  dz 
dt 


and  this  must  be  equal  to  the  above  amount  of  heat  leaving  T.  From  (4)  we  thus  have 


a p dx  dy  dz  — ~K  | | | Vz  U dx  dy  dz 


a p — — K V2t/  )dx  dy  dz  — 0. 
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Since  this  holds  for  any  region  T in  the  body,  the  integrand  (if  continuous)  must  be  zero  everywhere;  that  is, 


(5) 


SU  o o 
— = c2V2U 
dt 


K 

ap 


where  c2  is  called  the  thermal  dijfusivity  of  the  material.  This  partial  differential  equation  is  called  the  heat 
equation.  It  is  the  fundamental  equation  for  heat  conduction.  And  our  derivation  is  another  impressive 
demonstration  of  the  great  importance  of  the  divergence  theorem.  Methods  for  solving  heat  problems  will  be 
shown  in  Chap.  12. 

The  heat  equation  is  also  called  the  diffusion  equation  because  it  also  models  diffusion  processes  of  motions 
of  molecules  tending  to  level  off  differences  in  density  or  pressure  in  gases  or  liquids. 

If  heat  flow  does  not  depend  on  time,  it  is  called  steady-state  heat  flow.  Then  dU/dt  = 0,  so  that  (5)  reduces 
to  Laplace’s  equation  V2U  = 0.  We  met  this  equation  in  Secs.  9.7  and  9.8,  and  we  shall  now  see  that  the 
divergence  theorem  adds  basic  insights  into  the  nature  of  solutions  of  this  equation. 


Potential  Theory.  Harmonic  Functions 

The  theory  of  solutions  of  Laplace’s  equation 


(6) 


V2/ 


dzf  d2f  d2f 

1 1 

->2  -.2  2 

dx  dy  dz 


is  called  potential  theory.  A solution  of  (6)  with  continuous  second-order  partial  derivatives 
is  called  a harmonic  function.  That  continuity  is  needed  for  application  of  the  divergence 
theorem  in  potential  theory,  where  the  theorem  plays  a key  role  that  we  want  to  explore. 
Further  details  of  potential  theory  follow  in  Chaps.  12  and  18. 

A Basic  Property  of  Solutions  of  Laplace’s  Equation 

The  integrands  in  the  divergence  theorem  are  div  F and  F • n (Sec.  10.7).  If  F is  the  gradient  of  a scalar  function, 
say,  F = grad/,  then  div  F = div  (grad/)  = V2/;  see  (3),  Sec.  9.8.  Also,  F • n = n • F = n • grad/.  This  is  the 
directional  derivative  of/in  the  outer  normal  direction  of  S,  the  boundary  surface  of  the  region  T in  the  theorem. 
This  derivative  is  called  the  (outer)  normal  derivative  of/  and  is  denoted  by  df/dn.  Thus  the  formula  in  the 
divergence  theorem  becomes 


(7) 


r/A. 


This  is  the  three-dimensional  analog  of  (9)  in  Sec.  10.4.  Because  of  the  assumptions  in  the  divergence  theorem 
this  gives  the  following  result. 


A Basic  Property  of  Harmonic  Functions 

Let  f(x,  y,  z ) be  a harmonic  function  in  some  domain  D is  space.  Let  S be  any 
piecewise  smooth  closed  orientable  surface  in  D whose  entire  region  it  encloses 
belongs  to  D.  Then  the  integral  of  the  normal  derivative  off  taken  over  S is  zero. 
(For  “piecewise  smooth”  see  Sec.  10.5.) 
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EXAMPLE  4 


EXAMPLE  5 


Green’s  Theorems 

Let  / and  g be  scalar  functions  such  that  F = f grad  g satisfies  the  assumptions  of  the  divergence  theorem  in 
some  region  T.  Then 


div  F = div  (/grad  g) 


= div 


dg  dg  dg“|  \ 

dx  dy  3zJ  / 


dg  ^g\  M ag  /af  ag  3j?\ 

\3*  dx  + ^dx2)  + U.V  dy  + ^ dy2)  + \3z  dz  + ^ dz2) 


= /V2g  + grad  / • grad  g. 


Also,  since  / is  a scalar  function. 


F • n — n • F 


= n • (/  grad  g) 

= (n  • grad  g)f 

Now  n • grad  g is  the  directional  derivative  dg/ dn  of  g in  the  outer  normal  direction  of  S.  Hence  the  formula  in 
the  divergence  theorem  becomes  “Green’s  first  formula” 


(8) 


III (/V2g  + grad/-  grad  g)  dV  = |/^  dA. 

T S 


Formula  (8)  together  with  the  assumptions  is  known  as  the  first  form  of  Green’s  theorem. 
Interchanging  / and  g we  obtain  a similar  formula.  Subtracting  this  formula  from  (8)  we  find 


(9) 


This  formula  is  called  Green’s  second  formula  or  (together  with  the  assumptions)  the  second  form  of  Green’s 
theorem. 

Uniqueness  of  Solutions  of  Laplace’s  Equation 

Let /be  harmonic  in  a domain  D and  let /be  zero  everywhere  on  a piecewise  smooth  closed  orientable  surface 
S in  D whose  entire  region  T it  encloses  belongs  to  D.  Then  \g  is  zero  in  T,  and  the  surface  integral  in  (8)  is 
zero,  so  that  (8)  with  g = f gives 


grad/*  grad fdV  = f ||  |grad/|2dV  = 0. 


Since/is  harmonic,  grad/ and  thus  |grad/|  are  continuous  in  T and  on  S,  and  since  |grad/|  is  nonnegative, 
to  make  the  integral  over  T zero,  grad  / must  be  the  zero  vector  everywhere  in  T.  Hence  fx=fy=fz~ 
and  / is  constant  in  T and,  because  of  continuity,  it  is  equal  to  its  value  0 on  S.  This  proves  the  following 
theorem. 
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Let  f{x,  y,  z)  be  harmonic  in  some  domain  D and  zero  at  every  point  of  a piecewise 
smooth  closed  orientable  surface  S in  D whose  entire  region  T it  encloses  belongs 
to  D.  Then  f is  identically  zero  in  T. 


This  theorem  has  an  important  consequence.  Let  fi  and  f2  be  functions  that  satisfy  the  assumptions  of  Theorem 
1 and  take  on  the  same  values  on  S.  Then  their  difference  j\  — f2  satisfies  those  assumptions  and  has  the  value 
0 everywhere  on  S.  Hence,  Theorem  2 implies  that 


/’]  />  0 throughout  T, 


and  we  have  the  following  fundamental  result. 


THEOREM  3 


Uniqueness  Theorem  for  Laplace’s  Equation 

Let  T be  a region  that  satisfies  the  assumptions  of  the  divergence  theorem,  and  let 
fix,  y,  z)  be  a harmonic  function  in  a domain  D that  contains  T and  its  boundary 
surface  S.  Then  f is  uniquely  determined  in  T by  its  values  on  S. 


The  problem  of  determining  a solution  u of  a partial  differential  equation  in  a region  T such  that  u assumes 
given  values  on  the  boundary  surface  S of  T is  called  the  Dirichlet  problem.8  We  may  thus  reformulate  Theorem 
3 as  follows. 


THEOREM  3* 


Uniqueness  Theorem  for  the  Dirichlet  Problem 

If  the  assumptions  in  Theorem  3 are  satisfied  and  the  Dirichlet  problem  for  the 
Laplace  equation  has  a solution  in  T,  then  this  solution  is  unique. 


These  theorems  demonstrate  the  extreme  importance  of  the  divergence  theorem  in  potential  theory. 


FR-aB4.-E^M=SET~l-Q^a 


VERIFICATIONS 

1.  Harmonic  functions.  Verify  Theorem  1 for  / = 2z2  — 
x2  — y2  and  S the  surface  of  the  box  0 S x S a, 
OSySfe,  OSzSc. 

2.  Harmonic  functions.  Verify  Theorem  1 for  / = 
x2  — y2  and  the  surface  of  the  cylinder  x2  + y2  = 4, 
OSzSi. 


3.  Green’s  first  identity.  Verify  (8)  for / = 4v2,  g = x2, 
S the  surface  of  the  “unit  cube”  0 S x S 1, 
OSySl,  0 = z = 1.  What  are  the  assumptions  on 
/ and  g in  (8)?  Must/ and  g be  harmonic? 

4.  Green’s  first  identity.  Verify  (8)  for  / = x, 
g — y2  + z2,  S the  surface  of  the  box  0 S x S 1, 
0 S y S 2,  0 S z S 3. 


8PETER  GUSTAV  LEJEUNE  DIRICHLET  (1805-1859),  German  mathematician,  studied  in  Paris  under 
Cauchy  and  others  and  succeeded  Gauss  at  Gottingen  in  1855.  He  became  known  by  his  important  research  on 
Fourier  series  (he  knew  Fourier  personally)  and  in  number  theory. 
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5.  Green’s  second  identity.  Verify  (9)  for  / = 6y2, 
g = h?,  S the  unit  cube  in  Prob.  3. 

6.  Green’s  second  identity.  Verify  (9)  for  / = jc2, 
g = v4,  S the  unit  cube  in  Prob.  3. 


7-11 


VOLUME 


Use  the  divergence  theorem,  assuming  that  the  assumptions 
on  T and  S are  satisfied. 


7.  Show  that  a region  T with  boundary  surface  S has  the 
volume 


V = J J x dy  dz  = | J y dz  dx  = J J z dx  dy 
s s s 

— j (x  dy  dz  + y dz  dx  + zdx  dy). 


8.  Cone.  Using  the  third  expression  for  v in  Prob.  7, 
verify  V = 7ra2  h/3  for  the  volume  of  a circular  cone 
of  height  h and  radius  of  base  a. 

9.  Ball.  Find  the  volume  under  a hemisphere  of  radius  a 
from  in  Prob.  7. 


10.  Volume.  Show  that  a region  T with  boundary  surface 
S has  the  volume 


r cos  </>  dA 


where  r is  the  distance  of  a variable  point  P:  (x,  y,  z) 
on  S from  the  origin  O and  (f>  is  the  angle  between 
the  directed  line  OP  and  the  outer  normal  of  S at  P. 


10.9  Stokes's  Theorem 


Make  a sketch.  Hint.  Use  (2)  in  Sec.  10.7  with 

F = [x,y,zl 

11.  Ball.  Find  the  volume  of  a ball  of  radius  a from 
Prob.  10. 

12.  TEAM  PROJECT.  Divergence  Theorem  and  Poten- 
tial Theory.  The  importance  of  the  divergence  theo- 
rem in  potential  theory  is  obvious  from  (7) — (9)  and 
Theorems  1-3.  To  emphasize  it  further,  consider 
functions  /and  g that  are  harmonic  in  some  domain  D 
containing  a region  T with  boundary  surface  S such  that 
T satisfies  the  assumptions  in  the  divergence  theorem. 
Prove,  and  illustrate  by  examples,  that  then: 

(a)  \\gVndA  = 

s 

(b)  If  dg/ dn  = 0 on  S,  then  g is  constant  in  T. 


(d)  If  df/ bn  = dg/dn  on  S,  then/  = g + cm  T,  where 
c is  a constant. 

(e)  The  Laplacian  can  be  represented  independently 
of  coordinate  systems  in  the  form 


HI \grad  g\2dV. 


V2/  = lim  — ' — \\jLdA 
J d(T)—*0  V(T)  j J dA 
sen 

where  d(T)  is  the  maximum  distance  of  the  points  of  a 
region  T bounded  by  S ( T ) from  the  point  at  which  the 
Laplacian  is  evaluated  and  V(T)  is  the  volume  of  T. 


Let  us  review  some  of  the  material  covered  so  far.  Double  integrals  over  a region  in  the  plane 
can  be  transformed  into  line  integrals  over  the  boundary  curve  of  that  region  and  conversely, 
line  integrals  into  double  integrals.  This  important  result  is  known  as  Green’s  theorem  in  the 
plane  and  was  explained  in  Sec.  10.4.  We  also  learned  that  we  can  transform  triple  integrals 
into  surface  integrals  and  vice  versa,  that  is,  surface  integrals  into  triple  integrals.  This  “big” 
theorem  is  called  Gauss’s  divergence  theorem  and  was  shown  in  Sec.  10.7. 

To  complete  our  discussion  on  transforming  integrals,  we  now  introduce  another  “big” 
theorem  that  allows  us  to  transform  surface  integrals  into  line  integrals  and  conversely, 
line  integrals  into  surface  integrals.  It  is  called  Stokes’s  Theorem,  and  it  generalizes 
Green’s  theorem  in  the  plane  (see  Example  2 below  for  this  immediate  observation).  Recall 
from  Sec.  9.9  that 


i 

j 

k 

(1) 

curl  F = 

d/dx 

d/dy 

d/dz 

Fi 

f2 

Fs 

which  we  will  need  immediately. 
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Stokes’s  Theorem9 

(Transformation  Between  Surface  and  Line  Integrals) 

Let  S be  a piecewise  smooth 9 oriented  surface  in  space  and  let  the  boundary  of  S 
be  a piecewise  smooth  simple  closed  curve  C.  Let  F (x,  y,  z)  be  a continuous  vector 
function  that  has  continuous  first  partial  derivatives  in  a domain  in  space  containing 
S.  Then 


(2) 


(curl  F)  • n dA 
J . 
s 


<>  F • r*  (s')  ds. 

c 


Here  n is  a unit  normal  vector  of  S and,  depending  on  n,  the  integration  around  C 
is  taken  in  the  sense  shown  in  Fig.  254.  Furthermore,  r = dr/ds  is  the  unit  tangent 
vector  and  s the  arc  length  of  C. 

In  components,  formula  (2)  becomes 


J 

R 

(2*) 


Here,  F = [Fi,  F2,  F3  ] . N = [/Vj . /V2,  N3],  n dA  = N dit  dv,  r'  ds  = 
[dx,  dy,  dz\ , and  R is  the  region  with  boundary  curve  C in  the  uv-plane 
corresponding  to  S represented  by  r (u,  v). 


dF3 

dy 


dz  J 1 \ dz 


dF3 

dx 


( dF2  dF] 

N + I 

2 \ dx  dy 


Nn 


du  dv 


= c)  (Fi  dx  + F2dy  + F3  dz). 
■ C 


The  proof  follows  after  Example  1 . 


Verification  of  Stokes’s  Theorem 

Before  we  prove  Stokes’s  theorem,  let  us  first  get  used  to  it  by  verifying  it  for  F = [y,  z,  x\  and  S the  paraboloid 
(Fig.  255) 


Z=f(x,y)=  1 ~(x2  + y\  z £0. 


9Sir  GEORGE  GABRIEL  STOKES  (1819-1903),  Irish  mathematician  and  physicist  who  became  a professor 
in  Cambridge  in  1849.  He  is  also  known  for  his  important  contribution  to  the  theory  of  infinite  series  and  to 
viscous  flow  (Navier-Stokes  equations),  geodesy,  and  optics. 

“Piecewise  smooth”  curves  and  surfaces  are  defined  in  Secs.  10.1  and  10.5. 
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PROOF 


Solution.  The  curve  C,  oriented  as  in  Fig.  255,  is  the  circle  rf.s)  = [cos  s,  sin  s,  0],  Its  unit  tangent  vector 
is  r (s)  = [—sin  s,  cos  s,  0].  The  function  F = | v . z,  x ] on  C is  F (r(s))  = [sin 0,  cos  ,vj.  Hence 

i'  r 277  r 2ff 

> F • dr  = F (r(s))  • r (5)  ds  = [(sin  s)(— sin  s)  + 0 + 0]  ds  = —77. 
c -\)  Jo 

We  now  consider  the  surface  integral.  We  have  F\  — y,  F2  — z,  F%  = x,  so  that  in  (2*)  we  obtain 

curlF  = curl  [F1?  F2,  F3]  = curl  [y,  z,  — [— 1,  — 1,  —1]. 

A normal  vector  of  S is  N = grad  (z  — f(x,y))  = [ 2x , 2 y,  1].  Hence  (curlF)  *N  = —2x  — 2y  — 1.  Now 
ndA  = N dx  dy  (see  (3*)  in  Sec.  10.6  with  x,  y instead  of  u,  v).  Using  polar  coordinates  r,  6 defined  by 
x = r cos  6,  y — r sin  6 and  denoting  the  projection  of  S into  the  jcy-plane  by  R , we  thus  obtain 


(curl  F)  • n dA  = (curl  F)  • N dx  dy  = (~2x  — 2y  — 1)  dx  dy 


(— 2r(cos0  + sin  6)  — l)r  dr  dd 


2 1\  1 

(cos  6 + sin  6) )d0  = 0 + 0 (277)  = 

3 2/2 


We  prove  Stokes’s  theorem.  Obviously,  (2)  holds  if  the  integrals  of  each  component  on 
both  sides  of  (2*)  are  equal;  that  is, 


(3) 


fi/’i  ()F\ 

N„ N ) du  dv  = F,dx 

dz  A dy  J- 


(4) 


/'2  -d'2 

-^Ni  + ^Nddl,dv-%F*dy 


(5) 


()  /'ij  ()  F‘>t 

— N - — - N )dudv  = (J)  F„dz. 
dy  1 dx  A)  J- 


We  prove  this  first  for  a surface  S that  can  be  represented  simultaneously  in  the  forms 
(6)  (a)  z=f(x,y),  (b)  y = g(x,  z),  (c)  x = h(y,z). 

We  prove  (3),  using  (6a).  Setting  u = x,  v = y,  we  have  from  (6a) 
r(w,  v ) = r (x,  y)  = [x,  y,f(x,  y)]  = x\  + vj  + / k 
and  in  (2),  Sec.  10.6,  by  direct  calculation 

N = ru  x r„  = rx  x ry  = [~fx,  -fy,  1]  = -fx  i - fy  j + k. 
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Note  that  N is  an  upper  normal  vector  of  S,  since  it  has  a positive  z-component.  Also, 
R = S*,  the  projection  of  S into  the  xy-plane,  with  boundary  curve  C = C*  (Fig.  256). 
Hence  the  left  side  of  (3)  is 


(7) 


s* 


dF.  dF. 

(-/) 

dz  y dy 


dx  dy. 


We  now  consider  the  right  side  of  (3).  We  transform  this  line  integral  over  C = C*  into 
a double  integral  over  S*  by  applying  Green’s  theorem  [formula  (1)  in  Sec.  10.4  with 
F2  = 0].  This  gives 


(>  F i dx 

'c* 


ff  dFi  , 

dx  dy. 

J J dy 

s* 


Fig.  256.  Proof  of  Stokes’s  theorem 


Here,  F i = F\  (x,  y,f(x,  y)).  Hence  by  the  chain  rule  (see  also  Prob.  10  in  Problem  Set  9.6), 


dFj(x,  y,f(x,  y)) 
dy 


BF^x,  y,  z)  _ dF i(x,  y,  z)  df 
dy  dz  dy 


[z  = f(x,  y)]. 


We  see  that  the  right  side  of  this  equals  the  integrand  in  (7).  This  proves  (3).  Relations 
(4)  and  (5)  follow  in  the  same  way  if  we  use  (6b)  and  (6c),  respectively.  By  addition  we 
obtain  (2*).  This  proves  Stokes’s  theorem  for  a surface  S that  can  be  represented 
simultaneously  in  the  forms  (6a),  (6b),  (6c). 

As  in  the  proof  of  the  divergence  theorem,  our  result  may  be  immediately  extended 
to  a surface  S that  can  be  decomposed  into  finitely  many  pieces,  each  of  which  is  of 
the  kind  just  considered.  This  covers  most  of  the  cases  of  practical  interest.  The  proof 
in  the  case  of  a most  general  surface  S satisfying  the  assumptions  of  the  theorem  would 
require  a limit  process;  this  is  similar  to  the  situation  in  the  case  of  Green’s  theorem 
in  Sec.  10.4. 


Green’s  Theorem  in  the  Plane  as  a Special  Case  of  Stokes’s  Theorem 

Let  F = [Fi,  F2 ] = F\  i + j be  a vector  function  that  is  continuously  differentiable  in  a domain  in  the 
xy-plane  containing  a simply  connected  bounded  closed  region  S whose  boundary  C is  a piecewise  smooth 
simple  closed  curve.  Then,  according  to  (1), 


dF2 

(curl  F)  • n = (curl  F)  • k = 

dx 


BFi 
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EXAMPLE  4 


Fig.  257. 

Example  4 


EXAMPLE  5 


Hence  the  formula  in  Stokes’s  theorem  now  takes  the  form 


((fdF2 

aFA  [ 

— 

)dA  = <b( 

JJV  dx 

3y  j 1 

(Fi  dx  + F2  dy ). 


This  shows  that  Green’s  theorem  in  the  plane  (Sec.  10.4)  is  a special  case  of  Stokes’s  theorem  (which  we  needed 
in  the  proof  of  the  latter!). 


Evaluation  of  a Line  Integral  by  Stokes's  Theorem 

Evaluate  J F*  r'ds,  where  C is  the  circle  x2  + y2  = 4,  z — ~ 3,  oriented  counterclockwise  as  seen  by  a person 
standing  at  the  origin,  and,  with  respect  to  right-handed  Cartesian  coordinates, 

F = [y,  xz 3,  —zy 3]  = yi  + *z3j  - zy3 k. 

Solution.  As  a surface  S bounded  by  C we  can  take  the  plane  circular  disk  x2  + y2  ^ 4 in  the  plane  z — —3. 
Then  n in  Stokes’s  theorem  points  in  the  positive  ^-direction;  thus  n = k.  Hence  (curl  F)  • n is  simply  the  component 
of  curl  F in  the  positive  z-direction.  Since  F with  z = — 3 has  the  components  Fi  — y,  F2  — ~21x,  F%  = 3y3,  we 
thus  obtain 


dF2  dFi 

(curl  F)  • n = = -27  - 1 = -28. 

dx  dy 

Hence  the  integral  over  S in  Stokes’s  theorem  equals  —28  times  the  area  47T  of  the  disk  S.  This  yields  the  answer 
—28  • 477  — —11277  ~ —352.  Confirm  this  by  direct  calculation,  which  involves  somewhat  more  work. 

Physical  Meaning  of  the  Curl  in  Fluid  Motion.  Circulation 

Let  STq  be  a circular  disk  of  radius  ro  and  center  P bounded  by  the  circle  Cro  (Fig.  257),  and  let  F ( Q ) = F (x,  y,  z) 
be  a continuously  differentiable  vector  function  in  a domain  containing  Sro.  Then  by  Stokes’s  theorem  and  the 
mean  value  theorem  for  surface  integrals  (see  Sec.  10.6), 


V F • r'ds 

>Cro 


jj(curl  F)  • n dA  = (curl  F)  • n(P*)Aro 

Sr0 


where  Aro  is  the  area  of  Sro  and  P*  is  a suitable  point  of  STq.  This  may  be  written  in  the  form 


(curl  F)  • n (P*)  = — <p  F • r'ds. 

Ar»  Jn 


In  the  case  of  a fluid  motion  with  velocity  vector  F = v,  the  integral 

<P  v • r'ds 


JCr o 

is  called  the  circulation  of  the  flow  around  Cro.  It  measures  the  extent  to  which  the  corresponding  fluid  motion 
is  a rotation  around  the  circle  CTq.  If  we  now  let  ro  approach  zero,  we  find 


(8) 


(curl  v)  • n (P)  = lim  — 
ro^o  Ar. 


r'  ds; 


that  is,  the  component  of  the  curl  in  the  positive  normal  direction  can  be  regarded  as  the  specific  circulation 
(circulation  per  unit  area)  of  the  flow  in  the  surface  at  the  corresponding  point. 


Work  Done  in  the  Displacement  around  a Closed  Curve 

Find  the  work  done  by  the  force  F = 2xy  sin  z i + 3x  y sin  z j + x y cos  z k in  the  displacement  around  the 
curve  of  intersection  of  the  paraboloid  z — x2  + y2  and  the  cylinder  (x  — l)2  + y2  — 1. 

Solution.  This  work  is  given  by  the  line  integral  in  Stokes’s  theorem.  Now  F = grad/,  where/  = x2y3  sin  z 
and  curl  (grad/)  = 0 (see  (2)  in  Sec.  9.9),  so  that  (curl  F)  • n — 0 and  the  work  is  0 by  Stokes’s  theorem.  This 
agrees  with  the  fact  that  the  present  field  is  conservative  (definition  in  Sec.  9.7). 
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Stokes’s  Theorem  Applied  to  Path  Independence 

We  emphasized  in  Sec.  10.2  that  the  value  of  a line  integral  generally  depends  not  only 
on  the  function  to  be  integrated  and  on  the  two  endpoints  A and  B of  the  path  of  integration 
C,  but  also  on  the  particular  choice  of  a path  from  A to  B.  In  Theorem  3 of  Sec.  10.2  we 
proved  that  if  a line  integral 


(9) 


F (r)  • dr 


(/•  ’]  dx  + F2  dy  + F3  dz ) 


(involving  continuous  F±,  F2,  F$  that  have  continuous  first  partial  derivatives)  is  path 
independent  in  a domain  D , then  curl  F = 0 in  D.  And  we  claimed  in  Sec.  10.2  that, 
conversely,  curl  F = 0 everywhere  in  D implies  path  independence  of  (9)  in  I)  provided  D 
is  simply  connected.  A proof  of  this  needs  Stokes’s  theorem  and  can  now  be  given  as  follows. 

Let  C be  any  closed  path  in  D.  Since  D is  simply  connected,  we  can  find  a surface  S 
in  D bounded  by  C.  Stokes’s  theorem  applies  and  gives 


> (Fi  dx  + F2dy  + F3  dz)  = c 

> F • r'ds  = 

Jc 

c JSJ 

(curl  F)  • n dA 


for  proper  direction  on  C and  normal  vector  n on  S.  Since  curl  F = 0 in  D,  the  surface 
integral  and  hence  the  line  integral  are  zero.  This  and  Theorem  2 of  Sec.  10.2  imply  that 
the  integral  (9)  is  path  independent  in  D.  This  completes  the  proof. 


PROBLEM  SET  10,9 


1-10 


DIRECT  INTEGRATION  OF  SURFACE 
INTEGRALS 


Evaluate  the  surface  integral  (curl  F)  • n dA  directly  for 
the  given  F and  S.  5 


1.  F = [z2,  —x2,  O],^  the  rectangle  with  vertices  (0,  0,  0), 
(1,0,  0),  (0,  4,  4),  (1,4,  4) 

2.  F = [— 13  sin  v,  3 sinhz.x],  S the  rectangle  with  vertices 
(0,  0,  2),  (4,  0,  2),  (4,  7T/2,  2),  (0,  tt/2,  2) 

3.  F = [e-z,  e~z  cos  y,  e~z  sin  y],  S:  z = y2/2, 

-IS*  SI,  OSySl 

4.  F as  in  Prob.  1,  z = xy  (0  S * S 1,  OSvS  4). 
Compare  with  Prob.  1 . 

5.  F = [z2,  §*,  0],  S:  0 S * S a,  0 S y S a, 
z = 1 

6.  F = [y3 4 5 6 7 8 9 10,  -x3,  0],  S:  x2  + /S1,  z = 0 

7.  F = [ey,  ez,  ex],  S:  z = x2  (0  S * S 2, 

0 Sy  S 1) 

8.  F = [z2,  x2,  y2],  S:  z = Vx2  + y2, 
y £ 0,  0 S z S h 

9.  Verify  Stokes’s  theorem  for  F and  S in  Prob.  5. 

10.  Verify  Stokes’s  theorem  for  F and  S in  Prob.  6. 


11.  Stokes’s  theorem  not  applicable.  Evaluate  <>F  • r'ds. 


F = (x1 2  + y2)_1[-y,  x],  C:  x2  + y2  = 1,  z = 0,  ori- 
ented clockwise.  Why  can  Stokes’s  theorem  not  be 
applied?  What  (false)  result  would  it  give? 


12.  WRITING  PROJECT.  Grad,  Div,  Curl  In 
Connection  with  Integrals.  Make  a list  of  ideas  and 
results  on  this  topic  in  this  chapter.  See  whether  you 
can  rearrange  or  combine  parts  of  your  material.  Then 
subdivide  the  material  into  3-5  portions  and  work  out 
the  details  of  each  portion.  Include  no  proofs  but  simple 
typical  examples  of  your  own  that  lead  to  a better 
understanding  of  the  material. 


13-20 


EVALUATION  OF  <>F  • r'  ds 


Calculate  this  line  integral  by  Stokes’s  theorem  for  the 
given  F and  C.  Assume  the  Cartesian  coordinates  to  be 
right-handed  and  the  z-component  of  the  surface  normal  to 
be  nonnegative. 

13.  F = [ — 5v,  4x,  z],  C the  circle  x2  + y2  = 16,  z = 4 
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14.  F = [z3,  x3,  y3],  C the  circle  x = 2,  y2  + z2  = 9 

15.  F = [ y 2,  x2,  z + x]  around  the  triangle  with  vertices 
(0,  0,  0),  (1,  0,  0),  (1,  1,  0) 

16.  F = [ey,  0,  e%  C as  in  Prob.  15 

17.  F = [0,  z3,  0],  C the  boundary  curve  of  the  cylinder 
x2  + y2  = 1,  x £ 0,  y £ 0,  OSzSl 


18.  F = [— y,  2 z,  0],  C the  boundary  curve  of  v2  + z2  = 4, 
z £ 0,  OSxS/i 

19.  F = [z,  ez,  0],  C the  boundary  curve  of  the  portion  of 

the  cone  z — \/x2  + y2,  x£0,  y£0,  0 £ z £ 1 

20.  F = [0,  cos  x,  0],  C the  boundary  curve  of  y2  + z2  = 4, 

y £ 0,  z £ 0,  0 £ x £ 7 t 


CHAPTER  TO  REVIE  WQUE  S T IONS  AND  PROBLEMS 


1.  State  from  memory  how  to  evaluate  a line  integral. 
A surface  integral. 

2.  What  is  path  independence  of  a line  integral?  What  is 
its  physical  meaning  and  importance? 

3.  What  line  integrals  can  be  converted  to  surface 
integrals  and  conversely?  How? 

4.  What  surface  integrals  can  be  converted  to  volume 
integrals?  How? 

5.  What  role  did  the  gradient  play  in  this  chapter?  The 
curl?  State  the  definitions  from  memory. 

6.  What  are  typical  applications  of  line  integrals?  Of 
surface  integrals? 

7.  Where  did  orientation  of  a surface  play  a role?  Explain. 

8.  State  the  divergence  theorem  from  memory.  Give 
applications. 

9.  In  some  line  and  surface  integrals  we  started  from 
vector  functions,  but  integrands  were  scalar  functions. 
How  and  why  did  we  proceed  in  this  way? 

10.  State  Laplace’s  equation.  Explain  its  physical  impor- 
tance. Summarize  our  discussion  on  harmonic  functions. 

LINE  INTEGRALS  (WORK  INTEGRALS) 

Evaluate  F (r)  • dr  for  given  F and  C by  the  method  that 

■'c 

seems  most  suitable.  Remember  that  if  F is  a force,  the 

integral  gives  the  work  done  in  the  displacement  along  C. 

Show  details. 

11.  F = [2x2,  — 4y2],  C the  straight-line  segment  from 
(4,  2)  to  (-6,  10) 

12.  F = [y  cos  xy,  x cos  xy,  ez],  C the  straight-line  segment 
from  (77,  1 , 0)  to  (2 , 77,  1) 

13.  F = [y2,  2xy  + 5 sin  x,  0],  C the  boundary  of 
0 S x S 77/2,  0 £ y fi  2,  z = 0 

14.  F = [— y3,  x3  + e~v,  0],  C the  circle  x2  + y2  = 25, 
z = 2 

15.  F = [x3,  e2y,  e~4z],  C:  x2  + 9v2  = 9,  z = x2 

16.  F = [x2,  y2,  y2x],  C the  helix  r = [2  cos  t,  2 sin  t,  37] 
from  (2,  0,  0)  to  (-2,  0,  3tt) 


11-20 


17.  F = [9z,  5x,  3y],  C the  ellipse  x2  + y2  = 9, 
z = x + 2 

18.  F = [sin  77y,  cos  77x,  sin  77x],  C the  boundary  curve  of 
0 S x £ 1,  0 S y S 2,  z = x 

19.  F = [z,  2y,  x],  C the  helix  r = [cos  t,  sin  t,  f]  from 
(1,0,  0)  to  (1,0,  277) 

20.  F = [zexz,  2 sinh  2 y,  xexz],  C the  parabola  y = x, 
z = x2,  -1  £ x £ 1 


21-25 


DOUBLE  INTEGRALS, 
CENTER  OF  GRAVITY 


Find  the  coordinates  x,  y of  the  center  of  gravity  of  a 
mass  of  density  /(x,  y)  in  the  region  R.  Sketch  R , show 
details. 

21.  / = xy,  R the  triangle  with  vertices  (0,  0),  (2,  0),  (2,  2) 

22.  / = x2  + y2,  R:  x2  + v2  S a2,  y £ 0 

23.  / = x2,  R:  - 1 S x £ 2,  x2  S y fi  x + 2.  Why  is 
x > 0? 

24.  /=  1,  R:  0SySl-x4 

25.  / = ky,  k>  0,  arbitrary,  0 fi  y fi  1 — x2, 

0 S x S 1 

26.  Why  are  x and  y in  Prob.  25  independent  of  fc? 


27-35 


SURFACE  INTEGRALS 


F • n dA. 


DIVERGENCE  THEOREM 


Evaluate  the  integral  diectly  or,  if  possible,  by  the  divergence 
theorem.  Show  details. 

27.  F = [ax,  by,  cz ],  S the  sphere  x1  + y2  + z2  = 36 

28.  F = [x  + yz,  y + z2,  z + x2],  S the  ellipsoid  with 

semi-axes  of  lengths  a,  b,  c 

29.  F = [y  + z,  20y,  2z3],  S the  surface  of  0 S x S 2, 
OSvSl,  OSzSy 

30.  F = [1,  1,  1],  S:  x2  + y2  + 4z2  = 4,  z £ 0 

31.  F = [ ex , ey,  ez],  S the  surface  of  the  box  |x|  S 1, 

lyl  s 1,  I z I £ 1 
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32.  F = \y2,  x2,  z2],  S the  portion  of  the  paraboloid 
z = x2  + y2,  z = 9 

33.  F = [y 2,  x2,  z2],  S:  r = [u,  u2,  i>],  0S«S2, 
-2SoS2 


34.  F = [x,  xy,  z],  S the  boundary  of  x2  + y2  S 1, 
0 S z S 5 

35.  F = [x  + z,  y + z,  x + y],  S the  sphere  of  radius  3 
with  center  0 


SUMMARY  QF  CHAPTtR  10 

Vector  Integral  Calculus.  Integral  Theorems 


Chapter  9 extended  differential  calculus  to  vectors,  that  is,  to  vector  functions  v (x,  y,  z) 
or  x(t).  Similarly,  Chapter  10  extends  integral  calculus  to  vector  functions.  This 
involves  line  integrals  (Sec.  10.1),  double  integrals  (Sec.  10.3),  surface  integrals  (Sec. 
10.6),  and  triple  integrals  (Sec.  10.7)  and  the  three  “big”  theorems  for  transforming 
these  integrals  into  one  another,  the  theorems  of  Green  (Sec.  10.4),  Gauss  (Sec.  10.7), 
and  Stokes  (Sec.  10.9). 

The  analog  of  the  definite  integral  of  calculus  is  the  line  integral  (Sec.  10.1) 


(1) 


F (r)  • dr 

c 


(l  'i  dx  + F2dy  + F3  dz) 
c 


,b 

F(r(f)) 


where  C:  r (f)  = [x(f),  y(t),  z(t)]  = x(t) i + y (f)j  + z(f)k  (a  g t ^ b)  is  a curve  in 
space  (or  in  the  plane).  Physically,  (1)  may  represent  the  work  done  by  a (variable) 
force  in  a displacement.  Other  kinds  of  line  integrals  and  their  applications  are  also 
discussed  in  Sec.  10.1. 

Independence  of  path  of  a line  integral  in  a domain  D means  that  the  integral 
of  a given  function  over  any  path  C with  endpoints  P and  Q has  the  same  value  for 
all  paths  from  P to  Q that  lie  in  D;  here  P and  Q are  fixed.  An  integral  (1)  is 
independent  of  path  in  D if  and  only  if  the  differential  form  F1  dx  + F2dy  + F3  dz 
with  continuous  F±,  F2,  F3  is  exact  in  I)  (Sec.  10.2).  Also,  if  curl  F = 0,  where 
F = [/’],  F2,  P3] , has  continuous  first  partial  derivatives  in  a simply  connected 
domain  D,  then  the  integral  (1)  is  independent  of  path  in  D (Sec.  10.2). 

Integral  Theorems.  The  formula  of  Green’s  theorem  in  the  plane  (Sec.  10.4) 


(2) 


/ ()F2 
\ dx 


< > (F1  dx  + F2dy) 
■ c 


transforms  double  integrals  over  a region  R in  the  xy-plane  into  line  integrals  over 
the  boundary  curve  C of  R and  conversely.  For  other  forms  of  (2)  see  Sec.  10.4. 
Similarly,  the  formula  of  the  divergence  theorem  of  Gauss  (Sec.  10.7) 


(3) 


div  F dV 
J J . 

T 


F • n dA 

J . 
s 


Summary  of  Chapter  10 
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transforms  triple  integrals  over  a region  T in  space  into  surface  integrals  over  the 
boundary  surface  S of  T,  and  conversely.  Formula  (3)  implies  Green’s  formulas 


(4) 

(5) 


(/V2g  + V/*  Vg)  dV 

J J . 

T 


,ds 

dn 


dA, 


(/V2g  - g\2f)  dV 

J J . 

T 


dA. 


Finally,  the  formula  of  Stokes’s  theorem  (Sec.  10.9) 


(6) 


(curl  F)  • n dA  = o F • r ' (s)  ds 


Jc 


transforms  surface  integrals  over  a surface  S into  line  integrals  over  the  boundary 
curve  C of  S and  conversely. 


PART  C 

Fourier  Analysis. 
Partial 
Differential 
Equations  (PDEs) 


CHAPTER  11  Fourier  Analysis 

CHAPTER  12  Partial  Differential  Equations  (PDEs) 

Chapter  11  and  Chapter  12  are  directly  related  to  each  other  in  that  Fourier  analysis  has 
its  most  important  applications  in  modeling  and  solving  partial  differential  equations 
(PDEs)  related  to  boundary  and  initial  value  problems  of  mechanics,  heat  flow, 
electrostatics,  and  other  fields.  However,  the  study  of  PDEs  is  a study  in  its  own  right. 
Indeed,  PDEs  are  the  subject  of  much  ongoing  research. 

Fourier  analysis  allows  us  to  model  periodic  phenomena  which  appear  frequently  in 
engineering  and  elsewhere — think  of  rotating  parts  of  machines,  alternating  electric  currents 
or  the  motion  of  planets.  Related  period  functions  may  be  complicated.  Now,  the  ingeneous 
idea  of  Fourier  analysis  is  to  represent  complicated  functions  in  terms  of  simple  periodic 
functions,  namely  cosines  and  sines.  The  representations  will  be  infinite  series  called 
Fourier  series.1  This  idea  can  be  generalized  to  more  general  series  (see  Sec.  11.5)  and 
to  integral  representations  (see  Sec.  11.7). 

The  discovery  of  Fourier  series  had  a huge  impetus  on  applied  mathematics  as  well  as  on 
mathematics  as  a whole.  Indeed,  its  influence  on  the  concept  of  a function,  on  integration 
theory,  on  convergence  theory,  and  other  theories  of  mathematics  has  been  substantial 
(see  [GenRef7]  in  App.  1). 

Chapter  12  deals  with  the  most  important  partial  differential  equations  (PDEs)  of  physics 
and  engineering,  such  as  the  wave  equation,  the  heat  equation,  and  the  Laplace  equation. 
These  equations  can  model  a vibrating  string/membrane,  temperatures  on  a bar,  and 
electrostatic  potentials,  respectively.  PDEs  are  very  important  in  many  areas  of  physics 
and  engineering  and  have  many  more  applications  than  ODEs. 

1JEAN-BAPTISTE  JOSEPH  FOURIER  (1768-1830),  French  physicist  and  mathematician,  lived  and  taught 
in  Paris,  accompanied  Napoleon  in  the  Egyptian  War,  and  was  later  made  prefect  of  Grenoble.  The  beginnings 
on  Fourier  series  can  be  found  in  works  by  Euler  and  by  Daniel  Bernoulli,  but  it  was  Fourier  who  employed 
them  in  a systematic  and  general  manner  in  his  main  work,  Theorie  analytique  de  la  chaleur  ( Analytic  Theory 
of  Heat,  Paris,  1822),  in  which  he  developed  the  theory  of  heat  conduction  (heat  equation;  see  Sec.  12.5),  making 
these  series  a most  important  tool  in  applied  mathematics. 
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CHAPTER 


Fourier  Analysis 


This  chapter  on  Fourier  analysis  covers  three  broad  areas:  Fourier  series  in  Secs.  11.1-11.4, 
more  general  orthonormal  series  called  Sturm-Liouville  expansions  in  Secs.  11.5  and  11.6 
and  Fourier  integrals  and  transforms  in  Secs.  11.7-11.9. 

The  central  starting  point  of  Fourier  analysis  is  Fourier  series.  They  are  infinite  series 
designed  to  represent  general  periodic  functions  in  terms  of  simple  ones,  namely,  cosines 
and  sines.  This  trigonometric  system  is  orthogonal,  allowing  the  computation  of  the 
coefficients  of  the  Fourier  series  by  use  of  the  well-known  Euler  formulas,  as  shown  in 
Sec.  11.1.  Fourier  series  are  very  important  to  the  engineer  and  physicist  because  they 
allow  the  solution  of  ODEs  in  connection  with  forced  oscillations  (Sec.  11.3)  and  the 
approximation  of  periodic  functions  (Sec.  1 1.4).  Moreover,  applications  of  Fourier  analysis 
to  PDEs  are  given  in  Chap.  12.  Fourier  series  are,  in  a certain  sense,  more  universal  than 
the  familiar  Taylor  series  in  calculus  because  many  discontinuous  periodic  functions  that 
come  up  in  applications  can  be  developed  in  Fourier  series  but  do  not  have  Taylor  series 
expansions. 

The  underlying  idea  of  the  Fourier  series  can  be  extended  in  two  important  ways.  We 
can  replace  the  trigonometric  system  by  other  families  of  orthogonal  functions,  e.g.,  Bessel 
functions  and  obtain  the  Sturm-Liouville  expansions.  Note  that  related  Secs.  11.5  and 
1 1.6  used  to  be  part  of  Chap.  5 but,  for  greater  readability  and  logical  coherence,  are  now 
part  of  Chap.  1 1 . The  second  expansion  is  applying  Fourier  series  to  nonperiodic 
phenomena  and  obtaining  Fourier  integrals  and  Fourier  transforms.  Both  extensions  have 
important  applications  to  solving  PDEs  as  will  be  shown  in  Chap.  12. 

In  a digital  age,  the  discrete  Fourier  transform  plays  an  important  role.  Signals,  such 
as  voice  or  music,  are  sampled  and  analyzed  for  frequencies.  An  important  algorithm,  in 
this  context,  is  the  fast  Fourier  transform.  This  is  discussed  in  Sec.  1 1.9. 

Note  that  the  two  extensions  of  Fourier  series  are  independent  of  each  other  and  may 
be  studied  in  the  order  suggested  in  this  chapter  or  by  studying  Fourier  integrals  and 
transforms  first  and  then  Sturm-Liouville  expansions. 

Prerequisite:  Elementary  integral  calculus  (needed  for  Fourier  coefficients). 

Sections  that  may  be  omitted  in  a shorter  course:  11.4-11.9. 

References  and  Answers  to  Problems:  App.  1 Part  C,  App.  2. 


n.i  Fourier  Series 


Fourier  series  are  infinite  series  that  represent  periodic  functions  in  terms  of  cosines  and 
sines.  As  such,  Fourier  series  are  of  greatest  importance  to  the  engineer  and  applied 
mathematician.  To  define  Fourier  series,  we  first  need  some  background  material. 
A function  f(x)  is  called  a periodic  function  if  fix)  is  defined  for  all  real  x,  except 
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possibly  at  some  points,  and  if  there  is  some  positive  number  p,  called  a period  of  fix), 
such  that 

(1)  f{x  + p)  = fix ) for  all  x. 

(The  function  fix)  = tan  x is  a periodic  function  that  is  not  defined  for  all  real  x but 
undefined  for  some  points  (more  precisely,  countably  many  points),  that  is  x = ±77"/ 2, 
±377/2,  • • • .) 

The  graph  of  a periodic  function  has  the  characteristic  that  it  can  be  obtained  by  periodic 
repetition  of  its  graph  in  any  interval  of  length  p (Fig.  258). 

The  smallest  positive  period  is  often  called  the  fundamental  period.  (See  Probs.  2-4.) 
Familiar  periodic  functions  are  the  cosine,  sine,  tangent,  and  cotangent.  Examples  of 
functions  that  are  not  periodic  are  x,  x ,x  , e , cosh  x,  and  In  x,  to  mention  just  a few. 

If  fix)  has  period  p,  it  also  has  the  period  2 p because  (1)  implies  fix  + 2 p)  = 
fi[x  + p\  + p)  = fix  + p)  = fix),  etc.;  thus  for  any  integer  n = 1,  2,  3,  ■ • • , 

(2)  fix  + np)  = fix)  for  all  x. 

Furthermore  if  fix)  and  g (x)  have  period  p,  then  afix)  + bg  (x)  with  any  constants  a and 
b also  has  the  period  p. 

Our  problem  in  the  first  few  sections  of  this  chapter  will  be  the  representation  of  various 
functions  fix)  of  period  277  in  terms  of  the  simple  functions 

(3)  1,  cosx,  sinx,  cos  2x,  sin  2x,  ■ • • , cos  nx,  sin  nx,  ■ ■ ■ . 

All  these  functions  have  the  period  277.  They  form  the  so-called  trigonometric  system. 
Figure  259  shows  the  first  few  of  them  (except  for  the  constant  1,  which  is  periodic  with 
any  period). 


sin  x sin  2x  sin  3x 

Fig.  259.  Cosine  and  sine  functions  having  the  period  277  (the  first  few  members  of  the 
trigonometric  system  (3),  except  for  the  constant  1) 
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The  series  to  be  obtained  will  be  a trigonometric  series,  that  is,  a series  of  the  form 
a0  + a\  cos  x + b±  sin  x + a 2 cos  2x  + b2  sin  2x  + • • ■ 

^ ^ = ao  +2  (fln cos  nx  + bn  sin  nx)- 

n= 1 

«i,  /?i,  a2,  /?2, ' ' ' are  constants,  called  the  coefficients  of  the  series.  We  see  that  each 
term  has  the  period  277.  Hence  if  the  coefficients  are  such  that  the  series  converges,  its 
sum  will  be  a function  of  period  277. 

Expressions  such  as  (4)  will  occur  frequently  in  Fourier  analysis.  To  compare  the 
expression  on  the  right  with  that  on  the  left,  simply  write  the  terms  in  the  summation. 
Convergence  of  one  side  implies  convergence  of  the  other  and  the  sums  will  be  the 
same. 

Now  suppose  that  fix)  is  a given  function  of  period  277  and  is  such  that  it  can  be 
represented  by  a series  (4),  that  is,  (4)  converges  and,  moreover,  has  the  sum/'(x).  Then, 
using  the  equality  sign,  we  write 


(5) 


fix)  = flo  + 2 (°n  cos  nx  + bn  sin  nx) 

71=1 


and  call  (5)  the  Fourier  series  of  fix).  We  shall  prove  that  in  this  case  the  coefficients 
of  (5)  are  the  so-called  Fourier  coefficients  of  f(x),  given  by  the  Euler  formulas 


(6) 


(0)  a0  = 


(a)  0,1 


(b)  bn  = 


I 

277  . 

1 

77 

1 

77 


fix)  dx 

- 7 T 

f(x)  cos  nx  dx 

TT 

fix)  sin  nx  dx 


n = 1,2, 
n = 1,2,- 


The  name  “Fourier  series”  is  sometimes  also  used  in  the  exceptional  case  that  (5)  with 
coefficients  (6)  does  not  converge  or  does  not  have  the  sum/(x) — this  may  happen  but 
is  merely  of  theoretical  interest.  (For  Euler  see  footnote  4 in  Sec.  2.5.) 

A Basic  Example 

Before  we  derive  the  Euler  formulas  (6),  let  us  consider  how  (5)  and  (6)  are  applied  in 
this  important  basic  example.  Be  fully  alert,  as  the  way  we  approach  and  solve  this 
example  will  be  the  technique  you  will  use  for  other  functions.  Note  that  the  integration 
is  a little  bit  different  from  what  you  are  familiar  with  in  calculus  because  of  the  n.  Do 
not  just  routinely  use  your  software  but  try  to  get  a good  understanding  and  make 
observations:  How  are  continuous  functions  (cosines  and  sines)  able  to  represent  a given 
discontinuous  function?  How  does  the  quality  of  the  approximation  increase  if  you  take 
more  and  more  terms  of  the  series?  Why  are  the  approximating  functions,  called  the 
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EXAMPLE  1 


partial  sums  of  the  series,  in  this  example  always  zero  at  0 and  77?  Why  is  the  factor 
1/n  (obtained  in  the  integration)  important? 

Periodic  Rectangular  Wave  (Fig.  260) 

Find  the  Fourier  coefficients  of  the  periodic  function /(x)  in  Fig.  260.  The  formula  is 
(— k if  —7 t < x < 0 

(7)  f{x)  = \ and  f(x  + 2ir)  = /(*). 

V k if  0 < x < 7 t 


Functions  of  this  kind  occur  as  external  forces  acting  on  mechanical  systems,  electromotive  forces  in  electric 
circuits,  etc.  (The  value  of  f(x)  at  a single  point  does  not  affect  the  integral;  hence  we  can  leave /(x)  undefined 
at  x = 0 and  x = ±77.) 

Solution.  From  (6.0)  we  obtain  ciq  = 0.  This  can  also  be  seen  without  integration,  since  the  area  under  the 
curve  of  /(x)  between  —77  and  77  (taken  with  a minus  sign  where /(x)  is  negative)  is  zero.  From  (6a)  we  obtain 
the  coefficients  a±,  a^,  • " of  the  cosine  terms.  Since  /(x)  is  given  by  two  expressions,  the  integrals  from  —77 
to  77  split  into  two  integrals: 


1 


1 


an  = — \ /(*)  cos  nxdx  = — 


(— k)  cos  nx  dx  + k cos  nxdx 


1 

77 


-k- 


= 0 


because  sin  nx  = 0 at  —77,  0,  and  77  for  all  n i = 1,  2,  • ■ • . We  see  that  all  these  cosine  coefficients  are  zero.  That 
is,  the  Fourier  series  of  (7)  has  no  cosine  terms,  just  sine  terms,  it  is  a Fourier  sine  series  with  coefficients 
/?i,  Z?2,  • • • obtained  from  (6b); 


bn  = — f(x)  sin  nx  dx  = — 


(—k)  sin  nx  dx  + k sin  nx  dx 


1 

77 


- k~ 


Since  cos  (—a)  = cos  a and  cos  0=1,  this  yields 


k 2k 

bn  = [cos  0 — COS  ( — «77)  — COS  7177  + COS  0]  = (1  — COS  7777). 


Now,  cos  77  = — 1,  cos  277  = 1,  cos  377  = — 1,  etc.;  in  general, 
( — 1 for  odd  77, 


COS  7777  = 


and  thus  1 — cos  nir  = 


1 for  even  n. 

Hence  the  Fourier  coefficients  br,  of  our  function  are 


2 for  odd  n, 
0 for  even  n. 


A l Al  Al 

bi=— , b 2 = 0,  *3  = —.  *4  = 0,  *5  = —••■■■ 

77  3i7  5tt 


fix) 

k 


-n  0 n 2n  * 


Fig.  260.  Given  function  f[x)  (Periodic  reactangular  wave) 
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Since  the  an  are  zero,  the  Fourier  series  of  f(x)  is 

4k  i i 

(8)  — (sin  jc  + 3 sin  3x  + 5 sin  5jc  + • • • )• 

The  partial  sums  are 


Si 


4k 

— sin  x, 

IT 


4k  ( 


1 


S2  — ~ I sin  x + — sin  3x 


etc. 


Their  graphs  in  Fig.  261  seem  to  indicate  that  the  series  is  convergent  and  has  the  sumf(x),  the  given  function. 
We  notice  that  at  x = 0 and  x = it,  the  points  of  discontinuity  of  f(x),  all  partial  sums  have  the  value  zero,  the 
arithmetic  mean  of  the  limits  —k  and  k of  our  function,  at  these  points.  This  is  typical. 

Furthermore,  assuming  that  f(x)  is  the  sum  of  the  series  and  setting  x — 7t/2,  we  have 


Thus 


1 - 


77 

4 ‘ 


This  is  a famous  result  obtained  by  Leibniz  in  1673  from  geometric  considerations.  It  illustrates  that  the  values 
of  various  series  with  constant  terms  can  be  obtained  by  evaluating  Fourier  series  at  specific  points. 


Fig.  261.  First  three  partial  sums  of  the  corresponding  Fourier  series 
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Derivation  of  the  Euler  Formulas  (6) 

The  key  to  the  Euler  formulas  (6)  is  the  orthogonality  of  (3),  a concept  of  basic  importance, 
as  follows.  Here  we  generalize  the  concept  of  inner  product  (Sec.  9.3)  to  functions. 


THEOREM  1 


Orthogonality  of  the  Trigonometric  System  (3) 

The  trigonometric  system  (3)  is  orthogonal  on  the  interval  —77  s§  x Is  77  (hence 
also  on  0 Si  x =§  277  or  any  other  interval  of  length  277  because  of  periodicity);  that 
is,  the  integral  of  the  product  of  any  two  functions  in  (3)  over  that  interval  is  0,  so 
that  for  any  integers  n and  m, 


(9) 


(a) 


7 T 

cos  nx  cos  mx  dx  = 0 

— 7 T 


(n  A m) 


(b) 


77" 

sin  nx  sin  mx  dx  = 0 

— 7 T 


(n  A m ) 


(c) 


sin  nx  cos  mx  dx  = 0 


(n  A m or  n = m ). 


PROOF  This  follows  simply  by  transforming  the  integrands  trigonometrically  from  products  into 
sums.  In  (9a)  and  (9b),  by  (11)  in  App.  A3.1, 


cos  nx  cos  mx  dx  = 


cos  ( n + m)x  dx  + 


cos  ( n — m)x  dx 


sin  nx  sin  mx  dx  = 


2 


cos  ( n — m)x  dx 


2 


cos  (n  + m) x dx. 


Since  m A n (integer!),  the  integrals  on  the  right  are  all  0.  Similarly,  in  (9c),  for  all  integer 
m and  n (without  exception;  do  you  see  why?) 


sin  nx  cos  mx  dx  = 


sin  (n  + m)x  dx  + 


1 

2 J 


sin  (n  — m)x  dx  = 0 + 0. 


Application  of  Theorem  1 to  the  Fourier  Series  (5) 

We  prove  (6.0).  Integrating  on  both  sides  of  (5)  from  —77  to  77,  we  get 


f(x)  dx  = 


flo  + 2 (tin  cos  nx  + bn  sin  nx) 

n=  1 


dx. 


We  now  assume  that  termwise  integration  is  allowed.  (We  shall  say  in  the  proof  of 
Theorem  2 when  this  is  true.)  Then  we  obtain 


f(x)  dx  = a0 


dx  + ^ 
rr  n= 1 


cos  nx  dx  + by. 


sin  nx  dx 
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The  first  term  on  the  right  equals  2TTa0.  Integration  shows  that  all  the  other  integrals  are  0. 
Hence  division  by  277  gives  (6.0). 

We  prove  (6a).  Multiplying  (5)  on  both  sides  by  cos  mx  with  any  fixed  positive  integer 
m and  integrating  from  —77  to  77,  we  have 


(10) 

f 77 

fix)  cos  mx  dx  = 

f 77 

00 

n0  + 2 (<2m  cos  nx  + bn  sin  nx) 

. 

. 

— 7 T 

— 77 ' 

n= 1 

We  now  integrate  term  by  term.  Then  on  the  right  we  obtain  an  integral  of  a0  cos  mx, 
which  is  0;  an  integral  of  an  cos  nx  cos  mx  , which  is  am7T  for  n = m and  0 for  n A m by 
(9a);  and  an  integral  of  bn  sin  nx  cos  mx,  which  is  0 for  all  n and  m by  (9c).  Hence  the 
right  side  of  (10)  equals  am7T.  Division  by  77  gives  (6a)  (with  m instead  of  n). 

We  finally  prove  (6b).  Multiplying  (5)  on  both  sides  by  sin  mx  with  any  fixed  positive 
integer  m and  integrating  from  —77  to  77,  we  get 


(11) 


7 T 

fix)  sin  mx  dx 

— 7 T 


flo  + 


2 ( an  cos  nx  + bn  sin  nx) 

n= 1 


sin  mx  dx. 


Integrating  term  by  term,  we  obtain  on  the  right  an  integral  of  cio  sin  mx,  which  is  0;  an 
integral  of  an  cos  nx  sin  mx,  which  is  0 by  (9c);  and  an  integral  of  bn  sin  nx  sin  mx,  which 
is  bm7T  if  n = m and  0 if  n ¥=  m,  by  (9b).  This  implies  (6b)  (with  n denoted  by  m).  This 
completes  the  proof  of  the  Euler  formulas  (6)  for  the  Fourier  coefficients. 


Convergence  and  Sum  of  a Fourier  Series 

The  class  of  functions  that  can  be  represented  by  Fourier  series  is  surprisingly  large  and 
general.  Sufficient  conditions  valid  in  most  applications  are  as  follows. 


THEOREM  2 


fix) 


Representation  by  a Fourier  Series 

Let  fix)  be  periodic  with  period  2jv  and  piecewise  continuous  (see  Sec.  6.1)  in  the 
inten’al  — 77  Si  x Si  77.  Furthermore,  let  fix)  have  a left-hand  derivative  and  a right- 
hand  derivative  at  each  point  of  that  interval.  Then  the  Fourier  series  (5)  off(x) 
[with  coefficients  (6)]  converges.  Its  sum  is  fix),  except  at  points  x0  where  fix)  is 
discontinuous.  There  the  sum  of  the  series  is  the  average  of  the  left-  and  right-hand 
limits2  of  fix)  at  xq. 


X 


2The  left-hand  limit  of/(x)  at  x0  is  defined  as  the  limit  of/(x)  as  x approaches  r0  from  the  left 
and  is  commonly  denoted  by  /(jc 0 — 0).  Thus 

f(x0  — 0)  = lim  f(x0  — h)  as  h — > 0 through  positive  values. 
h^->0 


Fig.  26  Left-  and  The  right-hand  limit  is  denoted  by  f(x0  + 0)  and 

right-hand  limits 

f(x0  + 0)  = lim  f(x0  + h)  as  h — » 0 through  positive  values. 
/(I  - 0)  = 1,  h^n 

/(I  + 0)  = - The  left-  and  right-hand  derivatives  of  f(x)  at  x0  are  defined  as  the  limits  of 


of  the  function 


f M 


r X2  if  x < 1 
lx/2  ifxai 


f(x 0 - h)  - f(x0  - 0)  f(x0  + h)  - f(x0  + 0) 

and  , 

-h  -h 

respectively,  as  h — » 0 through  positive  values.  Of  course  if  f(x ) is  continuous  at  Jt0,  the  last  term  in 
both  numerators  is  simply  f(x0). 


SEC.  11.1  Fourier  Series 


481 


PROOF 


EXAMPLE  2 


We  prove  convergence,  but  only  for  a continuous  function  /(x)  having  continuous  first 
and  second  derivatives.  And  we  do  not  prove  that  the  sum  of  the  series  is  /(x)  because 
these  proofs  are  much  more  advanced;  see,  for  instance,  Ref.  [C12]  listed  in  App.  1. 
Integrating  (6a)  by  parts,  we  obtain 


77 


/(x)  cos  nx  dx  = 


f(x ) sin  nx 


nir 


1 

IITT 


/ (x ) sin  nx  dx. 


The  first  term  on  the  right  is  zero.  Another  integration  by  parts  gives 


f'(x)  cos  nx 


2_ 
n 77 


2_ 
n 77 


/ (x)  cos  nx  dx. 


The  first  term  on  the  right  is  zero  because  of  the  periodicity  and  continuity  of  f'(x).  Since 
f"  is  continuous  in  the  interval  of  integration,  we  have 

l/"«|  < M 

for  an  appropriate  constant  M.  Furthermore,  |cos  nx\  g 1.  It  follows  that 


n2  it 


f (x)  cos  nx  dx 


< 


2 

n 77  J 


Mdx  = 


2 M 


Similarly,  \bn\  <2  M/n 2 for  all  n.  Hence  the  absolute  value  of  each  term  of  the  Fourier 
series  of/(x)  is  at  most  equal  to  the  corresponding  term  of  the  series 


kol  +2M(1  + 1+  — + — + — + — + 

T T 32  32 


which  is  convergent.  Hence  that  Fourier  series  converges  and  the  proof  is  complete. 
(Readers  already  familiar  with  uniform  convergence  will  see  that,  by  the  Weierstrass 
test  in  Sec.  15.5,  under  our  present  assumptions  the  Fourier  series  converges  uniformly, 
and  our  derivation  of  (6)  by  integrating  term  by  term  is  then  justified  by  Theorem  3 of 
Sec.  15.5.) 


Convergence  at  a Jump  as  Indicated  in  Theorem  2 

The  rectangular  wave  in  Example  1 has  a jump  at  x = 0.  Its  left-hand  limit  there  is  —k  and  its  right-hand  limit 
is  k (Fig.  261).  Hence  the  average  of  these  limits  is  0.  The  Fourier  series  (8)  of  the  wave  does  indeed  converge 
to  this  value  when  x = 0 because  then  all  its  terms  are  0.  Similarly  for  the  other  jumps.  This  is  in  agreement 
with  Theorem  2. 


Summary.  A Fourier  series  of  a given  function /(x)  of  period  277  is  a series  of  the  form 
(5)  with  coefficients  given  by  the  Euler  formulas  (6).  Theorem  2 gives  conditions  that  are 
sufficient  for  this  series  to  converge  and  at  each  x to  have  the  value  /(x),  except  at 
discontinuities  of  /(x),  where  the  series  equals  the  arithmetic  mean  of  the  left-hand  and 
right-hand  limits  of  /(x)  at  that  point. 
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gHEGE 


1-5 


PERIOD,  FUNDAMENTAL  PERIOD 


The  fundamental  period  is  the  smallest  positive  period.  Find 
it  for 


1.  cos  x,  sin  x,  cos  2x,  sin  2x,  cos  ttx,  sin  ttx, 
cos  2ttx,  sin  27 tx 

27 tx  27T.tr  27 rnx 

2.  cos  nx,  sin  nx,  cos  — - — , sin  — - — , cos  — - — , 

k k k 


sin 


2irnx 

k 


3.  If  /( x)  and  g(x)  have  period  p,  show  that  h(x)  — 
af{x)  + bg(x ) (a,  b,  constant)  has  the  period  p.  Thus 
all  functions  of  period  p form  a vector  space. 

4.  Change  of  scale.  If  fix)  has  period  p,  show  that 
fiax),  a A 0,  and /(*/&),  b + 0,  are  periodic  functions 
of  x of  periods  p/a  and  bp,  respectively.  Give  examples. 

5.  Show  that/  = const  is  periodic  with  any  period  but  has 
no  fundamental  period. 


6-10 


GRAPHS  OF  27T-PERIODIC  FUNCTIONS 


Sketch  or  graph  f(x)  which  for  — 7T  < x < 7T  is  given  as 
follows. 

6-  f(x)  = \x\ 

1.  fix)  = | sin  jc | , f{x ) = sin  \x\ 

8 .f(x)  = e~M,  f(x)=\e~x\ 

X if  —7 T < X < 0 

9-  fix)  = ' 

10.  f(x)  = 


7 T — X if  0 < X < TT 
— cos2x  if  —TT  < X < 0 


cosz  x if  0 < X < TT 


11.  Calculus  review.  Review  integration  techniques  for 
integrals  as  they  are  likely  to  arise  from  the  Euler 
formulas,  for  instance,  definite  integrals  of  x cos  nx, 

2 • — 2r 

x sin  nx,  e cos  nx,  etc. 


12-21 


FOURIER  SERIES 


Find  the  Fourier  series  of  the  given  function  fix),  which  is 
assumed  to  have  the  period  27 r.  Show  the  details  of  your 
work.  Sketch  or  graph  the  partial  sums  up  to  that  including 
cos  5x  and  sin  5x. 


12.  fix)  in  Prob.  6 

13.  fix)  in  Prob.  9 

14.  fix)  = X2  ( — 77  < X < TT) 

15.  fix)  = X2  iO<X<  27 T) 


b 

/ 

/ 

1 

n 

0 1 
2 

n 

K 

22.  CAS  EXPERIMENT.  Graphing.  Write  a program  for 
graphing  partial  sums  of  the  following  series.  Guess 
from  the  graph  what  fix)  the  series  may  represent. 
Confirm  or  disprove  your  guess  by  using  the  Euler 
formulas. 

(a)  2(sin  x + § sin  3x  + g sin  5x  + ■ ■ ■) 

— 2 (g  sin  2x  + \ sin  Ax  + g sin  6x  ■ ■ ■) 

(b)  — I o ( cos  x H — cos  3x  I cos  5x  + ■ ■ ■ ) 

2 7f\  9 25  / 

(c)  1 772  + 4(cOS  X — | COS  2x  + jg  COS  3x  — jg  COS  Ax 

+ _...) 

23.  Discontinuities.  Verify  the  last  statement  in  Theorem 
2 for  the  discontinuities  of/(x)  in  Prob.  21. 

24.  CAS  EXPERIMENT.  Orthogonality.  Integrate  and 
graph  the  integral  of  the  product  cos  mx  cos  nx  (with 
various  integer  m and  n of  your  choice)  from  — a to  a 
as  a function  of  a and  conclude  orthogonality  of  cos  mx 


SEC.  11.2  Arbitrary  Period.  Even  and  Odd  Functions.  Half-Range  Expansions 


483 


25.  CAS  EXPERIMENT.  Order  of  Fourier  Coefficients. 

The  order  seems  to  be  1 jn  if/is  discontinous,  and  1 /n2 


and  cos  nx  (m  A n)  for  a = 77  from  the  graph.  For  what 
m and  n will  you  get  orthogonality  for  a = tt/2,  77/3, 
7t/4?  Other  a?  Extend  the  experiment  to  cos  mx  sin  nx 
and  sin  mx  sin  nx. 


if/is  continuous  but/*  = df/dx  is  discontinuous,  l//i3 
if /and/  are  continuous  but/  is  discontinuous,  etc. 
Try  to  verify  this  for  examples.  Try  to  prove  it  by 
integrating  the  Euler  formulas  by  parts.  What  is  the 
practical  significance  of  this? 


11.2  Arbitrary  Period.  Even  and  Odd  Functions. 
Half-Range  Expansions 


We  now  expand  our  initial  basic  discussion  of  Fourier  series. 

Orientation.  This  section  concerns  three  topics: 

1.  Transition  from  period  277  to  any  period  2 L,  for  the  function  / simply  by  a 
transformation  of  scale  on  the  x-axis. 

2.  Simplifications.  Only  cosine  terms  if  / is  even  (“Fourier  cosine  series”).  Only  sine 
terms  if/is  odd  (“Fourier  sine  series”). 

3.  Expansion  of  / given  for  0 giSf  in  two  Fourier  series,  one  having  only  cosine 
terms  and  the  other  only  sine  terms  (“half-range  expansions”). 


Clearly,  periodic  functions  in  applications  may  have  any  period,  not  just  277  as  in  the  last 
section  (chosen  to  have  simple  formulas).  The  notation  p = 2 L for  the  period  is  practical 
because  L will  be  a length  of  a violin  string  in  Sec.  12.2,  of  a rod  in  heat  conduction  in 
Sec.  12.5,  and  so  on. 

The  transition  from  period  277  to  be  period  p = 2 L is  effected  by  a suitable  change  of 
scale,  as  follows.  Let  f(x)  have  period  p = 2 L.  Then  we  can  introduce  a new  variable  v 
such  that /(A),  as  a function  of  v,  has  period  277.  If  we  set 

P 277  77 

(1)  (a)  x = — — v,  so  that  (b)  v = x = — x 

277  P L 

then  v = ±77  corresponds  to  x = ±L.  This  means  that/,  as  a function  of  v,  has  period 
277  and,  therefore,  a Fourier  series  of  the  form 


1.  From  Period  27T  to  Any  Period  p — 2L 


(2) 


m=f 


a0  + 2 ( an  cos  nv  + bn  sin  nv) 


n= 1 


oo 


with  coefficients  obtained  from  (6)  in  the  last  section 


(3) 


— TT 
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EXAMPLE  1 


We  could  use  these  formulas  directly,  but  the  change  to  x simplifies  calculations.  Since 

77  77 

(4)  v = — x,  we  have  dv  = — dx 


and  we  integrate  over  x from  — L to  L.  Consequently,  we  obtain  for  a function  fix)  of 
period  2 L the  Fourier  series 


(5) 


m 


a0  + 2 

n=l 


n7T  . HTT  \ 

cos x + by,  sin x 

L n L J 


with  the  Fourier  coefficients  of  f(x)  given  by  the  Euler  formulas  (tt/L  in  dx  cancels 
1/77  in  (3)) 


(6) 


(0)  a0 

(a)  a,, 

(b)  bn 


2 L 

f 

L. 

I 

L. 


,L 


f{x)  dx 

J-L 

rL 


fix)  cos 

-L 

rL 


mrx 

L 


fix)  sin 

-L 


I17TX 

L 


dx 

dx 


n = 1,  2,  • • • 

n — 1,  2,  • • • . 


Just  as  in  Sec.  1 1.1,  we  continue  to  call  (5)  with  any  coefficients  a trigonometric  series. 
And  we  can  integrate  from  0 to  2 L or  over  any  other  interval  of  length  p = 2 L. 


Periodic  Rectangular  Wave 

Find  the  Fourier  series  of  the  function  (Fig.  263) 


10  if 
k if 
0 if 


—2  < x < — 1 
-1  <x<  1 

1 < x < 2 


p = 2L  = 4,  L = 2. 


Solution.  From  (6.0)  we  obtain  a0  = k/2  (verify!).  From  (6a)  we  obtain 


an  = — f(x ) cos dx  = — k cos dx  = — sin . 


2k 


Thus  an  = 0 if  n is  even  and 

an  = 2k/mr  if  n = 1,  5,  9,  • • • , an  = —2k/mr  if  n = 3,  7,  1 1,  • • • . 

From  (6b)  we  find  that  bn  = 0 for  n = 1,  2,  • • • . Hence  the  Fourier  series  is  a Fourier  cosine  series  (that  is,  it 
has  no  sine  terms) 


fix) 


k 2k  f 7 T 1 377  1 577 

— I COS  — X COS X H COS X — +••• 

2 77  V 2 3 2 5 2 
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EXAMPLE  2 


EXAMPLE  3 


-2-1012 

Fig.  263.  Example  1 


fix) 


fix) 

2 

1 

1 

-k  1 

Fig.  264.  Example  2 


Periodic  Rectangular  Wave.  Change  of  Scale 

Find  the  Fourier  series  of  the  function  (Fig.  264) 

{— k if  —2  < x < 0 
k if  0 < x < 2 

Solution.  Since  L = 2,  we  have  in  (3)  V = ttx/2  and  obtain  from  (8)  in  Sec.  11.1  with  v instead  of  x,  that  is. 


p = 2L  = 4.  L = 2. 


g(v) 


4k  ( 1 1 

= — sin  v H — sin  3l>  H — sin  5v  + 


the  present  Fourier  series 


4k  ( it  1 3ir  1 5tt 

fix)  = — sin  — x H — sin x H — sin > 

n-  V 2 3 2 5 2 


Confirm  this  by  using  (6)  and  integrating. 

Half-Wave  Rectifier 

A sinusoidal  voltage  E sin  cot,  where  t is  time,  is  passed  through  a half-wave  rectifier  that  clips  the  negative 
portion  of  the  wave  (Fig.  265).  Find  the  Fourier  series  of  the  resulting  periodic  function 


u(t)  = < 


0 if  ~L  < t < 0,  277  77 

p = 2L  = — , L = — . 


I E sin  cot  if  0 < t < L 
Solution.  Since  u = 0 when  — L < t < 0,  we  obtain  from  (6.0),  with  t instead  of  x. 


a0  = 


277 


tt/w 


E sin  cot  dt  = 


and  from  (6a),  by  using  formula  (11)  in  App.  A3.1  with  x = cot  and  y = ncot. 


tt/w 


E sin  cot  cos  ncot  dt  = 


coE 

277 


tt/w 


[sin  (1  + n)  cot  + sin  (1  — n ) cot ] dt. 


If  n = 1,  the  integral  on  the  right  is  zero,  and  if  n = 2,  3,  • • • , we  readily  obtain 


coE 

277 


cos  (1  + n)cot  cos  (1  — n)cot 


(1  + n)co 


(1  — n)co 


E /—cos  (1  + ri)7T  + 1 —cos  (1  — n)7T  + 1 
277  V 1 + n 1 — n 


1 + n 

If  n is  odd,  this  is  equal  to  zero,  and  for  even  n we  have 

E ( 2 2 \ 2 E 


277  \1  + n 1 — n 


( n — 1 )(n  + 1)77 


(n  = 2,  4,  • • • ). 
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In  a similar  fashion  we  find  from  (6b)  that  bi  = E/2  and  bn  = 0 for  n = 2,  3,  • ■ ■ . Consequently, 


nlc)  0 k!(o  t 


Fig.  265.  Half-wave  rectifier 


2.  Simplifications:  Even  and  Odd  Functions 

If  f(x)  is  an  even  function,  that  is,  /(—  x)  = fix)  (see  Fig.  266),  its  Fourier  series  (5) 
reduces  to  a Fourier  cosine  series 


(5*)  f(x)  = a0+'2jan  cos— x (/even) 

Fig.  266.  n=1  L 

Even  function 

with  coefficients  (note:  integration  from  0 to  L only!) 


Fig.  267. 

Odd  function 


(6*) 


«0 


L 


,L 

fix)  dx, 


Qn 


2 

L 


(L 

fix)  COS 

^0 


I1TTX 

L 


dx. 


n = 1,  2,  ■ ■ • . 


If/(x)  is  an  odd  function,  that  is,  /(—  x)  = —fix)  (see  Fig.  267),  its  Fourier  series  (5) 
reduces  to  a Fourier  sine  series 


(5**) 


fix)  = 2 bn  sin  Lx  if  odd) 

n=  1 


with  coefficients 
(6**) 


bn  = J 


f(x)  sin  — dx. 


These  formulas  follow  from  (5)  and  (6)  by  remembering  from  calculus  that  the  definite 
integral  gives  the  net  area  (=  area  above  the  axis  minus  area  below  the  axis)  under  the 
curve  of  a function  between  the  limits  of  integration.  This  implies 


(7) 


(a) 


fL  rL 

gix)  dx  = 2 g (x)  dx 
o 


(b) 


,L 

h ix)  dx  = 0 

J-L 


for  even  g 


for  odd  h 


Formula  (7b)  implies  the  reduction  to  the  cosine  series  (even/  makes  fix)  sin  ( mrx/L ) odd 
since  sin  is  odd)  and  to  the  sine  series  (odd/ makes  fix)  cos  (mrx/L)  odd  since  cos  is  even). 
Similarly,  (7a)  reduces  the  integrals  in  (6*)  and  (6**)  to  integrals  from  0 to  L.  These  reductions 
are  obvious  from  the  graphs  of  an  even  and  an  odd  function.  (Give  a formal  proof.) 
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EXAMPLE  4 


THEOREM  1 


EXAMPLE  5 


Summary 

Even  Function  of  Period  2tt.  If  / is  even  and  L = it.  then 


fix)  = cio  + 2 an  cos  nx 

n= 1 


with  coefficients 


ao 


77 


fix)  dx,  an  = — 


fix)  cos  nx  dx,  n = 1,2, 


Odd  Function  of  Period  277.  If/ is  odd  and  L = 77,  then 


fix)  = 2 bn  sin  nx 

n= 1 


with  coefficients 


, =2_ 

K 7T 


f(x)  sin  nx  dx. 


n =1,2, 


Fourier  Cosine  and  Sine  Series 

The  rectangular  wave  in  Example  1 is  even.  Hence  it  follows  without  calculation  that  its  Fourier  series  is  a 
Fourier  cosine  series,  the  bn  are  all  zero.  Similarly,  it  follows  that  the  Fourier  series  of  the  odd  function  in 
Example  2 is  a Fourier  sine  series. 

In  Example  3 you  can  see  that  the  Fourier  cosine  series  represents  u(t)  — E/tt  — \E  sin  a>t.  Can  you  prove 
that  this  is  an  even  function? 

Further  simplifications  result  from  the  following  property,  whose  very  simple  proof  is  left 
to  the  student. 


Sum  and  Scalar  Multiple 

The  Fourier  coefficients  of  a sumfi  + f%  are  the  sums  of  the  corresponding  Fourier 
coefficients  offi  and  fz- 

The  Fourier  coefficients  of  cf  are  c times  the  corresponding  Fourier  coefficients  off. 


Sawtooth  Wave 

Find  the  Fourier  series  of  the  function  (Fig.  268) 

fix)  = x + tt  if  —tt<x<tt  and  f{x  + 277)  = fix). 


Fig.  268.  The  function  f[x).  Sawtooth  wave 
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Solution.  We  have/  = + /2,  where /i  = x and/2  = 77  ■ The  Fourier  coefficients  of/2  are  zero,  except  for 

the  first  one  (the  constant  term),  which  is  77.  Hence,  by  Theorem  1,  the  Fourier  coefficients  an,  bn  are  those  of 
/1,  except  for  ciq,  which  is  77.  Since /1  is  odd,  cin  = 0 for  n = 1,  2,  • • • , and 

2 fn  . 2 r 

Z?n  = — /l  (x)  sin  nxdx  = — x sin  nx  dx. 

^0  ^0 


Integrating  by  parts,  we  obtain 


Hence  b\  = 2,  Z?2  = 


—x  cos  nx 
n 


cos  nx  dx 


= COS  ft 77. 

ft 


~ I,  ^3  = §,  ^4  = — ' • • , and  the  Fourier  series  of  /( x)  is 


/ • 1 1 

fix)  = 77  + 2 sin  x sin  2x  H — sin  3x  — + ■ 

' 2 3 


(Fig.  269)  ■ 


3.  Half-Range  Expansions 

Half-range  expansions  are  Fourier  series.  The  idea  is  simple  and  useful.  Figure  270 
explains  it.  We  want  to  represent /(x)  in  Fig.  270.0  by  a Fourier  series,  where /(x) 
may  be  the  shape  of  a distorted  violin  string  or  the  temperature  in  a metal  bar  of  length 
L,  for  example.  (Corresponding  problems  will  be  discussed  in  Chap.  12.)  Now  comes 
the  idea. 

We  could  extend /(x)  as  a function  of  period  L and  develop  the  extended  function  into 
a Fourier  series.  But  this  series  would,  in  general,  contain  both  cosine  and  sine  terms.  We 
can  do  better  and  get  simpler  series.  Indeed,  for  our  given  / we  can  calculate  Fourier 
coefficients  from  (6*)  or  from  (6**).  And  we  have  a choice  and  can  take  what  seems 
more  practical.  If  we  use  (6*),  we  get  (5*).  This  is  the  even  periodic  extension  of  / 
in  Fig.  270a.  If  we  choose  (6**)  instead,  we  get  (5**),  the  odd  periodic  extension  of 
/in  Fig.  270b. 

Both  extensions  have  period  2 L.  This  motivates  the  name  half-range  expansions: /is 
given  (and  of  physical  interest)  only  on  half  the  range,  that  is,  on  half  the  interval  of 
periodicity  of  length  2 L. 

Let  us  illustrate  these  ideas  with  an  example  that  we  shall  also  need  in  Chap.  12. 
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fix) 


X 


(0)  The  given  function  f(x) 


i 


i 


-L 


L 


x 


(a)  f(x ) continued  as  an  even  periodic  function  of  period  2 L 


(b)  fix)  continued  as  an  odd  periodic  function  of  period  2 L 

Fig.  270.  Even  and  odd  extensions  of  period  2 L 


“Triangle”  and  Its  Half-Range  Expansions 

^ L Find  the  two  half-range  expansions  of  the  function  (Fig.  271) 

^ i \ (2k 


LI  2 


L x 


Fig.  271.  The  given 
function  in  Example  6 


m = < 


if  0 < x < - 
L 2 

2k  L 

— (L  — x)  if  < x < L. 
L 2 


Solution,  (a)  Even  periodic  extension.  From  (6*)  we  obtain 


a o = 


2 

an  — ~ 


2k 


L/2 


2k 


— x dx  H 


iL  — x)  dx 


L/2 


2k 


L/2 


2k 


— x cos x dx  H 


L/2 


M T 

(L  — x)  cos  dx 


We  consider  an.  For  the  first  integral  we  obtain  by  integration  by  parts 


rL/2 

7777 

Lx 

7777 

L/2 

L f 

x cos  — x dx  = 

— 

sin  — x 

— 

— 

J L 

7777 

L 

0 

7777  J 

0 

0 

rL/2 


7777 

sin  x dx 

L 


L . 7777 

sin  


7777 

COS 1 

2 


Similarly,  for  the  second  integral  we  obtain 

rL 


L/2 


7777  L 7777 

(L  — x)  cos x dx  = (L  — x)  sin ; 

L 7777  L 


L/2 


L 

niT 


L/2 


7777 

sin  — x dx 
L 


= 0- 


L - 


COS  7777  — COS 
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We  insert  these  two  results  into  the  formula  for  an.  The  sine  terms  cancel  and  so  does  a factor  if.  This  gives 

4 k 


mr 

2 COS COS  7277  — 1 

2 


Thus, 


a2  = -16£/(22772),  a6  = — 16£/(62772),  «10  = — 1 6£/(l 02772),  ••  • 

and  an  = 0 if  n =£  2,  6,  10,  14,  • • • . Hence  the  first  half-range  expansion  of  f(x)  is  (Fig.  272a) 


fix)  = -- 


\6k  ( 1 27T 


77  \Z 


1 677 

— COS . 


This  Fourier  cosine  series  represents  the  even  periodic  extension  of  the  given  function  fix),  of  period  2 L. 
(b)  Odd  periodic  extension.  Similarly,  from  (6**)  we  obtain 


(5) 


&n. 


8 k 


Hence  the  other  half-range  expansion  of  f(x)  is  (Fig.  272b) 

8k  ( 1 . 77  1 377  1 . 577 

fix)  = — — sin  — x sin x -\ sin  x — 

77-2Vi2  L 32  L 5 2 L 

The  series  represents  the  odd  periodic  extension  of  fix),  of  period  2 L. 

Basic  applications  of  these  results  will  be  shown  in  Secs.  12.3  and  12.5. 


0 L 

(a)  Even  extension 


\ 

(b)  Odd  extension 

Fig.  272.  Periodic  extensions  of  f[x)  in  Example  6 


1-7 


EVEN  AND  ODD  FUNCTIONS 


Are  the  following  functions  even  or  odd  or  neither  even  nor 
odd? 


1.  ex,  e~^x[  x3  cos  nx,  x2  tan  7J%  sinhjc  — cosh.r 

2.  sin2.r,  sin  (.r2),  lnx,  x/(x2  + 1),  xcotx 

3.  Sums  and  products  of  even  functions 

4.  Sums  and  products  of  odd  functions 

5.  Absolute  values  of  odd  functions 

6.  Product  of  an  odd  times  an  even  function 

7.  Find  all  functions  that  are  both  even  and  odd. 


8-17 


FOURIER  SERIES  FOR  PERIOD  p = 2L 


Is  the  given  function  even  or  odd  or  neither  even  nor 
odd?  Find  its  Fourier  series.  Show  details  of  your 
work. 


8. 


-1 


o 


l 
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9. 


1 

1 

1 

-2 

2 

--1 

11.  /( x)  = xz  (—  1 < x < 1),  p — 2 

12.  fix)  = 1 - x2/4  (-2  < x < 2),  p = 4 


18.  Rectifier.  Find  the  Fourier  series  of  the  function 
obtained  by  passing  the  voltage  v(t)  = Vo  cos  1007T? 
through  a half-wave  rectifier  that  clips  the  negative 
half-waves. 

19.  Trigonometric  Identities.  Show  that  the  familiar 
identities  cos3  x = | cos  x + | cos  3x  and  sin3  x = | 
sin  x — 5 sin  3x  can  be  interpreted  as  Fourier  series 
expansions.  Develop  cos4x. 

20.  Numeric  Values.  Using  Prob.  11,  show  that  1 + 4 + 

21.  CAS  PROJECT.  Fourier  Series  of  2L-Periodic 
Functions,  (a)  Write  a program  for  obtaining  partial 
sums  of  a Fourier  series  (5). 


(b)  Apply  the  program  to  Probs.  8-11,  graphing  the  first 
few  partial  sums  of  each  of  the  four  series  on  common 
axes.  Choose  the  first  five  or  more  partial  sums  until 
they  approximate  the  given  function  reasonably  well. 
Compare  and  comment. 

22.  Obtain  the  Fourier  series  in  Prob.  8 from  that  in 
Prob.  17. 


23-29 


HALF-RANGE  EXPANSIONS 


Find  (a)  the  Fourier  cosine  series,  (b)  the  Fourier  sine  series. 
Sketch  f(x)  and  its  two  periodic  extensions.  Show  the 
details. 

23. 


24. 


1 - 


25. 


26. 


29.  f(x)  = sin  x (0  < x < 77) 

30.  Obtain  the  solution  to  Prob.  26  from  that  of 
Prob.  27. 
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13  Forced  Oscillations 


Fourier  series  have  important  applications  for  both  ODEs  and  PDEs.  In  this  section  we 
shall  focus  on  ODEs  and  cover  similar  applications  for  PDEs  in  Chap.  12.  All  these 
applications  will  show  our  indebtedness  to  Euler’s  and  Fourier’s  ingenious  idea  of  splitting 
up  periodic  functions  into  the  simplest  ones  possible. 

From  Sec.  2.8  we  know  that  forced  oscillations  of  a body  of  mass  m on  a spring  of 
modulus  k are  governed  by  the  ODE 

(1)  my"  + cy'  + ky  = r(t ) 

where  y = y (t)  is  the  displacement  from  rest,  c the  damping  constant,  k the  spring  constant 
(spring  modulus),  and  r(t)  the  external  force  depending  on  time  t.  Figure  274  shows  the 
model  and  Fig.  275  its  electrical  analog,  an  WLC-circuit  governed  by 

(1*)  LI"  + Rl'  + ~I  = E'  (0  (Sec.  2.9). 

We  consider  (1).  If  r(t)  is  a sine  or  cosine  function  and  if  there  is  damping  (c  > 0), 
then  the  steady-state  solution  is  a harmonic  oscillation  with  frequency  equal  to  that  of  r(t). 
However,  if  r(t)  is  not  a pure  sine  or  cosine  function  but  is  any  other  periodic  function, 
then  the  steady-state  solution  will  be  a superposition  of  harmonic  oscillations  with 
frequencies  equal  to  that  of  r(t)  and  integer  multiples  of  these  frequencies.  And  if  one  of 
these  frequencies  is  close  to  the  (practical)  resonant  frequency  of  the  vibrating  system  (see 
Sec.  2.8),  then  the  corresponding  oscillation  may  be  the  dominant  part  of  the  response  of 
the  system  to  the  external  force.  This  is  what  the  use  of  Fourier  series  will  show  us.  Of 
course,  this  is  quite  surprising  to  an  observer  unfamiliar  with  Fourier  series,  which  are 
highly  important  in  the  study  of  vibrating  systems  and  resonance.  Let  us  discuss  the  entire 
situation  in  terms  of  a typical  example. 


Fig.  274.  Vibrating  system 

under  consideration 


Fig.  275.  Electrical  analog  of  the  system 
in  Fig.  274  (RLC-circuit) 


Forced  Oscillations  under  a Nonsinusoidal  Periodic  Driving  Force 

In  (1),  let  m = 1 (g),  c = 0.05  (g/sec),  and  k = 25  (g/sec2),  so  that  (1)  becomes 


(2) 


y"  + 0.05/  + 25v  = r(f) 


SEC.  11.3  Forced  Oscillations 
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Fig.  276.  Force  in  Example  1 


where  r(f)  is  measured  in  g ■ cm/ sec2.  Let  (Fig.  276) 


r(t)  = 


7 7 

t + — 
2 


7T 

-1  + — 
2 


if  — 7T  < t < 0, 


r(t  + 277)  = r(f). 


if  0 < t < 77, 


Find  the  steady-state  solution  >'(/). 

Solution.  We  represent  r(f)  by  a Fourier  series,  finding 


(3) 


4 / 1 

r(t)  = — I cos  1 H — „ cos  3r  + 
77  V 32 


— ~ cos  5/  + • • ■ 
52 


Then  we  consider  the  ODE 

„ , 4 

(4)  y + 0.05;/  + 25y  = cos  nt  (n  = 1,  3,  • • • ) 

«277 

whose  right  side  is  a single  term  of  the  series  (3).  From  Sec.  2.8  we  know  that  the  steady-state  solution  yn(t) 
of  (4)  is  of  the  form 

(5)  yn  = An  cos  nt  + Bn  sin  nt. 

By  substituting  this  into  (4)  we  find  that 

4(25  - »2)  0.2  9 9 9 

(6)  An  = Bn  = , where  Dn  = (25  - n2)2  + (0.05»)2. 

nirDn  mrDn 

Since  the  ODE  (2)  is  linear,  we  may  expect  the  steady-state  solution  to  be 

(7)  y = yj  + y3  + y5  + ■ ■ • 

where  yn  is  given  by  (5)  and  (6).  In  fact,  this  follows  readily  by  substituting  (7)  into  (2)  and  using  the  Fourier 
series  of  r{t ),  provided  that  termwise  differentiation  of  (7)  is  permissible.  (Readers  already  familial*  with  the 
notion  of  uniform  convergence  [Sec.  15.5]  may  prove  that  (7)  may  be  differentiated  term  by  term.) 

From  (6)  we  find  that  the  amplitude  of  (5)  is  (a  factor  VZ\  cancels  out) 

Cn  = VA2n  + Bl  = 4 

77.  77" "v  Dn 

Values  of  the  first  few  amplitudes  are 

Ci  = 0.0531  C3  = 0.0088  C5  = 0.2037  C7  = 0.0011  C9  = 0.0003. 

Figure  277  shows  the  input  (multiplied  by  0.1)  and  the  output.  For  n = 5 the  quantity  Dn  is  very  small,  the 
denominator  of  C5  is  small,  and  C5  is  so  large  that  315  is  the  dominating  term  in  (7).  Hence  the  output  is  almost 
a harmonic  oscillation  of  five  times  the  frequency  of  the  driving  force,  a little  distorted  due  to  the  term  yi,  whose 
amplitude  is  about  25%  of  that  of  y$.  You  could  make  the  situation  still  more  extreme  by  decreasing  the  damping 
constant  c.  Try  it. 
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Fourier  Analysis 


PRO  B n M~  S E T ~n  i 


1.  Coefficients  Cn.  Derive  the  formula  for  Cn  from  An 
and  Bn. 

2.  Change  of  spring  and  damping.  In  Example  1,  what 
happens  to  the  amplitudes  Cn  if  we  take  a stiffer  spring, 
say,  of  k = 49?  If  we  increase  the  damping? 

3.  Phase  shift.  Explain  the  role  oftheflK’s.  What  happens 
if  we  let  c —*  0? 


4.  Differentiation  of  input.  In  Example  1,  what  happens 
if  we  replace  r (f)  with  its  derivative,  the  rectangular  wave? 
What  is  the  ratio  of  the  new  Cn  to  the  old  ones? 

5.  Sign  of  coefficients.  Some  of  the  An  in  Example  1 are 
positive,  some  negative.  All  Bn  are  positive.  Is  this 
physically  understandable? 


6-11 


GENERAL  SOLUTION 


Find  a general  solution  of  the  ODE  y"  + <u2y  = r(t ) with 
r(t)  as  given.  Show  the  details  of  your  work. 

6.  r{t)  = sin  at  + sin  /3f,  co2  + a2,  f32 

7.  r(t)  = sin  t,  co  = 0.5,  0.9,  1.1,  1.5,  10 


8.  Rectifier.  r{t)  = 7t/4  |cos  f|  if  —7 r < t < tt  and 
r(t  + 27t)  = r(f),  |cu|  A 0,  2,  4,  • • ■ 


9.  What  kind  of  solution  is  excluded  in  Prob.  8 by 
| co | =A  0,2,4,-  ■•? 

10.  Rectifier.  r{t)  = 7t/4  | sin  t\  if  0 < t < 277  and 
r(t  + 27 r)  = r(f),  |&)|  A 0,  2,  4,  • • ■ 


f—  1 if  —77  < t < 0 

11.  r(t)  = { |cu|  A 1,3,5, 

l 1 if  0 < t < 77, 


12.  CAS  Program.  Write  a program  for  solving  the  ODE 
just  considered  and  for  jointly  graphing  input  and  output 
of  an  initial  value  problem  involving  that  ODE.  Apply 


the  program  to  Probs.  7 and  1 1 with  initial  values  of  your 
choice. 


13-16 


STEADY-STATE  DAMPED  OSCILLATIONS 


Find  the  steady-state  oscillations  of  y"  + cy  + y = r{t) 
with  c > 0 and  r(t)  as  given.  Note  that  the  spring  constant 
is  k = 1.  Show  the  details.  In  Probs.  14-16  sketch  r{t). 

N 

13.  r(t)  = 2(an  cos  nt  + bn  sin  nt) 


n= 1 


f—  1 if  —77  < t<  0 

14.  r(t)  = s and  r(f  + 277)  = r(t) 

l 1 if  0 < 7 < 77 

15.  r{t)  = f(772  — t2)  if  —77  < t < 77  and 
r(t  + 277)  = r(f) 

16.  r(t ) = 

r t if  —77/2  < t < 77/2 

s and  r(t  + 277)  = r(t) 

1 77  — t if  77/2  < t < 377/2 


17-19 


RLC-  CIRCUIT 


Find  the  steady-state  current  I{t)  in  the  ??LC-circuit  in 
Fig.  275,  where  R = 10O,L=  1 H,  C = 10-1  F and  with 
E(t)  V as  follows  and  periodic  with  period  277.  Graph  or 
sketch  the  first  four  partial  sums.  Note  that  the  coefficients 
of  the  solution  decrease  rapidly.  Hint.  Remember  that  the 
ODE  contains  E'(t),  not  E(t),  cf.  Sec.  2.9. 

f — 50r2  if  —77  < t < 0 


17.  E{t)  = 


50f2  if 


0 < t < 77 


SEC.  11.4  Approximation  by  Trigonometric  Polynomials 
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18.  E(t)  = 


100  ( t - t2) 
100  ( t + t2) 


19.  E{t)  = 200f(7T2  - t2) 


if  —7 r < t < 0 

if  0 < t < 77 
(— 7T  < t < 7r) 


20.  CAS  EXPERIMENT.  Maximum  Output  Term. 

Graph  and  discuss  outputs  of  y"  + cy  + ky  = r(t)  with 
r(t)  as  in  Example  1 for  various  c and  k with  emphasis  on 
the  maximum  Cn  and  its  ratio  to  the  second  largest  \Cn\. 


11.4  Approximation 

by  Trigonometric  Polynomials 

Fourier  series  play  a prominent  role  not  only  in  differential  equations  but  also  in 
approximation  theory,  an  area  that  is  concerned  with  approximating  functions  by 
other  functions — usually  simpler  functions.  Here  is  how  Fourier  series  come  into  the 
picture. 

Let  f(x ) be  a function  on  the  interval  — 7T  iig  tt  that  can  be  represented  on  this 
interval  by  a Fourier  series.  Then  the  Mh  partial  sum  of  the  Fourier  series 

N 

( 1 ) f(x)  ~ flo  + 2 (an  cos  nx  + bn  sin  nx) 

n=  1 

is  an  approximation  of  the  given  fix).  In  (1)  we  choose  an  arbitrary  N and  keep  it  fixed. 
Then  we  ask  whether  (1)  is  the  “best”  approximation  of  / by  a trigonometric  polynomial 
of  the  same  degree  N,  that  is,  by  a function  of  the  form 

N 

(2)  F(x)  = Aq  + 2 ( An  cos  nx  + Bn  sin  nx)  ( N fixed). 

71=1 

Here,  “best”  means  that  the  “error”  of  the  approximation  is  as  small  as  possible. 

Of  course  we  must  first  define  what  we  mean  by  the  error  of  such  an  approximation. 
We  could  choose  the  maximum  of  \f(x)  — F(x)  |.  But  in  connection  with  Fourier  series 
it  is  better  to  choose  a definition  of  error  that  measures  the  goodness  of  agreement  between 
/ and  F on  the  whole  interval  —tt  x ^ tt.  This  is  preferable  since  the  sum/of  a Fourier 
series  may  have  jumps:  F in  Fig.  278  is  a good  overall  approximation  off,  but  the  maximum 
of  | f{x)  — F(x)|  (more  precisely,  the  supremum)  is  large.  We  choose 


(3) 


E = 


7 T 

(/-  Ffdx. 

— 77" 


Fig.  278.  Error  of  approximation 
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This  is  called  the  square  error  of  F relative  to  the  function/on  the  interval  — 77  2=  x = tt. 
Clearly,  £§0. 

N being  fixed,  we  want  to  determine  the  coefficients  in  (2)  such  that  E is  minimum. 
Since  (/  — F)z  = f2  — 2 fF  + F2,  we  have 


(4) 


E = 


fzdx  - 2 


fFdx  + 


Fzdx. 


We  square  (2),  insert  it  into  the  last  integral  in  (4),  and  evaluate  the  occurring  integrals. 
This  gives  integrals  of  cos 2 nx  and  sin2  nx  (n  =g  1),  which  equal  7 r,  and  integrals  of 
cos  nx,  sin  nx,  and  (cos  «x)(sin  mi),  which  are  zero  (just  as  in  Sec.  11.1).  Thus 


7 T 

C 77 

r N 1 

F2  dx  = 

A0  + ^ ( An  cos  nx  + Bn  sin  nx) 

— 7T 

— 7 T 

n=  1 

— 77(2Aq  + A 2 + • ■ • + An  + B 2 + ■ ■ • + Bn). 


We  now  insert  (2)  into  the  integral  of  fF  in  (4).  This  gives  integrals  of /cos  nx  as  well 
as /sin  nx,  just  as  in  Euler’s  formulas,  Sec.  11.1,  for  an  and  bn  (each  multiplied  by  An  or 
Bn).  Hence 


fFdx  — 77(2Aoflo  4 A]flq  + ■ ■ • + Ajyfljv  T B\b\  + ■ • ■ + B^n)- 

T 

With  these  expressions,  (4)  becomes 


E = 


fz  dx  - 27 T 


(5) 


N 


^Aqciq  (Anan  + Bnbn) 


n= 1 


+ 77 


N 


2 A20  + ^ (A2  + B l) 

n= 1 


We  now  take  An  = an  and  Bn  = bn  in  (2).  Then  in  (5)  the  second  line  cancels  half  of  the 
integral-free  expression  in  the  first  line.  Hence  for  this  choice  of  the  coefficients  of  F the 
square  error,  call  it  E*,  is 


(6) 


E* 


7T 

f2dx-TT 

— IT 


N 1 
2«0  + 2 (an  + b2)  . 

n= 1 


We  finally  subtract  (6)  from  (5).  Then  the  integrals  drop  out  and  we  get  terms 
A2  — 2 Anan  + = ( An  — an)2  and  similar  terms  (Bn  — bn)2: 

f N 

E - E*  = 77 1 2(A0  - fl0)2  + 2 Wn  - dnf  + (Bn  ~ bnf  ] 

^ n=l 

Since  the  sum  of  squares  of  real  numbers  on  the  right  cannot  be  negative, 

E — E*  0,  thus  E g E*, 

and  E = E*  if  and  only  if  A0  = a0,  ■ ■ ■ , BN  = bN.  This  proves  the  following  fundamental 
minimum  property  of  the  partial  sums  of  Fourier  series. 


SEC.  11.4  Approximation  by  Trigonometric  Polynomials 
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THEOREM 


Minimum  Square  Error 

The  square  error  ofF  in  (2)  (with  fixed  N)  relative  tofon  the  interval  — 77  Si  x 77 
is  minimum  if  and  only  if  the  coefficients  of  F in  (2)  are  the  Fourier  coefficients  off. 
This  minimum  value  E*  is  given  by  (6). 


From  (6)  we  see  that  E*  cannot  increase  as  N increases,  but  may  decrease.  Hence  with 
increasing  N the  partial  sums  of  the  Fourier  series  of  f yield  better  and  better  approxi- 
mations to  f considered  from  the  viewpoint  of  the  square  error. 

Since  E*  is  0 and  (6)  holds  for  every  N,  we  obtain  from  (6)  the  important  Bessel’s 
inequality 


(7) 


2«0  + 2 (°n  + bn) 

n= 1 


77 


7 T 

f(x)z  dx 

— 7 T 


for  the  Fourier  coefficients  of  any  function /for  which  integral  on  the  right  exists.  (For 
F.  W.  Bessel  see  Sec.  5.5.) 

It  can  be  shown  (see  [C12]  in  App.  1)  that  for  such  a function/,  Parseval’s  theorem  holds; 
that  is,  formula  (7)  holds  with  the  equality  sign,  so  that  it  becomes  Parseval’s  identity3 


(8) 


2«0  + 2 + bn) 

n= 1 


77 


7 T 

f(x)z  dx. 

— 77 


E X A M P L Minimum  Square  Error  for  the  Sawtooth  Wave 

Compute  the  minimum  square  error  E*  of  Fix)  with  N = 1,  2,  ■ ■ ■ , 10,  20,  • • • , 100  and  1000  relative  to 

f(x)  = x + tt  {—tt  < x < tt) 


on  the  interval  -tr  S r S Tt. 

Solution.  Fix)  = 7 t + 2 (sin  x 
Sec.  11.3.  From  this  and  (6), 


1 1 (-1)N+1 

— sin  2x  H — sin  3a  — + • • ■ H sin  Nx)  by  Example  3 in 

2 3 N 


E* 


(a  + tt)1 2  dx  - 7T  I 2t t2  + 4 ^ 


=1  n 


Numeric  values  are: 


x 


Fig.  279.  F with 
N = 20  in  Example  1 


N 

E* 

N 

E* 

N 

E* 

N 

E * 

1 

8.1045 

6 

1.9295 

20 

0.6129 

70 

0.1782 

2 

4.9629 

7 

1.6730 

30 

0.4120 

80 

0.1561 

3 

3.5666 

8 

1.4767 

40 

0.3103 

90 

0.1389 

4 

2.7812 

9 

1.3216 

50 

0.2488 

100 

0.1250 

5 

2.2786 

10 

1.1959 

60 

0.2077 

1000 

0.0126 

3MARC  ANTOINE  PARSEVAL  (1755-1836),  French  mathematician.  A physical  interpretation  of  the  identity 
follows  in  the  next  section. 
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F — Si,  S2,  S3  are  shown  in  Fig.  269  in  Sec.  11.2,  and  F = S20  is  shown  in  Fig.  279.  Although  | f(x)  — F(x)| 
is  large  at  ±77  (how  large?),  where /is  discontinuous,  F approximates /quite  well  on  the  whole  interval,  except 
near*  ±77,  where  “waves”  remain  owing  to  the  “Gibbs  phenomenon,”  which  we  shall  discuss  in  the  next  section. 
Can  you  think  of  functions  / for  which  F*  decreases  more  quickly  with  increasing  A? 


FR-Q'B^y^SET^H~4 


1.  CAS  Problem.  Do  the  numeric  and  graphic  work  in 
Example  1 in  the  text. 


2-5 


MINIMUM  SQUARE  ERROR 


Find  the  trigonometric  polynomial  F(x)  of  the  form  (2)  for 
which  the  square  error  with  respect  to  the  given /(x)  on  the 
interval  — 77  < x < 77  is  minimum.  Compute  the  minimum 
value  for  N = 1,  2,  ■ ■ ■ , 5 (or  also  for  larger  values  if  you 
have  a CAS). 


2.  f(x)  = x ( — 77  < x < 77) 

3.  /(x)  = \x\  ( — 77  < X < 77) 

4.  f{x)  = X2  ( — 77  < X < 77) 

( — 1 if  —77  < X < 0 

5.  fix)  = \ 

l 1 if  0 < X < 77 


6.  Why  are  the  square  errors  in  Prob.  5 substantially  larger 
than  in  Prob.  3? 

7.  f{x)  = xa  ( — 77  < x < 77) 

8.  fix)  = |sinx|  ( — 77  < x < 77),  full-wave  rectifier 

9.  Monotonicity.  Show  that  the  minimum  square  error 
(6)  is  a monotone  decreasing  function  of  N.  How  can 
you  use  this  in  practice? 

10.  CAS  EXPERIMENT.  Size  and  Decrease  of  E*. 

Compare  the  size  of  the  minimum  square  error  E*  for 
functions  of  your  choice.  Find  experimentally  the 


factors  on  which  the  decrease  of  E*  with  N depends. 
For  each  function  considered  find  the  smallest  N such 
that  E*  < 0. 1 . 


11-15 


PARSEVALS’S  IDENTITY 


Using  (8),  prove  that  the  series  has  the  indicated  sum. 
Compute  the  first  few  partial  sums  to  see  that  the  convergence 
is  rapid. 


1 1 77z 

11.  1 + — + — + ■■■  = — = 1.233700550 

32  52  8 

Use  Example  1 in  Sec.  11.1. 

1 1 774 

12.  1 + — + — + ■■•  = — = 1.082323234 

24  34  90 

Use  Prob.  14  in  Sec.  11.1. 


, 1 1 1 _ 774 

13.  1 H H H — + ■■■  — 

34  54  74  96 

Use  Prob.  17  in  Sec.  11.1. 

377 


14. 


15. 


cos4  x dx  = 


cos 6 x dx  = 


4 

577 


1.014678032 


11.5  Sturm-Liouville  Problems. 

Orthogonal  Functions 

The  idea  of  the  Fourier  series  was  to  represent  general  periodic  functions  in  terms  of 
cosines  and  sines.  The  latter  formed  a trigonometric  system.  This  trigonometric  system 
has  the  desirable  property  of  orthogonality  which  allows  us  to  compute  the  coefficient  of 
the  Fourier  series  by  the  Euler  formulas. 

The  question  then  arises,  can  this  approach  be  generalized?  That  is,  can  we  replace  the 
trigonometric  system  of  Sec.  11.1  by  other  orthogonal  systems  ( sets  of  other  orthogonal 
functions )?  The  answer  is  “yes”  and  will  lead  to  generalized  Fourier  series,  including  the 
Fourier-Legendre  series  and  the  Fourier-Bessel  series  in  Sec.  11.6. 

To  prepare  for  this  generalization,  we  first  have  to  introduce  the  concept  of  a Sturm- 
Liouville  Problem.  (The  motivation  for  this  approach  will  become  clear  as  you  read  on.) 
Consider  a second-order  ODE  of  the  form 


SEC.  11.5  Sturm-Liouville  Problems.  Orthogonal  Functions 
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EXAMPLE  1 


(1)  I >(*)/]'  + [qix)  + Ar(x)]v  = 0 
on  some  interval  a Si  x b,  satisfying  conditions  of  the  form 

(a)  k\y  + k2y'  =0  at  x = a 

(2) 

(b)  l\y  + l2y  =0  at  x = b. 

Here  A is  a parameter,  and  k\,  k2,  l\,  l2  are  given  real  constants.  Furthermore,  at  least  one 
of  each  constant  in  each  condition  (2)  must  be  different  from  zero.  (We  will  see  in  Example 
1 that,  if  p (x)  = r(x)  = 1 and  q(x)  = 0,  then  sin  VAv  and  cos  VXr  satisfy  (1)  and  constants 
can  be  found  to  satisfy  (2).)  Equation  (1)  is  known  as  a Sturm-Liouville  equation.4 
Together  with  conditions  2(a),  2(b)  it  is  know  as  the  Sturm-Liouville  problem.  It  is  an 
example  of  a boundary  value  problem. 

A boundary  value  problem  consists  of  an  ODE  and  given  boundary  conditions 
referring  to  the  two  boundary  points  (endpoints)  x = a and  x = b of  a given  interval 
a Ss  x Si  b. 

The  goal  is  to  solve  these  type  of  problems.  To  do  so,  we  have  to  consider 

Eigenvalues,  Eigenfunctions 

Clearly,  y = 0 is  a solution — the  “trivial  solution” — of  the  problem  (1),  (2)  for  any  A 
because  (1)  is  homogeneous  and  (2)  has  zeros  on  the  right.  This  is  of  no  interest.  We  want 
to  find  eigenfunctions  y (x),  that  is,  solutions  of  (1)  satisfying  (2)  without  being  identically 
zero.  We  call  a number  A for  which  an  eigenfunction  exists  an  eigenvalue  of  the  Sturm- 
Liouville  problem  (1),  (2). 

Many  important  ODEs  in  engineering  can  be  written  as  Sturm-Liouville  equations.  The 
following  example  serves  as  a case  in  point. 

Trigonometric  Functions  as  Eigenfunctions.  Vibrating  String 

Find  the  eigenvalues  and  eigenfunctions  of  the  Sturm-Liouville  problem 
(3)  / + Ay  = 0,  y( 0)  - 0,  y(TT ) = 0. 

This  problem  arises,  for  instance,  if  an  elastic  string  (a  violin  string,  for  example)  is  stretched  a little  and  fixed 
at  its  ends  x = 0 and  x = v and  then  allowed  to  vibrate.  Then  y(x)  is  the  “space  function”  of  the  deflection 
u ( x , t)  of  the  string,  assumed  in  the  form  u ( x , t)  = y (x)w  (t),  where  t is  time.  (This  model  will  be  discussed  in 
great  detail  in  Secs,  12.2-12.4.) 

Solution.  From  (1)  nad  (2)  we  see  that  p = 1,  q = 0,  r = 1 in  (1),  and  a = 0,  b = 77,  ki  = Zi  = 1, 
^2  = ^2  = 0 in  (2).  For  negative  A = — v2  a general  solution  of  the  ODE  in  (3)  is  y(x)  = C\evX  + c^e~vX . From 
the  boundary  conditions  we  obtain  Ci  — c%  — 0,  so  that  y = 0,  which  is  not  an  eigenfunction.  For  A = 0 the 
situation  is  similar.  For  positive  A = v2  a general  solution  is 

y(x)  = A cos  vx  + B sin  vx. 


4JACQUES  CHARLES  FRANCOIS  STURM  (1803-1855)  was  born  and  studied  in  Switzerland  and  then 
moved  to  Paris,  where  he  later  became  the  successor  of  Poisson  in  the  chair  of  mechanics  at  the  Sorbonne  (the 
University  of  Paris). 

JOSEPH  LIOUVILLE  (1809-1882),  French  mathematician  and  professor  in  Paris,  contributed  to  various 
fields  in  mathematics  and  is  particularly  known  by  his  important  work  in  complex  analysis  (Liouville’s  theorem; 
Sec.  14.4),  special  functions,  differential  geometry,  and  number  theory. 
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From  the  first  boundary  condition  we  obtain  y (0)  = A = 0.  The  second  boundary  condition  then  yields 

y(n)  = B sin  vir  = 0,  thus  v = 0,  ± 1,  ± 2,  • • • . 

For  v = 0 we  have  y = 0.  For  A = v2  = 1,  4,  9,  16,  ■ ■ ■ . taking  B = 1,  we  obtain 

y(x)  = sin  vx  (v  = \/a  = 1,  2,  ■ • • ). 

Hence  the  eigenvalues  of  the  problem  are  A = v2,  where  v = 1 , 2,  ■ ■ ■ , and  corresponding  eigenfunctions  are 
y(x)  = sin  vx,  where v = 1,  2 ■ ■ ■ . 

Note  that  the  solution  to  this  problem  is  precisely  the  trigonometric  system  of  the  Fourier 
series  considered  earlier.  It  can  be  shown  that,  under  rather  general  conditions  on  the 
functions  p,  q,  r in  (1),  the  Sturm-Liouville  problem  (1),  (2)  has  infinitely  many  eigenvalues. 
The  corresponding  rather  complicated  theory  can  be  found  in  Ref.  [All]  listed  in  App.  1. 

Furthermore,  if  p,  q,  r,  and  p in  (1)  are  real-valued  and  continuous  on  the  interval 
a Si  x Si  b and  r is  positive  throughout  that  interval  (or  negative  throughout  that  interval), 
then  all  the  eigenvalues  of  the  Sturm-Liouville  problem  (1),  (2)  are  real.  (Proof  in  App.  4.) 
This  is  what  the  engineer  would  expect  since  eigenvalues  are  often  related  to  frequencies, 
energies,  or  other  physical  quantities  that  must  be  real. 

The  most  remarkable  and  important  property  of  eigenfunctions  of  Sturm-Liouville 
problems  is  their  orthogonality , which  will  be  crucial  in  series  developments  in  terms  of 
eigenfunctions,  as  we  shall  see  in  the  next  section.  This  suggests  that  we  should  next 
consider  orthogonal  functions. 

Orthogonal  Functions 

Functions  y\(x),  y2 (x),  • • • defined  on  some  interval a Si  x Si  h arc  called  orthogonal  on  this 
interval  with  respect  to  the  weight  function  r (x)  > 0 if  for  all  m and  all  n different  from  in, 


(4) 


( y'rnx  Vn ) 


,b 

r(x)ym(x)yn(x)  dx  = 0 


(m  A n). 


(ym,  yn)  is  a standard  notation  for  this  integral.  The  norm  | y.m||  of  ym  is  defined  by 


(5) 


llymll  = V(yTO,ym)  = r(x)y^l(x)  dx. 


Note  that  this  is  the  square  root  of  the  integral  in  (4)  with  n = m. 

The  functions  Vi,  >’2,  • ■ ■ are  called  orthonormal  011  a ^ x Li  h if  they  are  orthogonal 
on  this  interval  and  all  have  norm  1.  Then  we  can  write  (4),  (5)  jointly  by  using  the 

Kronecker  symbol5  Smn,  namely, 


( ym,yn ) 


b 

r(x)ym(x)yn(x)  dx  = 8, 


( 0 if  m i=-  n 

1 1 if  m = n. 


5LEOPOLD  KRONECKER  (1823-1891).  German  mathematician  at  Berlin  University,  who  made  important 
contributions  to  algebra,  group  theory,  and  number  theory. 
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EXAMPLE  2 


THEOREM  1 


If  r (x)  = 1,  we  more  briefly  call  the  functions  orthogonal  instead  of  orthogonal  with 
respect  to  r(x)  = 1;  similarly  for  orthognormality.  Then 


(jm;  J'Y/j 


b 

ym(x)yn(x)  dx  = 0 (m  + ri), 


ll^mll  "V^ (ym>  yn) 


ym(x)  dx. 


The  next  example  serves  as  an  illustration  of  the  material  on  orthogonal  functions  just 
discussed. 


Orthogonal  Functions.  Orthonormal  Functions.  Notation 


The  functions  ym(x)  = sin  mx,  m = 1,  2,  • • • form  an  orthogonal  set  on  the  interval  because  for 

m =£  n we  obtain  by  integration  [see  (11)  in  App.  A3.1] 


(ym,  yn)  = sin  mx  sin  nx  dx 


cos  (m  — ri)x  dx  ~ 


1 

2 


7 T 

cos  (m  + n) x dx  = 0, 

-7 T 


(m  j=  n). 


The  norm  ||  ym ||  = V(ym,  ym)  equals  Vtt  because 

bmf  = (ym,ym)  = J sin  ■'■mxdx  = tt 
Hence  the  coiTesponding  orthonormal  set,  obtained  by  division  by  the  norm,  is 


sin  x sin  2x  sin  3x 
\Tl T V77 " y/lT 


(m  = 


1>  2,  • • • ) 


Theorem  1 shows  that  for  any  Sturm-Liouville  problem,  the  eigenfunctions  associated  with 
these  problems  are  orthogonal.  This  means,  in  practice,  if  we  can  formulate  a problem  as  a 
Sturm-Liouville  problem,  then  by  this  theorem  we  are  guaranteed  orthogonality. 


Orthogonality  of  Eigenfunctions  of  Sturm-Liouville  Problems 

Suppose  that  the  functions  p,  q,  r,  and  p'  in  the  Sturm-Liouville  equation  (1)  are 
real-valued  and  continuous  and  r(x)  > 0 on  the  inten’al  a = x = b.  Let  ym{x)  and 
yn(x)  be  eigenfunctions  of  the  Sturm-Liouville  problem  (1),  (2)  that  correspond  to 
different  eigenvalues  Am  and  \n,  respectively.  Then  ym,  yn  are  orthogonal  on  that 
interval  with  respect  to  the  weight  function  r,  that  is. 


(6) 


}'n) 


,b 

r(x)ym(x)yn(x)  dx  = 0 


( m n ). 


If  p (a)  = 0,  then  (2a)  can  be  dropped  from  the  problem.  If  p(b)  = 0,  then  (2b) 
can  be  dropped.  [It  is  then  required  that  y and  v remain  bounded  at  such  a point, 
and  the  problem  is  called  singular,  as  opposed  to  a regular  problem  in  which  (2) 
is  used.] 

If  p(a)  = p{b),  then  (2)  can  be  replaced  by  the  “periodic  boundary  conditions” 
(7)  y(a)  = y(b),  y'(a)  = y'(b). 


The  boundary  value  problem  consisting  of  the  Sturm-Liouville  equation  (1)  and  the  periodic 
boundary  conditions  (7)  is  called  a periodic  Sturm-Liouville  problem. 
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PROOF 


EXAMPLE  3 


By  assumption,  ym  and  yn  satisfy  the  Sturm-Liouville  equations 

(py'm)'  + (q  + a mr)ym  = 0 
ipy'n)'  + (<7  + A nr)yn  = 0 


respectively.  We  multiply  the  first  equation  by  yn,  the  second  by  — ym,  and  add, 
(Am  - A n)iymyn  = ymipy'n)'  ~ VnC/^m)'  = [( FAn^m  ~ Kpy'rJynY 


where  the  last  equality  can  be  readily  verified  by  performing  the  indicated  differentiation 
of  the  last  expression  in  brackets.  This  expression  is  continuous  on  a x b since  p and 
p'  are  continuous  by  assumption  and  ym,  yn  are  solutions  of  (1).  Integrating  over  x from 
a to  b,  we  thus  obtain 


(8) 


(Am  A n) 


rymyn  dx  = [ p(y'nym  - yrmyn)]ba 


(i a < b). 


The  expression  on  the  right  equals  the  sum  of  the  subsequent  Lines  1 and  2, 

^ p(b)[y'n(b)ym(b)  - ym(b)yn(b)]  (Line  1) 

- p(a)[y'n(a)ym(a ) - y'm(a)yn(a)]  (Line  2). 


Hence  if  (9)  is  zero,  (8)  with  Am  — A„  A 0 implies  the  orthogonality  (6).  Accordingly, 
we  have  to  show  that  (9)  is  zero,  using  the  boundary  conditions  (2)  as  needed. 

Case  1.  p(a ) = p (b)  = 0.  Clearly,  (9)  is  zero,  and  (2)  is  not  needed. 

Case  2.  p(a)  ^ 0 ,p(b)  = 0.  Line  1 of  (9)  is  zero.  Consider  Line  2.  From  (2a)  we  have 

k\  yn(a)  + k2yn(a)  = 0, 

kiym(a)  + k2y'm(a)  = 0. 


Let  k2  A 0.  We  multiply  the  first  equation  by  ym(a ),  the  last  by  —yn(a)  and  add, 

ki [y'n(a)ym(a)  - y'm(a)yn(aj\  = 0. 

This  is  k2  times  Line  2 of  (9),  which  thus  is  zero  since  k2  A 0.  If  k2  = 0,  then  k±  A 0 
by  assumption,  and  the  argument  of  proof  is  similar. 

Case  3.  p(a)  = 0 ,p(b)  ^ 0.  Line  2 of  (9)  is  zero.  From  (2b)  it  follows  that  Line  1 of  (9) 
is  zero;  this  is  similar  to  Case  2. 

Case  4.  p(a)  4=  0 ,p(b)  4=  0.  We  use  both  (2a)  and  (2b)  and  proceed  as  in  Cases  2 and  3. 
Case  5.  p(a)  = p(b).  Then  (9)  becomes 

P(b)[yn(b)ym(b)  - yL(b)yn(b)  - y'n{a)ym(a)  + }4(a).yn(a)]. 

The  expression  in  brackets  [ ■ • • ] is  zero,  either  by  (2)  used  as  before,  or  more  directly  by 
(7).  Hence  in  this  case,  (7)  can  be  used  instead  of  (2),  as  claimed.  This  completes  the 
proof  of  Theorem  1 . 


Application  of  Theorem  1.  Vibrating  String 

The  ODE  in  Example  1 is  a Sturm-Liouville  equation  with  p = 1 , q = 0,  and  r = 1 . From  Theorem  1 it  follows 
that  the  eigenfunctions  ym  = sin  mx  (m  = 1 , 2,  • • • ) are  orthogonal  on  the  interval  0 = x ^ 77. 
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Example  3 confirms,  from  this  new  perspective,  that  the  trigonometric  system  underlying 
the  Fourier  series  is  orthogonal,  as  we  knew  from  Sec.  11.1. 

EXAMPLE  4 Application  of  Theorem  1.  Orthogonlity  of  the  Legendre  Polynomials 

Legendre’s  equation  (1  — x2)y"  — 2 xy'  + n(n  + l)y  = 0 may  be  written 

[(1  - x2)y'Y  + Ay  = 0 A = n(n  + 1). 

Hence,  this  is  a Sturm-Liouville  equation  (1)  with p = 1 — x2,  q = 0,  and  r = 1.  Since  p(—  1)  = p{  1)  = 0,  we 
need  no  boundary  conditions,  but  have  a “ singular ” Sturm-Liouville  problem  on  the  interval  — 1 ^ ^ 1.  We 

know  that  for  n = 0,  1,  • • • , hence  A = 0,  1 • 2,  2 • 3,  • • • , the  Legendre  polynomials  Pn(x ) are  solutions  of  the 
problem.  Hence  these  are  the  eigenfunctions.  From  Theorem  1 it  follows  that  they  are  orthogonal  on  that  interval, 
that  is, 


(10)  Pm(x)Pn(x)  dx  = 0 (m  A n).  ■ 

■'  l 

What  we  have  seen  is  that  the  trigonometric  system,  underlying  the  Fourier  series,  is 
a solution  to  a Sturm-Liouville  problem,  as  shown  in  Example  1,  and  that  this 
trigonometric  system  is  orthogonal,  which  we  knew  from  Sec.  11.1  and  confirmed  in 
Example  3. 


1.  Proof  of  Theorem  1.  Carry  out  the  details  in  Cases  3 
and  4. 


2-6 


ORTHOGONALITY 


2.  Normalization  of  eigenfunctions  ym  of  (1),  (2)  means 
that  we  multiply  ym  by  a nonzero  constant  cm  such  that 
cmym  has  norm  1.  Show  that  zm  — cym  with  any  c ¥=  0 
is  an  eigenfunction  for  the  eigenvalue  corresponding 
to  ym. 

3.  Change  of  x.  Show  that  if  the  functions  y0  (jc),  >’i  (x),  ■ ■ • 
form  an  orthogonal  set  on  an  interval  a = x = b (with 
r(x)  = 1),  then  the  functions  y0(ct  + k),yi(ct  + k), 
■■■  ,c>  0,  form  an  orthogonal  set  on  the  interval 
( a - k)/c  S t S (b  — k)/c. 

4.  Change  of  x.  Using  Prob.  3,  derive  the  orthogonality 
of  1,  cos  7Tx,  sin  7rx,  cos  2 t rx,  sin  2irx,  ■ ■ ■ on 
— 1 S x S 1 (r(x)  =1)  from  that  of  1,  cos  x,  sin  x, 
cos  2x,  sin  2x,  ■ ■ • on  —tt  SxS  it. 


5.  Legendre  polynomials.  Show  that  the  functions 
P„(cos  6),  n = 0,  1,  ■ ■ ■ , from  an  orthogonal  set  on  the 
interval  0 = 6 S tt  with  respect  to  the  weight  function 
sin  0. 


6.  Tranformation  to  Sturm-Liouville  form.  Show  that 
y”  + fy’  + (g  + A h)  y = 0 takes  the  form  (1)  if  you 


set  p = exp  (Jfdx),  q = pg,  r = hp.  Why  would  you 
do  such  a transformation? 


7-15 


STURM-LIOUVILLE  PROBLEMS 


Find  the  eigenvalues  and  eigenfunctions.  Verify  orthogo- 
nality. Start  by  writing  the  ODE  in  the  form  (1),  using 
Prob.  6.  Show  details  of  your  work. 


7. 

tr 

y 

+ Ay  = 0, 

y(0)  = 

0,  y(10)  = 0 

8. 

n 

y 

+ Ay  = 0, 

y(0)  = 

0,  y(L)  = 0 

9. 

!f 

y 

+ Ay  = 0, 

y(0)  = 

: 0,  y'(L)  = 0 

10. 

tt 

y 

+ Ay  = 0, 

y(0)  = 

o 

II 

/(I) 

11. 

(V 

jx)f  + (A  A 

- LkA3 4 5 6 

II 

o 

V! 

II 

O 

yie77)  = 0. 

(Set  x = e\) 

12. 

tt 

y 

1 

< 

+ 

>■ 

+ l)y  = 

= 0,  y (0)  = 0, 

y(i)  = o 

13. 

tt 

y 

+ 

00 

+ 

> 

+ 16)y 

© 

II 

o 

o 

y(7T)  = 0 

14. 

TEAM  PROJECT.  Special  Functions. 

Orthogonal 

polynomials  play  a great  role  in  applications.  For 
this  reason,  Legendre  polynomials  and  various  other 
orthogonal  polynomials  have  been  studied  extensively; 
see  Refs.  [GenRefl],  [GenReflO]  in  App.  1.  Consider 
some  of  the  most  important  ones  as  follows. 
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(a)  Chebyshev  polynomials6  of  the  first  and  second 
kind  are  defined  by 

Tn  ( x ) = cos  ( n arccos  x ) 


sin  \{n  + 1 ) arccos  x] 


respectively,  where  n = 0,  1,  ■ ■ ■ . Show  that 

T0  = 1,  7i(x)  = x,  T2(x)  = 2x2  - 1. 

T3(x)  = 4x3  — 3x, 

U0  = 1,  U\(x)  = 2x,  U2(x)  = 4x2  - 1, 

U3(x)  = 8.x3  - 4x. 

Show  that  the  Chebyshev  polynomials  Tn(x)  are 
orthogonal  on  the  interval  — 1 S x £ 1 with  respect 
to  the  weight  function  r(x)  = \/\/ 1 — x2.  (Hint. 
To  evaluate  the  integral,  set  arccos  x = 6.)  Verify 


that  Tn(x),  n = 0,  1,  2,  3,  satisfy  the  Chebyshev 
equation 

(1  — x2)y”  — xy;  + n2y  = 0. 

(b)  Orthogonality  on  an  infinite  interval:  Laguerre 
polynomials7 *  are  defined  by  L0  = 1,  and 

ex  dn  (xne~x) 

Ln(x)  = — —7 , n = 1,  2,  ■ ■ • . 

n\  dx11 

Show  that 

Ln(x)  = 1 — x,  L2(x)  = 1 — 2x  + x2/2, 

L3(x)  = 1 - 3x  + 3x2/2  - x3/6. 

Prove  that  the  Laguerre  polynomials  are  orthogonal  on 
the  positive  axis  0 £ x < 00  with  respect  to  the  weight 
function  r(x)  = e~x.  Hint.  Since  the  highest  power  in 
Lm  is  xm,  it  suffices  to  show  that  Je~xxkLndx  = 0 
for  k < n.  Do  this  by  k integrations  by  parts. 


6 Orthogonal  Series. 

Generalized  Fourier  Series 


Fourier  series  are  made  up  of  the  trigonometric  system  (Sec.  11.1),  which  is  orthogonal, 
and  orthogonality  was  essential  in  obtaining  the  Euler  formulas  for  the  Fourier  coefficients. 
Orthogonality  will  also  give  us  coefficient  formulas  for  the  desired  generalized  Fourier 
series,  including  the  Fourier-Legendre  series  and  the  Fourier-Bessel  series.  This  gener- 
alization is  as  follows. 

Let  >’o,  >'i,  y2,  ■ • • be  orthogonal  with  respect  to  a weight  function  r (x)  on  an  interval 
a Si  x Si  b,  and  let  fix)  be  a function  that  can  be  represented  by  a convergent  series 

CXD 

(1)  f(x)  = 2 amym(x)  = a0y0(x)  + aij’i (x)  + • • • . 

m= 0 


This  is  called  an  orthogonal  series,  orthogonal  expansion,  or  generalized  Fourier  series. 

If  the  yr)l  are  the  eigenfunctions  of  a Sturm-Liouville  problem,  we  call  (1)  an  eigenfunction 
expansion.  In  (1)  we  use  again  m for  summation  since  n will  be  used  as  a fixed  order  of 
Bessel  functions. 

Given  /(x),  we  have  to  determine  the  coefficients  in  (1),  called  the  Fourier  constants 
of  fix)  with  respect  to  y0,  yi,  • • • . Because  of  the  orthogonality,  this  is  simple.  Similarly 
to  Sec.  11.1,  we  multiply  both  sides  of  (1)  by  r(x)yn(x)  infixed)  and  then  integrate  on 


6PAFNUTI  CHEBYSHEV  (1821-1894),  Russian  mathematician,  is  known  for  his  work  in  approximation 
theory  and  the  theory  of  numbers.  Another  transliteration  of  the  name  is  TCHEBICHEF. 

7EDMOND  LAGUERRE  (1834-1886),  French  mathematician,  who  did  research  work  in  geometry  and  in 

the  theory  of  infinite  series. 
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both  sides  from  a to  b.  We  assume  that  term-by-term  integration  is  permissible.  (This  is 
justified,  for  instance,  in  the  case  of  “uniform  convergence,”  as  is  shown  in  Sec.  15.5.) 
Then  we  obtain 


(f  yn) 


b 

rfyn  dx 


.Ym  I }:n  dx 


m=  o 


oo 


rb 


rymyndx  = 


m= 0 a 


yn)- 

m=o 


Because  of  the  orthogonality  all  the  integrals  on  the  right  are  zero,  except  when  m = n. 
Hence  the  whole  infinite  series  reduces  to  the  single  term 

an(ym  yn)  ~ ^nlbnll  ■ Thus  (f>yn)  ~ ^*n||  Trail  ■ 

Assuming  that  all  the  functions  yn  have  nonzero  norm,  we  can  divide  by  [|yn| 2:  writing  again 
m for  n,  to  be  in  agreement  with  (1),  we  get  the  desired  formula  for  the  Fourier  constants 


(2) 


am 


if  ym) 

IlyJI2 


rb 

r(x)f(x)ym(x)  dx 


(n  = 0, 


This  formula  generalizes  the  Euler  formulas  (6)  in  Sec.  11.1  as  well  as  the  principle  of 
their  derivation,  namely,  by  orthogonality. 

Fourier-Legendre  Series 

A Fourier-Legendre  series  is  an  eigenfunction  expansion 


fix)  = 2 ampm(x ) = a0P0  + aiAOO  + azP2(x)  + ■ • • = a0  + axx  + a2(zx2  — |)  + • ■ ■ 

m= 0 

in  terms  of  Legendre  polynomials  (Sec.  5.3).  The  latter  are  the  eigenfunctions  of  the  Sturm-Liouville  problem 
in  Example  4 of  Sec.  11.5  on  the  interval  —1  ^ x 1.  We  have  r(x)  = 1 for  Legendre’s  equation,  and  (2) 
gives 


(3) 


2m  + 1 
2 


f(x)Pm(x)  dx. 


m = 0,  1 , • • • 


because  the  norm  is 


(4) 


(m  = 0,  !,-••) 


as  we  state  without  proof.  The  proof  of  (4)  is  tricky;  it  uses  Rodrigues’s  formula  in  Problem  Set  5.2  and  a 
reduction  of  the  resulting  integral  to  a quotient  of  gamma  functions. 

For  instance,  let  f(x)  = sin  77A'.  Then  we  obtain  the  coefficients 


2m  + 1 


(sin  7rx)Pm(x)  dx,  thus  cq  = — I x sin  ttx  dx  = — = 0.95493. 
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Hence  the  Fourier-Legendre  series  of  sin  nx  is 

sin  7 tx  = 0.95493P,  (x)  - 1.15824P3(x)  + 0.21929P5(a-)  - 0.01664P7(x)  + 0.00068 A)  (x) 

- 0.00002PU (a)  + 

The  coefficient  of  l\ 3 is  about  3 • 10-7.  The  sum  of  the  first  three  nonzero  terms  gives  a curve  that  practically 
coincides  with  the  sine  curve.  Can  you  see  why  the  even-numbered  coefficients  are  zero?  Why  a3  is  the  absolutely 
biggest  coefficient? 


Fourier-Bessel  Series 

These  series  model  vibrating  membranes  (Sec.  12.9)  and  other  physical  systems  of  circular  symmetry.  We  derive 
these  series  in  three  steps. 

Step  1.  Bessel’s  equation  as  a Sturm-Liouville  equation.  The  Bessel  function  Jn(x)  with  fixed  integer  n & 0 
satisfies  Bessel’s  equation  (Sec.  5.5) 

x2Jn(x ) + xjn  (x)  + ( X 2 - n2)Jn(x)  = 0 

where  jn  = dJn/dx  and  ) n = d2Jn/dx2.  We  set  x = kx.  Then  x = x/Ic  and  by  the  chain  rule,  jn  = dJn/dx  = 
(dJn/dx)/k  and  Jn  = Jn/k2.  In  the  first  two  terms  of  Bessel’s  equation,  k2  and  k drop  out  and  we  obtain 

x2 Jn(kx)  + xj'n(kx ) + ( k2x 2 — n2)Jn(kx)  = 0. 

Dividing  by  x and  using  (xJlfkx))'  = xJ (kx)  + j'n  (kx)  gives  the  Sturm-Liouville  equation 


(5) 


[xJn(kx)Y  + 


Jn(kx)  = 0 


A = k2 


with  p(x)  = x,  q(x)  = ~n2/x , r(x)  = x,  and  parameter  A = k2.  Since  p(0)  = 0,  Theorem  1 in  Sec.  11.5 
implies  orthogonality  on  an  interval  0 ^ x ^ R (R  given,  fixed)  of  those  solutions  Jn(kx)  that  are  zero  at 
x = R,  that  is, 


(6)  Jn(kR)  = 0 (n  fixed). 

Note  that  q(x)  = —n2/x  is  discontinuous  at  0,  but  this  does  not  affect  the  proof  of  Theorem  1. 

Step  2.  Orthogonality.  It  can  be  shown  (see  Ref.  [A  13])  that  Jn(x)  has  infinitely  many  zeros,  say, 

x — 0n,i  < 0n,2  < ■“  (see  Fig.  110  in  Sec.  5.4  for  n = 0 and  1).  Hence  we  must  have 

(7)  kR  &n,m  thus  kn  rn  cxn  m/R  (in  1, 2,  • • • ). 

This  proves  the  following  orthogonality  property. 


Orthogonality  of  Bessel  Functions 

For  each  fixed  nonnegative  integer  n the  sequence  of  Bessel  functions  of  the  first 
kind  Jn(kn  ix ),  Jn(kn, 2*),  * * * with  kn  m as  in  (7)  forms  an  orthogonal  set  on  the 
interval  0 ^ x ^ R with  respect  to  the  weight  function  r(x)  = x , that  is, 


(8) 


rR 

xJn  (kn,mx)Jn(kn,jx)  dx  = 0 (j  =£  m,  n fixed). 

'0 


Hence  we  have  obtained  infinitely  many  orthogonal  sets  of  Bessel  functions,  one  for  each  of  Jq,  Ji,  J2, 
Each  set  is  orthogonal  on  an  interval  0 ^ x ^ R with  a fixed  positive  R of  our  choice  and  with  respect  to 
the  weight  x.  The  orthogonal  set  for  Jn  is  Jn(kn  ix),  Jn(kn,2x)^  JrSkn&t)* ' " > where  n is  fixed  and  kn  rn  is 
given  by  (7). 
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Step  3.  Fourier-Bessel  series.  The  Fouriei-Bessel  series  corresponding  to  Jn  (n  fixed)  is 

(9)  f(x)  = 2 am.Jn(kn,mX ) = a\Jn(kn,ix)  + a2Jn(kn2x)  + a3Jn(kn3x)  + ■■■  (n  fixed). 

m=  1 

The  coefficients  are  (with  anyn  = kn  mR ) 


(10) 


R J n+l(an,m)  ^0 


Xf(x)  Jn(kn,mx)  dx. 


m = 1,2, 


because  the  square  of  the  norm  is 


f ^ R 2 

(11)  xJn  {kn,mx)  dx  — J n+l(kn?inR) 

J0  2 

as  we  state  without  proof  (which  is  tricky;  see  the  discussion  beginning  on  p.  576  of  [A  13]). 

Special  Fourier-Bessel  Series 

For  instance,  let  us  consider /(x)  = 1 — x2  and  take  R = 1 and  n = 0 in  the  series  (9),  simply  writing  A for 
ao,m-  Then  kn  m = o;o,m  — A = 2.405,  5.520,  8.654,  1 1.792,  etc.  (use  a CAS  or  Table  A1  in  App.  5).  Next  we 
calculate  the  coefficients  am  by  (10) 

2 f 1 2 

am  = — — x(l  - x )J0(Xx)  dx. 

jfw  0 

This  can  be  integrated  by  a CAS  or  by  formulas  as  follows.  First  use  [x/i(Ax)]/  = Ax7q(Ax)  from  Theorem  1 in 
Sec.  5.4  and  then  integration  by  parts, 


2 r 

9 2 

1 9 

1 i r1 

am  ~ n 41 

xVo(Ax)  dx  - 

T(1  -aVi(Aa) 

x/i(Ax)(— 2x)  dx 

7i2(A)J0 

/f(A) 

A 

0 A J0 

The  integral-free  part  is  zero.  The  remaining  integral  can  be  evaluated  by  [x2J2(hx)]'  = Xx2Ji(\x)  from  Theorem  1 
in  Sec.  5.4.  This  gives 


a 


m 


4/2(A) 
A2/? (A) 


(A  cxq  m). 


Numeric  values  can  be  obtained  from  a CAS  (or  from  the  table  on  p.  409  of  Ref.  [GenRefl]  in  App.  1,  together 
with  the  formula  J 2 = 2x-1/i  — Jq  in  Theorem  1 of  Sec.  5.4).  This  gives  the  eigenfunction  expansion  of  1 — x 
in  terms  of  Bessel  functions  Jq,  that  is, 


1 -X2  = 1.1081/o(2.405a-)  - 0.1398/o(5.520a-)  + 0.04557o(8.654.v)  - 0.02 107o(l  1.792a)  + ■••. 


A graph  would  show  that  the  curve  of  1 — xz  and  that  of  the  sum  of  first  three  terms  practically  coincide. 


Mean  Square  Convergence.  Completeness 

Ideas  on  approximation  in  the  last  section  generalize  from  Fourier  series  to  orthogonal  series 
(1)  that  are  made  up  of  an  orthonormal  set  that  is  “complete,”  that  is,  consists  of  “sufficiently 
many”  functions  so  that  (1)  can  represent  large  classes  of  other  functions  (definition  below). 

In  this  connection,  convergence  is  convergence  in  the  norm,  also  called  mean-square 
convergence;  that  is,  a sequence  of  functions  />  is  called  convergent  with  the  limit  f if 


LmJ4  - 0; 


(12*) 
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written  out  by  (5)  in  Sec.  11.5  (where  we  can  drop  the  square  root,  as  this  does  not  affect 
the  limit) 


(12) 


lim 

h — >co 


r(x)[fk(x)  -f(.x)fdx  = 0. 


Accordingly,  the  series  (1)  converges  and  represents  / if 


(13) 


lim 

h — »oo 


r(x)[sk(x)  - fix)]  dx  = 0 


where  sk  is  the  kth  partial  sum  of  (1). 


fc 

(14)  sk(x)  = ^ amym(x). 

m=0 

Note  that  the  integral  in  (13)  generalizes  (3)  in  Sec.  11.4. 

We  now  define  completeness.  An  orthonormal  set  vo,  >’i,  ■ • • on  an  interval  a r i b 
is  complete  in  a set  of  functions  S defined  on  a Si  x Si  b if  we  can  approximate  every 
/ belonging  to  S arbitrarily  closely  in  the  norm  by  a linear  combination  «0vo  + 
aiyi  + • ■ ■ + akyk,  that  is,  technically,  if  for  every  e > 0 we  can  find  constants  ciq,  ■ • • , ak 
(with  k large  enough)  such  that 

(15)  ||/  - (a0y0  + • • • + akyk)\\  < e. 

Ref.  [GenRef7]  in  App.  1 uses  the  more  modern  term  total  for  complete. 

We  can  now  extend  the  ideas  in  Sec.  1 1.4  that  guided  us  from  (3)  in  Sec.  1 1.4  to  Bessel’s 
and  Parseval’s  formulas  (7)  and  (8)  in  that  section.  Performing  the  square  in  (13)  and 
using  (14),  we  first  have  (analog  of  (4)  in  Sec.  11.4) 


rb 

rb 

rb 

r(x)[sk(x)  - f(x)}2  dx  = 

rsk  clx  — 2 

tfsk  dx  + 

rb 


rfz  dx 


b 

r k i 

2 k 

rb 

r 

dx  2 Cl  ytT, 

rfymdx  + 

a 

- m=0 

m= 0 

a 

rf2  dx. 


The  first  integral  on  the  right  equals  because  J rymyi  dx  = 0 for  m A /,  and 

/ rym  dx  = 1 . In  the  second  sum  on  the  right,  the  integral  equals  am,  by  (2)  with  ||ym||2  = 1. 
Hence  the  first  term  on  the  right  cancels  half  of  the  second  term,  so  that  the  right  side 
reduces  to  (analog  of  (6)  in  Sec.  11.4) 


“ 2 arn  + 
m= 0 


b 

rf 2 dx. 


This  is  nonnegative  because  in  the  previous  formula  the  integrand  on  the  left  is  nonnegative 
(recall  that  the  weight  r(x)  is  positive!)  and  so  is  the  integral  on  the  left.  This  proves  the 
important  Bessel’s  inequality  (analog  of  (7)  in  Sec.  11.4) 


m= 0 


rb 

r(x)f{x)2  dx 


(16) 


ik=  1,2,---), 


SEC.  11.6 


Orthogonal  Series.  Generalized  Fourier  Series 


509 


Here  we  can  let  because  the  left  sides  form  a monotone  increasing  sequence  that 

is  bounded  by  the  right  side,  so  that  we  have  convergence  by  the  familiar  Theorem  1 in 
App.  A.3.3  Hence 


(i7)  ll/ll2 

m=0 

Furthermore,  if  yo,  y±,  ■ ■ ■ is  complete  in  a set  of  functions  .S',  then  (13)  holds  for  every  / 
belonging  to  S.  By  (13)  this  implies  equality  in  (16)  with  fc— >°°.  Hence  in  the  case  of 
completeness  every /in  S saisfies  the  so-called  Parseval  equality  (analog  of  (8)  in  Sec.  11.4) 


(18) 


2 Qm=  ll/ll2 

m= 0 


rb 

r(x)f(x)2  dx. 


As  a consequence  of  (18)  we  prove  that  in  the  case  of  completeness  there  is  no  function 
orthogonal  to  every  function  of  the  orthonormal  set,  with  the  trivial  exception  of  a function 
of  zero  norm: 


THEOREM  2 


Completeness 

Let  yo,  yi,  ■ ■ ■ be  a complete  orthonormal  set  on  a = x = b in  a set  of  functions  S. 
Then  if  a function  f belongs  to  S and  is  orthogonal  to  every  ym,  it  must  have  norm 
zero.  In  particular,  iffis  continuous,  then  f must  be  identically  zero. 


PROOF  Since/  is  orthogonal  to  every  ym,  the  left  side  of  (18)  must  be  zero.  If  / is  continuous, 
then  ||/||  = 0 implies/(x)  = 0,  as  can  be  seen  directly  from  (5)  in  Sec.  1 1.5  with/instead 
of  ym  because  r (x)  > 0 by  assumption. 


PRQBLEf^S^-T~T1T6 


1-7 


FOURIER-LEGENDRE  SERIES 


Showing  the  details,  develop 

1.  63x5  - 90.x3  + 35x 

2.  (x  + if 

3.  1 - x4 


4.  1, 


5.  Prove  that  if  f(x)  is  even  (is  odd,  respectively),  its 
Fourier-Legendre  series  contains  only  Pm  (x)  with  even 
m (only  Pm  (x)  with  odd  m,  respectively).  Give  examples. 

6.  What  can  you  say  about  the  coefficients  of  the  Fourier- 
Legendre  series  of/(x)  if  the  Maclaurin  series  of/(x) 
contains  only  powers  x4m  (m  = 0,  1 , 2,  ■ ■ ■ )? 

7.  What  happens  to  the  Fourier-Legendre  series  of  a 
polynomial  /(x)  if  you  change  a coefficient  of  /(x)? 
Experiment.  Try  to  prove  your  answer. 


8-13 


CAS  EXPERIMENT 


FOURIER-LEGENDRE  SERIES.  Find  and  graph  (on 
common  axes)  the  partial  sums  up  to  Smo  whose  graph 
practically  coincides  with  that  of  /(x)  within  graphical 
accuracy.  State  m0.  On  what  does  the  size  of  m 0 seem  to 
depend? 

8.  /(x)  = sin  7rx 


9.  /(x)  = sin  27 rx 

10.  fix)  = e"x2 

11.  fix)  = (1  + x2)"1 


12.  fix)  = 70(“o,i  *),  “o,i  = the  first  positive  zero 
of  J0ix) 


13.  fix)  = Jo(“o,2  x),  ao,2  = the  second  positive  zero 
of  J0(x) 


510 


CHAP.  11  Fourier  Analysis 


14.  TEAM  PROJECT.  Orthogonality  on  the  Entire  Real 
Axis.  Hermite  Polynomials.8  These  orthogonal  polyno- 
mials are  defined  by  He0(  1)  = 1 and 

dn 

(19)  Hen(x)  = (- 1 IV^2  — ((T*2/2),  n=  1,  2,  • • • . 

dxn 

REMARK.  As  is  true  for  many  special  functions,  the 
literature  contains  more  than  one  notation,  and  one  some- 
times defines  as  Hermite  polynomials  the  functions 

in  — rc2 

Hq  = 1,  H*{x)  = (-1)V — . 

dxn 

This  differs  from  our  definition,  which  is  preferred  in 
applications. 

(a)  Small  Values  of  n.  Show  that 

He  i(x)  = x,  He2fx)  = x2  — 1, 

Hea(x)  = x3  - 3 x,  He^ix)  = x4  — 6x2  + 3. 

(b)  Generating  Function.  A generating  function  of  the 
Hermite  polynomials  is 

(20)  gtx-*2/2  = ^an(x)tn 

n= 0 

because  Hen(x)  = n\  an(x).  Prove  this.  Hint : Use  the 
formula  for  the  coefficients  of  a Maclaurin  series  and 
note  that  tx  - 2t2  = |x2  - \(x  - t)2. 

(c)  Derivative.  Differentiating  the  generating  func- 
tion with  respect  to  x,  show  that 

(21)  He  fix)  = nHen-,(x). 

(d)  Orthogonality  on  thex-Axis  needs  a weight  function 
that  goes  to  zero  sufficiently  fast  as  x— * ±°°,  (Why?) 


Show  that  the  Hermite  polynomials  are  orthogonal  on 
— oo  < x < 00  with  respect  to  the  weight  function 
r(x)  = e~x  /2.  Hint.  Use  integration  by  parts  and  (21). 
(e)  ODEs.  Show  that 

(22)  He’n(x)  = xHerix)  - Hen+1(x). 

Using  this  with  n — 1 instead  of  n and  (21),  show  that 
y = Hen(x)  satisfies  the  ODE 

(23)  y"  = xy'  + ny  = 0. 

Show  that  w = is  a solution  of  Weber’s 

equation 

(24)  w"  + (w  + 2 - \x2)w  = 0 (n  = 0,  1,  ■■•). 

15.  CAS  EXPERIMENT.  Fourier-Bessel  Series.  Use 

Example  2 and  R = 1,  so  that  you  get  the  series 

(25)  f{x)  = «i-/o(“o,ix)  + a2J0(oto,2x) 

+ a^JoiotQ-^x)  + ■ ■ ■ 

With  the  zeros  cr0,i  “o,2> ' ' ' from  your  CAS  (see  also 
Table  A1  in  App.  5). 

(a)  Graph  the  terms  J0{a0  lx),  ■■  ■ ,J0(ao,io-r)  f°r 
0 S x £ 1 on  common  axes. 

(b)  Write  a program  for  calculating  partial  sums  of  (25). 
Find  out  for  what  fix)  your  CAS  can  evaluate  the 
integrals.  Take  two  such/(x)  and  comment  empirically 
on  the  speed  of  convergence  by  observing  the  decrease 
of  the  coefficients. 

(c)  Take  fix)  = 1 in  (25)  and  evaluate  the  integrals 
for  the  coefficients  analytically  by  (21a),  Sec.  5.4,  with 
v = 1.  Graph  the  first  few  partial  sums  on  common 
axes. 


Fourier  Integral 

Fourier  series  are  powerful  tools  for  problems  involving  functions  that  are  periodic  or  are  of 
interest  on  a finite  interval  only.  Sections  1 1 .2  and  11.3  first  illustrated  this,  and  various  further 
applications  follow  in  Chap.  12.  Since,  of  course,  many  problems  involve  functions  that  are 
nonperiodic  and  are  of  interest  on  the  whole  x-axis,  we  ask  what  can  be  done  to  extend  the 
method  of  Fourier  series  to  such  functions.  This  idea  will  lead  to  “Fourier  integrals.” 

In  Example  1 we  start  from  a special  function  /L  of  period  2 L and  see  what  happens  to 
its  Fourier  series  if  we  let  L — » °°.  Then  we  do  the  same  for  an  arbitrary  function  of 
period  2 L.  This  will  motivate  and  suggest  the  main  result  of  this  section,  which  is  an 
integral  representation  given  in  Theorem  1 below. 


8CHARLES  HERMITE  (1822-1901).  French  mathematician,  is  known  for  his  work  in  algebra  and  number 
theory.  The  great  HENRI  POINCARE  (1854-1912)  was  one  of  his  students. 
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EXAMPLE  1 


Rectangular  Wave 

Consider  the  periodic  rectangular  wave  fa  (x)  of  period  2L  > 2 given  by 


!0  if  -L<x<- 1 
1 if  — 1 < x < 1 

0 if  1 < x < L. 

The  left  part  of  Fig.  280  shows  this  function  for  2 L = 4,  8,  16  as  well  as  the  nonperiodic  function /(x),  which 
we  obtain  from  if  we  let  L — > <», 


f(x)  = Um  fL(x) 


( 1 if  — 1 < x < 1 
lO  otherwise. 


We  now  explore  what  happens  to  the  Fourier  coefficients  of  fa  as  L increases, 
all  n.  For  an  the  Euler  formulas  (6),  Sec.  11.2,  give 


1 

f1  1 

1 

r1 

flTTX 

2 

r 1 

YITTX 

dx  = — , 

Clyi 

cos dx  = 

- 

cos dx 

2 L. 

l_i  L 

L. 

l-i  L 

L J 

^1 
_ o 

Since  fa  is  even,  bn  = 0 for 

2 sin  ( mr/L ) 

L mr/L 


This  sequence  of  Fourier  coefficients  is  called  the  amplitude  spectrum  of  because  \an\  is  the  maximum 
amplitude  of  the  wave  an  cos  ( mrx/L ).  Figure  280  shows  this  spectrum  for  the  periods  2 L = 4,  8,  16.  We  see 
that  for  increasing  L these  amplitudes  become  more  and  more  dense  on  the  positive  wn-axis,  where  wn  = mr/L. 
Indeed,  for  2L  = 4,  8,  16  we  have  1,  3,  7 amplitudes  per  “half-wave”  of  the  function  (2  sin  wn)/(Lwn ) (dashed 
in  the  figure).  Hence  for  2 L = 2k  we  have  2fe_1  — 1 amplitudes  per  half-wave,  so  that  these  amplitudes  will 
eventually  be  everywhere  dense  on  the  positive  wn-axis  (and  will  decrease  to  zero). 

The  outcome  of  this  example  gives  an  intuitive  impression  of  what  about  to  expect  if  we  turn  from  our  special 
function  to  an  arbitrary  one,  as  we  shall  do  next. 


Waveform  fL(x ) 


fL(x) 


I 

4 


H 2 L = 8 


.1  I 


Amplitude  spectrum  an(u>n) 


J 

-8 


8 


2 L = 16 


1 

4 


n = 20 


S n = 28  ^ 


n = 12  - 


f(x) 


-1  0 1 


x 


Fig.  280.  Waveforms  and  amplitude  spectra  in  Example  1 
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From  Fourier  Series  to  Fourier  Integral 

We  now  consider  any  periodic  function  //(x)  of  period  2 L that  can  be  represented  by  a 
Fourier  series 


oo 

/lW  = «0  + 2 (an  C0S  wnx  + bn  sin  Wnx), 
n= 1 


nrr 


and  find  out  what  happens  if  we  let  L— > oo.  Together  with  Example  1 the  present 
calculation  will  suggest  that  we  should  expect  an  integral  (instead  of  a series)  involving 
cos  wx  and  sin  wx  with  w no  longer  restricted  to  integer  multiples  w = wn  = rnr/L 
of  7 t/L  but  taking  all  values.  We  shall  also  see  what  form  such  an  integral  might 
have. 

If  we  insert  an  and  bn  from  the  Euler  formulas  (6),  Sec.  11.2,  and  denote  the  variable 
of  integration  by  V,  the  Fourier  series  of  becomes 


/lW  = 


2 L . 


fL(v)clv  + y 2 

-L  n= 1 


COS  WnX 


fL(y)  cos  wnvdv 


— L 

rL 


+ sin  wnx 


/l(v)  sin  wnv  dv 


— L 


We  now  set 


Aw  = wn+ 1 ~ wn  = 


( n + 1)7 T 


T17T  _ 7 T 

L ~ L 


Then  1/E  = Aw/77,  and  we  may  write  the  Fourier  series  in  the  form 


(1)  fL(x)  = 


2 L . 


rL  00 

fL(v)dv  + — 2 
-I.  77 , 


n= 1 


rL 


(cos  wnx ) Aw 


/l(v)  cos  w nv  dv 


+ (sin  wnx)Aw 


fL(v)  sin  wnv  dv 


-L 


This  representation  is  valid  for  any  fixed  L,  arbitrarily  large,  but  finite. 

We  now  let  L — > °o  and  assume  that  the  resulting  nonperiodic  function 

f(x)  = lim  fL(x) 

I , — >00 

is  absolutely  integrable  on  the  x-axis;  that  is,  the  following  (finite!)  limits  exist: 


(2) 


lim 

a— >—00 


|/(x)|  dx  + lim  |/(x)|  dx  ( written  l/(x)|  dxj. 


Jo 


Then  1 j L 0,  and  the  value  of  the  first  term  on  the  right  side  of  (1)  approaches  zero. 

Also  Aw  = 77/E— >0  and  it  seems  plausible  that  the  infinite  series  in  (1)  becomes  an 
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EXAMPLE  2 


integral  from  0 to  °°,  which  represents /(x),  namely, 
1 


(3)  fix)  = 


77 


cos  wx 

fiv)  cos  wv  dv  + sin  wx 

f(v)  sin  wv  dv 

•'o 

_ 

. 

—00 

-00 

dw. 


If  we  introduce  the  notations 

„ co 

1 


(4) 


A(w)  = 


77 


f(v ) cos  wv  dv,  B(w ) = 


77 


f(v)  sin  wv  dv 


we  can  write  this  in  the  form 


(5) 


fix) 


r 00 

[A  (w)  cos  wx  + B (w)  sin  wx\  dw. 
•'o 


This  is  called  a representation  of/(x)  by  a Fourier  integral. 

It  is  clear  that  our  naive  approach  merely  suggests  the  representation  (5),  but  by  no 
means  establishes  it;  in  fact,  the  limit  of  the  series  in  (1)  as  Aw  approaches  zero  is  not 
the  definition  of  the  integral  (3).  Sufficient  conditions  for  the  validity  of  (5)  are  as  follows. 


Fourier  Integral 

If  fix)  is  piecewise  continuous  (see  Sec.  6.1)  in  every  finite  interval  and  has  a right- 
hand  derivative  and  a left-hand  derivative  at  every  point  (see  Sec  11.1)  and  if  the 
integral  (2)  exists,  then  fix)  can  be  represented  by  a Fourier  integral  (5)  with  A and 
B given  by  (4).  At  a point  where  f(x)  is  discontinuous  the  value  of  the  Fourier  integral 
equals  the  average  of  the  left-  and  right-hand  limits  of  fix)  at  that  point  (see  Sec.  11.1). 
(Proof  in  Ref.  [Cl 2];  see  App.  1.) 


Applications  of  Fourier  Integrals 

The  main  application  of  Fourier  integrals  is  in  solving  ODEs  and  PDEs,  as  we  shall  see 
for  PDEs  in  Sec.  12.6.  However,  we  can  also  use  Fourier  integrals  in  integration  and  in 
discussing  functions  defined  by  integrals,  as  the  next  example. 


Single  Pulse,  Sine  Integral.  Dirichlet’s  Discontinuous  Factor.  Gibbs  Phenomenon 

Find  the  Fourier  integral  representation  of  the  function 


fix)  = 


i 


if 

if 


H < i 
M > i 


(Fig.  281) 


fix) 

1 

-10  1 * 


Fig.  281.  Example  2 
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Solution.  From  (4)  we  obtain 


1 . 

A(w)  = — f(v ) cos  wvdv  = 


cos  wvdv  = 


2 sin  w 
7TW 


B(w ) = 


sin  wv  dv  = 0 


and  (5)  gives  the  answer 

(6) 


fix)  = 


cos  wx  sin  w 


- dw. 


The  average  of  the  left-  and  right-hand  limits  of  f(x)  at  x = 1 is  equal  to  (1  + 0)/2,  that  is, 
Furthermore,  from  (6)  and  Theorem  1 we  obtain  (multiply  by  7t/2) 


(7) 


cos  wx  sin  w 


-dw 


tt/2 

if 

0Si<1, 

IT /4 

if 

x = 1, 

0 

if 

x>  l. 

We  mention  that  this  integral  is  called  Dirichlet’s  discontinous  factor.  (For  P.  L.  Dirichlet  see  Sec.  10.8.) 
The  case  x = 0 is  of  particular  interest.  If  x = 0,  then  (7)  gives 


(8*) 


sin  w 

dw 

w 


77 

2 


We  see  that  this  integral  is  the  limit  of  the  so-called  sine  integral 


(8) 


Si(w) 


dw 


as  u — > °o.  The  graphs  of  Si(w)  and  of  the  integrand  are  shown  in  Fig.  282. 

In  the  case  of  a Fourier  series  the  graphs  of  the  partial  sums  are  approximation  curves  of  the  curve  of  the 
periodic  function  represented  by  the  series.  Similarly,  in  the  case  of  the  Fourier  integral  (5),  approximations  are 
obtained  by  replacing  °°  by  numbers  a.  Hence  the  integral 


(9) 


cos  wx  sin  w 


- dw 


approximates  the  right  side  in  (6)  and  therefore  fix). 


Fig.  282.  Sine  integral  Si(u)  and  integrand 
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Fig.  283.  The  integral  (9)  for  a = 8, 16,  and  32,  illustrating 
the  development  of  the  Gibbs  phenomenon 


Figure  283  shows  oscillations  near  the  points  of  discontinuity  of/(x).  We  might  expect  that  these  oscillations 
disappear  as  a approaches  infinity.  But  this  is  not  true;  with  increasing  a,  they  are  shifted  closer  to  the  points 
x = ±1.  This  unexpected  behavior,  which  also  occurs  in  connection  with  Fourier  series  (see  Sec.  1 1.2),  is  known 
as  the  Gibbs  phenomenon.  We  can  explain  it  by  representing  (9)  in  terms  of  sine  integrals  as  follows.  Using 
(11)  in  App.  A3.1,  we  have 


2_ 

7 T 


cos  wx  sin  w 


- dw 


sin  (w  + wx) 


- dw  ■ 


' sin  (w  — wx) 


- dw. 


In  the  first  integral  on  the  right  we  set  w + wx  = t.  Then  dw/w  = dt/t,  and  0 ^ w ^ a corresponds  to 
0 ^ t ^ (x  + \)a.  In  the  last  integral  we  set  w — wx  = —t.  Then  dw/w  = dt/t , and  0 ^ w ^ a corresponds  to 
0 ^ 7 ^ (x  - 1 )a.  Since  sin  (— t)  = —sin  t,  we  thus  obtain 


2_ 

77 


cos  wx  sin  w 


- dw 


, r(x+l)a  . . , 

1 sin  t 1 

— dt  — 
7T  t 77 


From  this  and  (8)  we  see  that  our  integral  (9)  equals 

^Si(a[*+  1])  -^Si(fl[x-  1]) 

and  the  oscillations  in  Fig.  283  result  from  those  in  Fig.  282.  The  increase  of  a amounts  to  a transformation 
of  the  scale  on  the  axis  and  causes  the  shift  of  the  oscillations  (the  waves)  toward  the  points  of  discontinuity 
— 1 and  1.  I 


Fourier  Cosine  Integral  and  Fourier  Sine  Integral 

Just  as  Fourier  series  simplify  if  a function  is  even  or  odd  (see  Sec.  11.2),  so  do  Fourier 
integrals,  and  you  can  save  work.  Indeed,  if/ has  a Fourier  integral  representation  and  is 
even,  then  B (w)  = 0 in  (4).  This  holds  because  the  integrand  of  B(w)  is  odd.  Then  (5) 
reduces  to  a Fourier  cosine  integral 


(10)  fix) 


r 00 

A (w)  cos  wx  dw 

U) 


where 


A(w) 


2_ 

7T 


~ 00 

f(v)  cos  wv  dv. 
u> 


Note  the  change  in  A(w):  for  even/the  integrand  is  even,  hence  the  integral  from  — °°  to 
oo  equals  twice  the  integral  from  0 to  °oj  just  as  in  (7a)  of  Sec.  11.2. 

Similarly,  if/has  a Fourier  integral  representation  and  is  odd,  then  A (w)  = 0 in  (4).  This 
is  true  because  the  integrand  of  A (w)  is  odd.  Then  (5)  becomes  a Fourier  sine  integral 


(ID 


B(w) 


2^ 

IT 


r 00 

f(v)  sin  wv  dv. 


’ o 


B (w)  sin  wx  dw 


where 


o 
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1 


o1 

Fig.  284.  f(x) 
in  Example  3 


Note  the  change  of  B(w ) to  an  integral  from  0 to  °c  because  B(w)  is  even  (odd  times  odd 
is  even). 

Earlier  in  this  section  we  pointed  out  that  the  main  application  of  the  Fourier  integral 
representation  is  in  differential  equations.  However,  these  representations  also  help  in 
evaluating  integrals,  as  the  following  example  shows  for  integrals  from  0 to  °°. 

Laplace  Integrals 

We  shall  derive  the  Fourier  cosine  and  Fourier  sine  integrals  of/(jc)  = e_kx,  where  x > 0 and  k > 0 (Fig.  284). 
The  result  will  be  used  to  evaluate  the  so-called  Laplace  integrals. 

Solution,  (a)  From  (10)  we  have  A (w)  = — I e~kv  cos  wv  dv.  Now,  by  integration  by  parts, 

Jo 


e u cos  wv  dv  = — ~ ~ e sin  wv  + cos  wv 

k2  + w2 


If  v = 0,  the  expression  on  the  right  equals  —k/(k2  + w2).  If  v approaches  infinity,  that  expression  approaches 
zero  because  of  the  exponential  factor.  Thus  2/ it  times  the  integral  from  0 to  °°  gives 


(12) 


A(w)  = 


2 k/ir 
k2  + w2 


By  substituting  this  into  the  first  integral  in  (10)  we  thus  obtain  the  Fourier  cosine  integral  representation 


,,  v _ -kx  _ 2*  f cos  wx 

/(■*)  e , 2 , 2 
7 T J0  k + W 


dw 


(x  > 0,  k > 0). 


From  this  representation  we  see  that 


(13) 


COS  WX  7T 

dw  = — i 

o k2  + w2 


2k 


(x  > 0,  k > 0). 


(b)  Similarly,  from  (11)  we  have  B(w)  = — e u sin  wv  dv.  By  integration  by  parts, 

Jo 


e kv  sin  wv  dv  = — ^ ^ e Ku  ( — sin  wv  + cos  wv 
kz  + wz  \w 


—kv 


This  equals  — w/(k 2 + w2)  if  v = 0,  and  approaches  0 as  v —■ ► Thus 


(14) 


B(w)  = 


2w/t7 


k2  + w2 

From  (14)  we  thus  obtain  the  Fourier  sine  integral  representation 


2 f w sin  wx 


m = e-™  = - 

Jo  k + w 


From  this  we  see  that 


(15) 


W Sin  WX  77 

dw  = — e 

'o  k2  + w2  2 


(x  > 0,  k > 0). 


The  integrals  (13)  and  (15)  are  called  the  Laplace  integrals. 
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P=RQ-Bl-E-M=S^T— 11~7 


1-6 


EVALUATION  OF  INTEGRALS 


Show  that  the  integral  represents  the  indicated  function. 
Hint.  Use  (5),  (10),  or  (1 1);  the  integral  tells  you  which  one, 
and  its  value  tells  you  what  function  to  consider.  Show  your 
work  in  detail. 


1. 


2. 


3. 


4. 


5. 


6. 


cos  xw  + w sin  xw 
1 + w2 


0 

if 

x < 0 

77/2 

if 

x = 0 

ire~x 

if 

x > 0 

sin  7 rw  sin  xw 
1 — w2 

I — COS  7 TW 


dw  = 


5 sin  x if  0 S x S 7T 

0 if  x > 7 r 


w 


■ sin  xw  dw  = 


j77  if  0 < X < 7T 


0 if 


X > 7 7 


'0  1 - w 


cos  xw  dw  = 


sin  w — w cos  w 

...2 


g77COSX  if  0 < \x\  < \i T 
0 if  |x|  £ \tt 

tx  if  0 < x < 1 
sin  xw  dw  = ^ ^77  if  x = 1 

0 if  x > 1 


w sin  xw 


o h-’  + 4 


dw  = hire  xcosx  if  x>0 


7-12 


FOURIER  COSINE  INTEGRAL 
REPRESENTATIONS 

Represent /(x)  as  an  integral  (10). 

1 if  0 < x < 1 


7-  fix)  = 


8.  f(x)  = 


0 if 


x > 1 


if  0 < x < 1 


10  if  x > 1 
9.  fix)  = 1/(1  + x2)  [x  > 0.  Hint.  See  (13).] 


10.  fix)  = 

11-  fix)  = 
12.  fix)  = 


a2  — x2  if  0 < x < a 


0 if  x > a 

sin  x if  0 < x < 7 r 

0 if  x > 7 T 

e~x  if  0 < x < a 


0 if 


x > a 


13.  CAS  EXPERIMENT.  Approximate  Fourier  Cosine 
Integrals.  Graph  the  integrals  in  Prob.  7,  9,  and  1 1 as 


functions  of  x.  Graph  approximations  obtained  by 
replacing  <*>  with  finite  upper  limits  of  your  choice. 
Compare  the  quality  of  the  approximations.  Write  a 
short  report  on  your  empirical  results  and  observations. 

14.  PROJECT.  Properties  of  Fourier  Integrals 

(a)  Fourier  cosine  integral.  Show  that  (10)  implies 


(al) 


(a2) 


(a3) 


fiax)  = - 
ia  > 0) 


A(  a ) cos  xw  dw 
iScale  change) 


xfix)  = B (w)  sin  xw  dw. 


B*  = — 


dA 

dw 


A as  in  (10) 


x fix)  = A*iw)  cos  xw  dw. 


A*  = — 


dzA 
dw 2 


(b)  Solve  Prob.  8 by  applying  (a3)  to  the  result  of  Prob.  7. 

(c)  Verify  (a2)  for  fix)  =1  if  0 < x < a and 
fix)  = 0 if  x > a. 

(d)  F ourier  sine  integral.  Find  formulas  for  the  Fourier 
sine  integral  similar  to  those  in  (a). 

15.  CAS  EXPERIMENT.  Sine  Integral.  Plot  Si(n)  for 
positive  u.  Does  the  sequence  of  the  maximum  and 
minimum  values  give  the  impression  that  it  converges 
and  has  the  limit  7t/2?  Investigate  the  Gibbs  phenomenon 
graphically. 


16-20 


FOURIER  SINE  INTEGRAL 
REPRESENTATIONS 


Represent /(x)  as  an  integral  (11). 


16.  fix) 

17.  fix) 

18.  fix) 

19.  fix) 

20.  fix) 


{x  if  0 < x < a 
0 if  x > a 
1 1 if  0 < x < 1 
lo  if  x > 1 

{cos  x if  0 < X < 77 

0 if  X > 77 

( ex  if  0 < x < 1 

1 0 if  x > 1 

( e~x  if  0 < x < 1 
1 0 if  x > 1 
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11.8  Fourier  Cosine  and  Sine  Transforms 

An  integral  transform  is  a transformation  in  the  form  of  an  integral  that  produces  from 
given  functions  new  functions  depending  on  a different  variable.  One  is  mainly  interested 
in  these  transforms  because  they  can  be  used  as  tools  in  solving  ODEs,  PDEs,  and  integral 
equations  and  can  often  be  of  help  in  handling  and  applying  special  functions.  The  Laplace 
transform  of  Chap.  6 serves  as  an  example  and  is  by  far  the  most  important  integral 
transform  in  engineering. 

Next  in  order  of  importance  are  Fourier  transforms.  They  can  be  obtained  from  the 
Fourier  integral  in  Sec.  11.7  in  a straightforward  way.  In  this  section  we  derive  two  such 
transforms  that  are  real,  and  in  Sec.  11.9  a complex  one. 

Fourier  Cosine  Transform 

The  Fourier  cosine  transform  concerns  even  functions  f(x).  We  obtain  it  from  the  Fourier 
cosine  integral  [(10)  in  Sec.  10.7] 


fix) 


A{w)  cos  wx  dw. 


where 


A{w) 


2_ 

77 


f(v)  cos  wv  dv. 

^0 


Namely,  we  set  A(w)  = “s/ 2/ tt  fc(w),  where  c suggests  “cosine.”  Then,  writing  v = x in 
the  formula  for  A(w),  we  have 


(la) 


and 


fd  w) 


f(x ) cos  wx  dx 

0 


(lb) 


fix) 


cos  wx  dw. 


Formula  (la)  gives  from  fix)  a new  function  fc(w),  called  the  Fourier  cosine  transform 
of f(x).  Formula  (lb)  gives  us  back/fx)  from  fc(w),  and  we  therefore  call  f(x)  the  inverse 
Fourier  cosine  transform  of/c(w). 

The  process  of  obtaining  the  transform  fc  from  a given  / is  also  called  the  Fourier 
cosine  transform  or  the  Fourier  cosine  transform  method. 

Fourier  Sine  Transform 

Similarly,  in  (11),  Sec.  11.7,  we  set  B(w)  = \Z2/tt  fs(w),  where  5 suggests  “sine.”  Then, 
writing  v = x,  we  have  from  (11),  Sec.  11.7,  the  Fourier  sine  transform,  of/(x)  given  by 


fsM 


f(x)  sin  wx  dx, 


(2a) 


'o 
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EXAMPLE  1 


k 


Fig.  285.  f(x)  in 
Example  1 


EXAMPLE  2 


and  the  inverse  Fourier  sine  transform  of/s(w),  given  by 


(2b) 


m 


fs(w)  sin  wx  dw. 
T) 


The  process  of  obtaining  fs  (w)  from  f(x)  is  also  called  the  Fourier  sine  transform  or 
the  Fourier  sine  transform  method. 

Other  notations  are 


9?c(/)=/c,  »,(/)=/, 

and  '.'f’c  1 and  Ts7" 1 for  the  inverses  of  Tc  and  '.¥s,  respectively. 

Fourier  Cosine  and  Fourier  Sine  Transforms 

Find  the  Fourier  cosine  and  Fourier  sine  transforms  of  the  function 


m 


f k if  0 < x < a 
\ 0 if  x>  a 


(Fig.  285). 


Solution.  From  the  definitions  (la)  and  (2a)  we  obtain  by  integration 


fc  (w)  = A / — k | cos  wxdx  = ^f  — k 


[2  I 

^sin  aw\ 

— k[ 

— 

\]  77  ' 

, w J 

fs (w)  = k.  | sin  wx  dx  = 


1 — cos  aw 
w 


This  agrees  with  formulas  1 in  the  first  two  tables  in  Sec.  11.10  (where  k = 1). 

Note  that  for  f(x)  = k = const  (0  < x < °°),  these  transforms  do  not  exist.  (Why?) 

Fourier  Cosine  Transform  of  the  Exponential  Function 

Find 

Solution.  By  integration  by  parts  and  recursion. 


&c{e-x)  = 


e cos  wxdx  = 


[2 

e~x 

V 7T 

1 + w2 

— 2 (—cos  wx  + w sin  wx)  = 
V 0 

This  agrees  with  formula  3 in  Table  I,  Sec.  11.10,  with  a = 1.  See  also  the  next  example. 


V2/^ 

1 + w2' 


What  did  we  do  to  introduce  the  two  integral  transforms  under  consideration?  Actually 
not  much:  We  changed  the  notations  A and  B to  get  a “symmetric”  distribution  of  the 
constant  2/tt  in  the  original  formulas  (1)  and  (2).  This  redistribution  is  a standard  con- 
venience, but  it  is  not  essential.  One  could  do  without  it. 

What  have  we  gained?  We  show  next  that  these  transforms  have  operational  properties 
that  permit  them  to  convert  differentiations  into  algebraic  operations  (just  as  the  Laplace 
transform  does).  This  is  the  key  to  their  application  in  solving  differential  equations. 
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Linearity,  Transforms  of  Derivatives 

If  /(x)  is  absolutely  integrable  (see  Sec.  11.7)  on  the  positive  x-axis  and  piecewise 
continuous  (see  Sec.  6.1)  on  every  finite  interval,  then  the  Fourier  cosine  and  sine 
transforms  of/ exist. 

Furthermore,  if  /and  g have  Fourier  cosine  and  sine  transforms,  so  does  af  + bg  for 
any  constants  a and  b,  and  by  (la) 


^c(«/+  bg) 


[af{x)  + bg  (a)]  cos  wx  dx 


= a 


cos  wx  dx  + b 


r 00 

g (a)  cos  wx  dx. 


The  right  side  is  a3/(/)  + bi¥c(g).  Similarly  for  8FS,  by  (2).  This  shows  that  the  Fourier 
cosine  and  sine  transforms  are  linear  operations. 


(a)  ’S'dqf  + bg)  = a?¥c(f)  + bi¥c(g), 

(b)  9 '>s{af  + bg)  = a&df)  + b^s(g)- 


THEOREM  1 


Cosine  and  Sine  Transforms  of  Derivatives 

Letf( x)  be  continuous  and  absolutely  integrable  on  the  x-axis,  letf'(x)  be  piecewise 
continuous  on  every  finite  interval , and  Zef/(x)— *0  as  x— >0°.  Then 


(4) 


(a)  9e{f'(x)}  = w8Fs{/(x)}  - yj^m, 

(b)  &s{f  (x)}  = -wSFc{/(x)}. 


PROOF  This  follows  from  the  definitions  and  by  using  integration  by  parts,  namely, 


9e{f\x)}  =J^ 


f\x)  cos  wx  dx 


/(a)  cos  wx 


+ w 
0 J 


f{x)  sin  wx  dx 


and  similarly, 


} = 


/(0)  + w3?s{/(x)}; 


f'  (x)  sin  wx  dx 


/(x)  sin  wx 
= 0 - w3/{/(x)}. 


w 


/(a)  cos  wx  dx 
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EXAMPLE  3 


Formula  (4a)  with/,  instead  of/ gives  (when  f',  f"  satisfy  the  respective  assumptions 
for  / f in  Theorem  1) 

9e{f{x)}  = w3/{/'(x)}  - ^/|/'(0); 

hence  by  (4b) 

(5a)  9 /{/"Ml  = -w2®c{f(x)}  - (0). 

Similarly, 

(5b)  &s{f"(x)}  = ~w2^s{f(x)}  + 

A basic  application  of  (5)  to  PDEs  will  be  given  in  Sec.  12.7.  For  the  time  being  we 
show  how  (5)  can  be  used  for  deriving  transforms. 

An  Application  of  the  Operational  Formula  (5) 

Find  the  Fourier  cosine  transform  3Fc(e  r°)  of  f (x)  C ax,  where  a > 0. 

Solution.  By  differentiation,  (e~ax)"  = a2ea%\  thus 

«2/W  =/"(*)■ 


From  this,  (5a),  and  the  linearity  (3a), 


a2®c(f)  = ®c(f") 


= ~w2®c(f)  - J|/'( 0) 

= + aJ^. 


Flence 


(a2  + wPyty  c(f)  = «V2/tt. 


The  answer  is  (see  Table  I,  Sec.  11.10) 


(a  > 0). 


Tables  of  Fourier  cosine  and  sine  transforms  are  included  in  Sec.  11.10. 
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gHhgBB=KBS^H=8; 


1-8 


FOURIER  COSINE  TRANSFORM 


1.  Find  the  cosine  transform  fc(w)  of  fix ) = 1 if 
0 < x < 1,  fix)  = -1  if  1 < x < 2,  fix)  = 0 if 
x > 2. 


2.  Find /in  Prob.  1 from  the  answer  fc. 

3.  Find  fc(w ) for  /( x)  = x if  0 < x < 2,  fix)  = 0 if 
x > 2. 


4.  Derive  formula  3 in  Table  I of  Sec.  1 1 . 10  by  integration. 

5.  Find/.(w)  for/(x)  = x2  if  0 < x < 1,  fix)  = Oifx  > 1. 

6.  Continuity  assumptions.  Find  gc(w)  for  g(x)  = 2 if 
0 < x < 1,  g{x)  = 0 if  x > 1.  Try  to  obtain  from  it 
jc(w)  for  /(x)  in  Prob.  5 by  using  (5a). 

7.  Existence?  Does  the  Fourier  cosine  transform  of 
x_1  sin  x (0  < x < °°)  exist?  Of  x_1cosx?  Give 
reasons. 

8.  Existence?  Does  the  Fourier  cosine  transform  of 
f(x)  = k = const  (0  < x < 00 ) exist?  The  Fourier  sine 
transform? 


9-15 


FOURIER  SINE  TRANSFORM 


9.  Find  S'gle-0*),  a > 0,  by  integration. 

10.  Obtain  the  answer  to  Prob.  9 from  (5b). 

11.  Find  fs(w)  for  /(x)  = x2  if  0 < x < 1, 
x > 1. 


m = o if 


12.  Find  8Fs(xe  1 ^2)  from  (4b)  and  a suitable  formula  in 
Table  I of  Sec.  11.10. 


13.  Find  9s(e  x)  from  (4a)  and  formula  3 of  Table  I in 
Sec.  11.10. 


14.  Gamma  function.  Using  formulas  2 and  4 in  Table  II 
of  Sec.  11.10,  prove  T(g)  = V7 t [(30)  in  App.  A3.1], 
a value  needed  for  Bessel  functions  and  other 
applications. 

15.  WRITING  PROJECT.  Finding  Fourier  Cosine  and 
Sine  Transforms.  Write  a short  report  on  ways  of 
obtaining  these  transforms,  with  illustrations  by 
examples  of  your  own. 


11.9  Fourier  Transform. 

Discrete  and  Fast  Fourier  Transforms 


In  Sec.  1 1.8  we  derived  two  real  transforms.  Now  we  want  to  derive  a complex  transform 
that  is  called  the  Fourier  transform.  It  will  be  obtained  from  the  complex  Fourier  integral, 
which  will  be  discussed  next. 


Complex  Form  of  the  Fourier  Integral 

The  (real)  Fourier  integral  is  [see  (4),  (5),  Sec.  11.7] 


m 


r 00 

[A(w)  cos  wx  + B(xv)  sin  wx\  dw 


where 


Mw)  = — 


f(v ) cos  wvdv, 


B(w)  = — 


f(v)  sin  wvdv. 


Substituting  A and  B into  the  integral  for/,  we  have 


fix) 


1 

77 


f(v)[ cos  wv  cos  wx  + sin  wv  sin  wx]  dvdw. 


'o 
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By  the  addition  formula  for  the  cosine  [(6)  in  App.  A3.1]  the  expression  in  the  brackets 
[ • • • ] equals  cos  ( wv  — wx)  or,  since  the  cosine  is  even,  cos  (wx  — wv).  We  thus  obtain 


(1*) 


/M  = ¥ 


00 

^0 

_ . 

f(v)  cos  (wx  — wv)dv 


dw. 


The  integral  in  brackets  is  an  even  function  of  w,  call  it  F(w),  because  cos  (wx  — wv)  is 
an  even  function  of  w,  the  function  / does  not  depend  on  w,  and  we  integrate  with  respect 
to  v (not  w).  Hence  the  integral  of  F(w)  from  w = 0 to  °°  is  \ times  the  integral  of  F ( w) 
from  —oo  to  oo.  Thus  (note  the  change  of  the  integration  limit!) 


(1) 


/(*) 


1 

277  . 


f(v)  cos  (wx  — wv)  dv 


dw. 


We  claim  that  the  integral  of  the  form  (1)  with  sin  instead  of  cos  is  zero: 


(2) 


1 

277 


f(v)  sin  (wx  — wv)  dv 


dw  = 0. 


This  is  true  since  sin  (wx  — wv)  is  an  odd  function  of  w,  which  makes  the  integral  in 
brackets  an  odd  function  of  w,  call  it  G(w).  Hence  the  integral  of  G(w)  from  — oo  to  oo 
is  zero,  as  claimed. 

We  now  take  the  integrand  of  (1)  plus  i (=  V—  1 ) times  the  integrand  of  (2)  and  use 
the  Euler  formula  [(11)  in  Sec.  2.2] 

(3)  elx  = cos  x + i sin  x. 


Taking  wx  — wv  instead  of  x in  (3)  and  multiplying  by  f(v)  gives 

f(v)  cos  (wx  - wv)  + if(v)  sin  (wx  - wv)  = f(v)elCwx~wv) . 

Hence  the  result  of  adding  (1)  plus  i times  (2),  called  the  complex  Fourier  integral,  is 


(4) 


fix) 


1 

277  . 


f(v)eMx-v)  dv  dw 


(i  = V=T). 


To  obtain  the  desired  Fourier  transform  will  take  only  a very  short  step  from  here. 


Fourier  Transform  and  Its  Inverse 


Writing  the  exponential  function  in  (4)  as  a product  of  exponential  functions,  we  have 


(5) 


fix)  = 


1 

’ 00 

1 

V^77  • 

-\ZlTT  - 

f(v)e~ 


' dv 


" dw. 


The  expression  in  brackets  is  a function  of  w,  is  denoted  by  f(w),  and  is  called  the  Fourier 
transform  of/;  writing  v = x,  we  have 


f(w) 


1 


V277 


f(x)e~iwxdx. 


(6) 


— GO 
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EXAMPLE  2 


With  this,  (5)  becomes 


(7) 


m 


f (yv)eiwx  dw 


and  is  called  the  inverse  Fourier  transform  of/(w). 

Another  notation  for  the  Fourier  transform  is 

/=  mx 

so  that 

/=  ®~\fX 

The  process  of  obtaining  the  Fourier  transform  2F(/)  = / from  a given /is  also  called 
the  Fourier  transform  or  the  Fourier  transform  method. 

Using  concepts  defined  in  Secs.  6.1  and  11.7  we  now  state  (without  proof)  conditions 
that  are  sufficient  for  the  existence  of  the  Fourier  transform. 


Existence  of  the  Fourier  Transform 

Iff  (pc)  is  absolutely  integrable  on  the  x-axis  and  piecewise  continuous  on  every  finite 
interval,  then  the  Fourier  transform  f(w)  of  f(x)  given  by  (6)  exists. 


Fourier  Transform 

Find  the  Fourier  transform  of  fix)  = 1 if  |jr|  < 1 and  fix)  = 0 otherwise. 
Solution.  Using  (6)  and  integrating,  we  obtain 


f(w)  = 


1 


—IWX  J ...  _ 


dx 


1 


1 


(e~lw  - elw). 


s/Itt  J_l  V27 T —iw  -1  — /wV2t7 

As  in  (3)  we  have  elw  = cos  w + i sin  w,  e~lw  = cos  w — i sin  w,  and  by  subtraction 

e — e = 2i  sin  w. 

Substituting  this  in  the  previous  formula  on  the  right,  we  see  that  i drops  out  and  we  obtain  the  answer 


f(w)  = 


TT  sin  w 
2 w 


Fourier  Transform 

Find  the  Fourier  transform  of/(x)  = e~ax  if  x > 0 and/(x)  = 0 if  x < 0;  here  a > 0. 

Solution.  From  the  definition  (6)  we  obtain  by  integration 


9te^ax)  = 


1 


V2tt  J0 

i <r 


y/ffr  — (a  + iw) 


x=o  V27 T(a  + iw) 


This  proves  formula  5 of  Table  III  in  Sec.  11.10. 
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Physical  Interpretation:  Spectrum 

The  nature  of  the  representation  (7)  of  f(x)  becomes  clear  if  we  think  of  it  as  a superposition 
of  sinusoidal  oscillations  of  all  possible  frequencies,  called  a spectral  representation. 
This  name  is  suggested  by  optics,  where  light  is  such  a superposition  of  colors 
(frequencies).  In  (7),  the  “spectral  density”  /(w)  measures  the  intensity  of  fix)  in  the 
frequency  interval  between  w and  w + Aw  (Aw  small,  fixed).  We  claim  that,  in  connection 
with  vibrations,  the  integral 


I f (w)  1 2 dw 


can  be  interpreted  as  the  total  energy  of  the  physical  system.  Hence  an  integral  of  | / (w)|2 
from  a to  b gives  the  contribution  of  the  frequencies  w between  a and  b to  the  total  energy. 

To  make  this  plausible,  we  begin  with  a mechanical  system  giving  a single  frequency, 
namely,  the  harmonic  oscillator  (mass  on  a spring,  Sec.  2.4) 

my"  + ky  = 0. 

Here  we  denote  time  t by  x.  Multiplication  by  y gives  my' y"  + ky' y = 0.  By  integration, 

\mv2  + izky2  = Eq  = const 

where  u = y'  is  the  velocity.  The  first  term  is  the  kinetic  energy,  the  second  the  potential 
energy,  and  Eo  the  total  energy  of  the  system.  Now  a general  solution  is  (use  (3)  in 
Sec.  1 1.4  with  t = x) 

y = fli  cos  wo*  + b\  sin  wox  = c\elWoX  + c_1£,_m'°x,  w§  = k/m 

where  Ci  = («i  — ib\)/2,  c_i  = c±  = (a±  + ibf)/2.  We  write  simply  A = c\elWopc, 
B = c-ie~lWoX.  Then  y = A + B.  By  differentiation,  v = y'  = A1  + B'  = iwo (A  — B ). 
Substitution  of  v and  y on  the  left  side  of  the  equation  for  Eo  gives 

E0  = Imv2  + \ky2  = |m(/w0)2(A  - B)2  + \k{A  + B)2. 

Here  Wo  = k/m,  as  just  stated;  hence  rawjj  = k.  Also  i2  = —1,  so  that 

E0  = hk[-(A  - B)2  + (A  + B)2]  = 2 kAB  = 2kc1eiWoXc^1e~iWoX  = 2kc1c_1  = 2k|Cl|2. 

Hence  the  energy  is  proportional  to  the  square  of  the  amplitude  \ C\  \ . 

As  the  next  step,  if  a more  complicated  system  leads  to  a periodic  solution  y = f(x) 
that  can  be  represented  by  a Fourier  series,  then  instead  of  the  single  energy  term  Icql2 
we  get  a series  of  squares  \cn  of  Fourier  coefficients  cn  given  by  (6),  Sec.  11.4.  In  this 
case  we  have  a “discrete  spectrum”  (or  “point  spectrum”)  consisting  of  countably  many 
isolated  frequencies  (infinitely  many,  in  general),  the  corresponding  \cn\2  being  the 
contributions  to  the  total  energy. 

Finally,  a system  whose  solution  can  be  represented  by  an  integral  (7)  leads  to  the  above 
integral  for  the  energy,  as  is  plausible  from  the  cases  just  discussed. 
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Linearity.  Fourier  Transform  of  Derivatives 

New  transforms  can  be  obtained  from  given  ones  by  using 


Linearity  of  the  Fourier  Transform 

The  Fourier  transform  is  a linear  operation;  that  is,  for  any  functions  f(x)  and  g(x) 
whose  Fourier  transforms  exist  and  any  constants  a and  b,  the  Fourier  transform 
of  af  + bg  exists,  and 

(8)  ®(af  + bg)  = a&if)  + b&(g). 


This  is  true  because  integration  is  a linear  operation,  so  that  (6)  gives 

1 


&{af  (x)  + bg(x)}  = 


V277-  . 

1 


W(x)  + bg(x)]e 

00 

f(x)e~lwx  dx  + b 


lwxdx 

1 


V277  . 

= air- {f(x}}  + b&{g(x)}. 


\Z2tt  . 


g(x)e~ 


' dx 


In  applying  the  Fourier  transform  to  differential  equations,  the  key  property  is  that 
differentiation  of  functions  corresponds  to  multiplication  of  transforms  by  iw: 


Fourier  Transform  of  the  Derivative  of  f (x) 

Let  fix)  be  continuous  on  the  x-axis  and  fix)  —*0  as  |x|  — » °o.  Furthermore,  letf'ix) 
be  absolutely  integrable  on  the  x-axis.  Then 

(9)  9{f'(A}  = h&{fix)). 


From  the  definition  of  the  Fourier  transform  we  have 

1 


9{f\x)}  = 


V27T  . 


f ix)e~iwx  dx. 


Integrating  by  parts,  we  obtain 

9 {f' (pc) } = 1 


fix)e~ 


- i~iw) 


V277  L 

Since  fix)  — >0  as  \x\  °°,  the  desired  result  follows,  namely, 

®{f'(x))  = 0 + tw&{f(pc)}. 


fix)e~lwxdx 
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EXAMPLE  3 


THEOREM  4 


Two  successive  applications  of  (9)  give 

9 'if")  = ^w2F(/,)  = (iwf&if). 

Since  ( iw )2  = — w2,  we  have  for  the  transform  of  the  second  derivative  off 
(10)  nf'ix)}  = -w23 ;{/(*)}• 

Similarly  for  higher  derivatives. 

An  application  of  (10)  to  differential  equations  will  be  given  in  Sec.  12.6.  For  the  time 
being  we  show  how  (9)  can  be  used  to  derive  transforms. 

Application  of  the  Operational  Formula  (9) 

Find  the  Fourier  transform  of  xe~^  from  Table  III,  Sec  11.10. 

Solution.  We  use  (9).  By  formula  9 in  Table  III 

9(xe-^)  = sq-lor*2)'} 

= -|3T(e-*V) 


= _J"  -«*/4 

2V2 


Convolution 

The  convolution/*  g of  functions / and  g is  defined  by 


(ID 


h(x)  = (/*  g)(x) 


00 

f(p)g(x  ~ p)  dp 


00 

fix  ~ p)g(p)  dp. 


The  purpose  is  the  same  as  in  the  case  of  Laplace  transforms  (Sec.  6.5):  taking  the 
convolution  of  two  functions  and  then  taking  the  transform  of  the  convolution  is  the  same 
as  multiplying  the  transforms  of  these  functions  (and  multiplying  them  by  V277): 


Convolution  Theorem 

Suppose  that  f(x)  and  g(x)  are  piecewise  continuous,  bounded,  and  absolutely 
integrable  on  the  x-axis.  Then 

(12)  ^(f*g)  = V27T^(f)^(g). 
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By  the  definition. 


9 Hf*g)  = 


1 

V277  . 


f(P)g(x 


An  interchange  of  the  order  of  integration  gives 


p ) dp  e lwx  dx. 


1 

V277 


00 


f(p)g  (x 


— oc  — ot 


„\  —iwx  1 1 

p)  e dx  dp. 


Instead  of  x we  now  take  x — p = q as  a new  variable  of  integration.  Then  x = p + q 
and 


®(f*  g) 


1 

V27 T 


f{p)g{q)e-iw^dqdp. 


This  double  integral  can  be  written  as  a product  of  two  integrals  and  gives  the  desired 
result 


9{f*g) 


1 

V277 


f(.P)e~iwp  dp 


g(q)e-iwUq 


= — l=[V2^®(f)][V2^®(g)]  = Vto9(J)9(g). 

V2tt 

By  taking  the  inverse  Fourier  transform  on  both  sides  of  (12),  writing  / = 8F(/)  and 
g = 8F(g)  as  before,  and  noting  that  V27T  and  1/V27T  in  (12)  and  (7)  cancel  each  other, 
we  obtain 


(13) 


if*  g)  (x) 


f(w)g  (w)elwx  dw, 


a formula  that  will  help  us  in  solving  partial  differential  equations  (Sec.  12.6). 

Discrete  Fourier  Transform  (DFT), 

Fast  Fourier  Transform  (FFT) 

In  using  Fourier  series,  Fourier  transforms,  and  trigonometric  approximations  (Sec.  11.6) 
we  have  to  assume  that  a function /(x),  to  be  developed  or  transformed,  is  given  on  some 
interval,  over  which  we  integrate  in  the  Euler  formulas,  etc.  Now  very  often  a function /(x) 
is  given  only  in  terms  of  values  at  finitely  many  points,  and  one  is  interested  in  extending 
Fourier  analysis  to  this  case.  The  main  application  of  such  a “discrete  Fourier  analysis” 
concerns  large  amounts  of  equally  spaced  data,  as  they  occur  in  telecommunication,  time 
series  analysis,  and  various  simulation  problems.  In  these  situations,  dealing  with  sampled 
values  rather  than  with  functions,  we  can  replace  the  Fourier  transform  by  the  so-called 
discrete  Fourier  transform  (DFT)  as  follows. 
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Let  fix)  be  periodic,  for  simplicity  of  period  277.  We  assume  that  N measurements  of 
/(x)  are  taken  over  the  interval  0 Si  x g 277  at  regularly  spaced  points 

277  k 

(14)  xk  = —^~,  k = 0,  l,  • • ■ , N — 1. 

We  also  say  that  /(x)  is  being  sampled  at  these  points.  We  now  want  to  determine  a 

complex  trigonometric  polynomial 

JV-l 

(15)  q(x)  = 2 cnemXk 

n= 0 

that  interpolates  f(x)  at  the  nodes  (14),  that  is,  q(xk)  = f(xk),  written  out,  with  //,  denoting 


f(xk), 

(16) 

JV-l 

fk  =f(xk)  = q(xk)  = 2 cnemXk, 

n=  0 

k = 0,l,-",N-  1 

Hence  we  must  determine  the  coefficients  Co,  ■ ■ • , oy_i  such  that  (16)  holds.  We  do  this 
by  an  idea  similar  to  that  in  Sec.  11.1  for  deriving  the  Fourier  coefficients  by  using  the 
orthogonality  of  the  trigonometric  system.  Instead  of  integrals  we  now  take  sums.  Namely, 
we  multiply  (16)  by  e~lmXk  (note  the  minus!)  and  sum  over  k from  0 to  N — 1.  Then  we 
interchange  the  order  of  the  two  summations  and  insert  xk  from  (14).  This  gives 

2V-1  1V-1  JV-l  JV-l  JV-l 

(17)  ^he~imXk  = 22  c™ei(n“m)Xfc  = 2 cn  2 eiCn~m)2lTk/N. 

k=0  k= 0 n= 0 n= 0 k= 0 

Now 


e 


i(n—m)2rrk/N  ^i(n—m)2rr/N^k 


r. 


We  donote  [ ■ • • ] by  r.  For  n = m we  have  r = e = 1.  The  sum  of  these  terms  over  k 
equals  N,  the  number  of  these  terms.  For  n # in  we  have  r ¥=  l and  by  the  formula  for  a 
geometric  sum  [(6)  in  Sec.  15.1  with  q = r and  n = N — 1] 


N-l 


1 - r 


N 


2rfc  = Vr 

!/• — n 1 


= o 


k= 0 


because  rN  = 1;  indeed,  since  k,  m,  and  n are  integers. 


rN  = el(n  m)Z7rk  = cos  277 k{ii  — m)  + i sin  277 k(n  — m)  = 1 + 0 = 1. 


This  shows  that  the  right  side  of  (17)  equals  cmN.  Writing  n for  m and  dividing  by  N,  we 
thus  obtain  the  desired  coefficient  formula 


, JV-l 

(18*)  cn  = - 2 fke~inXk  fk  = f(xk),  n = 0,  1,  ■ ■ • , N ~ 1. 

/v  fc=0 

Since  computation  of  the  cn  (by  the  fast  Fourier  transform,  below)  involves  successive 
halfing  of  the  problem  size  N,  it  is  practical  to  drop  the  factor  1 / N from  cn  and  define  the 
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discrete  Fourier  transform  of  the  given  signal  f = [f0  ■ • • /jv-ilT  to  be  the  vector 

f = [/ o f n—i\  with  components 

N-l 

(18)  fn  = Ncn=  2 fke~inx\  A =f(xk),  n = 0,  ■ • • , N « 1 . 

k= 0 


This  is  the  frequency  spectrum  of  the  signal. 

In  vector  notation,  f = F,vf,  where  the  N X N Fourier  matrix  FlV  = [enk]  has  the 
entries  [given  in  (18)] 

(19)  enk  = e~inXk  = e-2mnk'N  = wnk,  w = wN  = e~2rri/N, 

where  n,  k = 0,  • ■ • , N — 1 . 


Discrete  Fourier  Transform  (DFT).  Sample  of  N = 4 Values 

LetJV  = 4 measurements  (sample  values)  be  given.  Then  w = e~2m/N  = e_7rl/2  = — and  thus  wnk  = (—i)nk. 
Let  the  sample  values  be,  say  f = [0  1 4 9]T.  Then  by  (18)  and  (19), 


(20) 


f = F4f  = 


f = 


1 1 
-i  -1 
-1  1 
i -1 


-1 

—i 


14 

-4  + 8( 
-6 

-4  - 8i 


From  the  first  matrix  in  (20)  it  is  easy  to  infer  what  Fn  looks  like  for  arbitrary  N,  which  in  practice  may  be 
1000  or  more,  for  reasons  given  below. 

From  the  DFT  (the  frequency  spectrum)  f = F^f  we  can  recreate  the  given  signal 

f = F.v  *f,  as  we  shall  now  prove.  Here  lAy  and  its  complex  conjugate  FN  = — [wnk] 
satisfy 

(21a)  FjyFjy  = FjvFjv  = AT 


where  I is  the  N X N unit  matrix;  hence  Fjv  has  the  inverse 


(21b) 


Fn1  - 


We  prove  (21).  By  the  multiplication  rule  (row  times  column)  the  product  matrix 
GN  = FjvFjv  = [gjkl  in  (21a)  has  the  entries  gjk  = Row  j of  Fjv  times  Column  k of  Fjv- 
That  is,  writing  W = w:'wk,  we  prove  that 


gjk  = (wAk)°  + (Wjwkf  + • • • + (H'JWfc)iV  1 


= w°  + w1  + ■ ■ ■ + WN~k 


0 if  j ¥=  k 

N if  j = k. 
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Indeed,  when  j = k,  then  wkwk  = (ww)k  = (e2m/N 'e~2'n"l/N'jk  = so  that  the  sum 

of  these  N terms  equals  /V;  these  are  the  diagonal  entries  of  G,v-  Also,  when  j A k,  then 
W A 1 and  we  have  a geometric  sum  (whose  value  is  given  by  (6)  in  Sec.  15.1  with  q = W 
and  n = N — \) 


W°  + W1  + • ■ ■ + = — = 0 

1 — W 

because  WN  = (yPwY  = (e2,rV(e_2,ri)fc  = lj  ■ lk  = 1. 

We  have  seen  that  f is  the  frequency  spectrum  of  the  signal  fix).  Thus  the  components 
fn  of  f give  a resolution  of  the  277-periodic  function /(x)  into  simple  (complex)  harmonics. 
Here  one  should  use  only  n’ s that  are  much  smaller  than  N/2,  to  avoid  aliasing.  By  this 
we  mean  the  effect  caused  by  sampling  at  too  few  (equally  spaced)  points,  so  that,  for 
instance,  in  a motion  picture,  rotating  wheels  appear  as  rotating  too  slowly  or  even  in  the 
wrong  sense.  Hence  in  applications,  N is  usually  large.  But  this  poses  a problem.  Eq.  (18) 
requires  0{N)  operations  for  any  particular  n,  hence  0(NZ ) operations  for,  say,  all 
n < N/2.  Thus,  already  for  1000  sample  points  the  straightforward  calculation  would 
involve  millions  of  operations.  However,  this  difficulty  can  be  overcome  by  the  so-called 
fast  Fourier  transform  (FFT),  for  which  codes  are  readily  available  (e.g.,  in  Maple).  The 
FFT  is  a computational  method  for  the  DFT  that  needs  only  O ( N)  log2  N operations 
instead  of  0(N2).  It  makes  the  DFT  a practical  tool  for  large  N.  Here  one  chooses  N = 2P 
(p  integer)  and  uses  the  special  form  of  the  Fourier  matrix  to  break  down  the  given  problem 
into  smaller  problems.  For  instance,  when  N = 1000,  those  operations  are  reduced  by  a 
factor  1000/log2  1000  ~ 100. 

The  breakdown  produces  two  problems  of  size  M = N/2.  This  breakdown  is  possible 
because  for  N = 2 M we  have  in  (19) 

tvjv  — W2m  ~~  (fi  ) — e — e — wm- 

The  given  vector  f = [f0  ■ ■ ■ f^-iV  is  split  into  two  vectors  with  M components  each, 
namely,  fev  = [f0  f2  • • • /jv-2]T  containing  the  even  components  of  f,  and  focj  = 
ifi  fz  ' ' ' /jv-i]T  containing  the  odd  components  of  f.  For  fev  and  focj  we  determine 
the  DFTs 


fev  ifev,0  fev, 2 ' ' ' fev,N— 2!  fev 

and 

fod  — Ifo d,l  fo d,3  ' ’ ' ,/od,JV— l]  — f^Mfod 

involving  the  same  M X M matrix  Fm-  From  these  vectors  we  obtain  the  components  of 
the  DFT  of  the  given  vector /by  the  formulas 

(a)  fn  — fev,n  A wN.foA,n 

(b)  fn+M  — fev,n  ~ wtslfoA,n 


(22) 


n = 0,  • ■ • , M - 1 
n = - 1. 
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For  N = 2P  this  breakdown  can  be  repeated  p — I times  in  order  to  finally  arrive  at  N/2 
problems  of  size  2 each,  so  that  the  number  of  multiplications  is  reduced  as  indicated 
above. 

We  show  the  reduction  from  N = 4 to  M = N/2  = 2 and  then  prove  (22). 

Fast  Fourier  Transform  (FFT).  Sample  of  N = 4 Values 

Wlien  N = 4,thenw  = Wjv  = — / as  in  Example  4 and  M = N/2  = 2,hencew  = = e~2m^2  = e~m  = — 1. 

Consequently, 


fo 

"l  f 

fo 

fo  +A 

f ev  = 

- F2fev  — 

= 

f 2_ 

1 -1 

hm 

fo  - h_ 

fl 

T T 

'fi 

>i+  h 

fod  = 

A 

= F2fod  = 

= 

_f  3 

i -l 

A 

A - A. 

From  this  and  (22a)  we  obtain 

fo  = /ev,  o + w%foa,0  = (/o  + A)  + ( A + A)  = fo  + A + A + A 
fi  = /ev, i + «’N/"od,i  = (/o  - /2)  - «(A  + h)  = fo  - ifi  -fz+  fo- 
Similarly,  by  (22b), 

k = /ev, o - wS/od,0  = (/o  + /a)  - (A  + A)  = /o  - fi  + h - h 
k =/ev,  i — v''w/od,i  = (fo  “A)  ~ (~0(fi  “A)  =/o  + ifi  ~ A — !A- 
This  agrees  with  Example  4,  as  can  be  seen  by  replacing  0,  1 , 4,  9 with  /0,  /i,  A>  A- 


We  prove  (22).  From  (18)  and  (19)  we  have  for  the  components  of  the  DFT 

iV-l 

/n  = 2 wNlfk- 

k= 0 

Splitting  into  two  sums  of  M = A/2  terms  each  gives 

M-l  M-l 

/m  = H H'ftlc  + 2W®fc  + 1)”/2fc  + l. 

fe=0  fe=0 

We  now  use  vv,v  = wm  and  pull  out  wjv  from  under  the  second  sum,  obtaining 

M-l  M-l 

(23)  fn  = 2 wMUfev,k  + 2 vvM>Yod,fc- 

fc=0  fc=0 

The  two  sums  are  /ev  r,  and/otjn,  the  components  of  the  “half-size”  transforms  Ffev  and 

Ffod. 

Formula  (22a)  is  the  same  as  (23).  In  (22b)  we  have  n + M instead  of  n.  This  causes 
a sign  changes  in  (23),  namely  — wjv  before  the  second  sum  because 

, M —27 riM/N  „—2Tri/2  — tti  i 

w’jv  — e ' = e ' = e = —1. 


This  gives  the  minus  in  (22b)  and  completes  the  proof. 
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FRQB1=EZM=SIET— 11=» 


1.  Review  in  complex.  Show  that  1 /i  = —i,  e lx  = 

■ • ix  i —ix  o ix  —ix 

cos  x — i sin  x,  e + e =2  cos  x,  e — e = 
2 i sin  x,  etkx  = cos  la c + i sin  la r. 


2-11 


FOURIER  TRANSFORMS  BY 


INTEGRATION 


Find  the  Fourier  transform  of/(x)  (without  using  Table 
III  in  Sec.  1 1.10).  Show  details. 


2.  f(x) 


3.  fix) 


(e2ix  if  — 1 < x < 1 
1 0 otherwise 

f 1 if  a < x < b 
1 0 otherwise 


4-  fix) 


je^  if  x < 0 (k  > 0) 
Id  if  x > 0 


5.  fix) 


( ex  if  —a  < x < a 
1 0 otherwise 


6.  fix)  = e (—oo  < x < oo) 


7.  fix)  = 


if  0 < x < a 

otherwise 


8-  m 


( xe  x if  — 1 < x < 0 
1 0 otherwise 


9.  fix) 


\ x | if  — 1 < x < 1 
0 otherwise 


10.  fix) 


11.  fix) 


ix  if  — 1 < x < 1 
1 0 otherwise 

{-1  if  — 1 < x < 0 
1 if  0 < x < 1 
0 otherwise 
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USE  OF  TABLE  III  IN  SEC.  11.10. 


OTHER  METHODS 

12.  Find  S F(/(x))  for  fix)  = xe-x  if  x > 0 ,/(x)  = 0 if 
x < 0,  by  (9)  in  the  text  and  formula  5 in  Table  III 
(with  a = 1).  Hint.  Consider  xe~x  and  e~x. 


13.  Obtain  9ie~x  ^2)  from  Table  III. 

14.  In  Table  III  obtain  formula  7 from  formula  8. 

15.  In  Table  III  obtain  formula  1 from  formula  2. 

16.  TEAM  PROJECT.  Shifting  (a)  Show  that  if  fix) 
has  a Fourier  transform,  so  does  fix  — a),  and 
3 F{/(x  - a)}  = e~iwa9{fix)). 


(b)  Using  (a),  obtain  formula  1 in  Table  III,  Sec.  1 1.10, 
from  formula  2. 


(c)  Shifting  on  the  w- Axis.  Show  that  if/(w)  is  the 
Fourier  transform  of/(x),  then/(w  — a)  is  the  Fourier 
transform  of  e“x/(x). 


(d)  Using  (c),  obtain  formula  7 in  Table  III  from  1 and 
formula  8 from  2. 


17.  What  could  give  you  the  idea  to  solve  Prob.  1 1 by  using 
the  solution  of  Prob.  9 and  formula  (9)  in  the  text? 
Would  this  work? 


18-25 


DISCRETE  FOURIER  TRANSFORM 


18.  Verify  the  calculations  in  Example  4 of  the  text. 

19.  Find  the  transform  of  a general  signal 
/ = Ifi  h h UV  °f  four  values. 

20.  Find  the  inverse  matrix  in  Example  4 of  the  text  and 
use  it  to  recover  the  given  signal. 

21.  Find  the  transform  (the  frequency  spectrum)  of  a 
general  signal  of  two  values  \f\  f2\J. 

22.  Recreate  the  given  signal  in  Prob.  21  from  the 
frequency  spectrum  obtained. 

23.  Show  that  for  a signal  of  eight  sample  values, 
w = e-i/4  = (1  — i)/V 2.  Check  by  squaring. 

24.  Write  the  Fourier  matrix  F for  a sample  of  eight  values 
explicitly. 

25.  CAS  Problem.  Calculate  the  inverse  of  the  8X8 
Fourier  matrix.  Transform  a general  sample  of  eight 
values  and  transform  it  back  to  the  given  data. 


534 


CHAP.  11  Fourier  Analysis 


11.10  Tables  of  Transforms 

Table  i Fourier  Cosine  Transforms 

See  (2)  in  Sec.  11.8. 


SEC.  11.10  Tables  of  Transforms 
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Table  Fourier  Sine  Transforms 

See  (5)  in  Sec.  1 1.8. 
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Table  Fourier  Transforms 


See  (6)  in  Sec.  11.9. 


Chapter  11  Review  Questions  and  Problems 
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^BEBOE£e=EEEBEEgaE^BEES  T I O N S AND  PROBLEMS 


1.  What  is  a Fourier  series?  A Fourier  cosine  series?  A 
half-range  expansion?  Answer  from  memory. 

2.  What  are  the  Euler  formulas?  By  what  very  important 
idea  did  we  obtain  them? 

3.  How  did  we  proceed  from  277-periodic  to  general- 
periodic  functions? 

4.  Can  a discontinuous  function  have  a Fourier  series?  A 
Taylor  series?  Why  are  such  functions  of  interest  to  the 
engineer? 

5.  What  do  you  know  about  convergence  of  a Fourier 
series?  About  the  Gibbs  phenomenon? 

6.  The  output  of  an  ODE  can  oscillate  several  times  as 
fast  as  the  input.  How  come? 

7.  What  is  approximation  by  trigonometric  polynomials? 
What  is  the  minimum  square  error? 

8.  What  is  a Fourier  integral?  A Fourier  sine  integral? 
Give  simple  examples. 

9.  What  is  the  Fourier  transform?  The  discrete  Fourier 
transform? 


10.  What  are  Sturm-Liouville  problems?  By  what  idea  are 
they  related  to  Fourier  series? 


11-20  FOURIER  SERIES.  In  Probs.  11,  13, 16,  20  find 
the  Fourier  series  of  fix)  as  given  over  one  period  and 
sketch /(x)  and  partial  sums.  In  Probs.  12,  14,  15,  17-19 
give  answers,  with  reasons.  Show  your  work  detail. 


ii.  m = 


i3.  m = 


f0  if  — 2 < x < 0 
12  if  0 < x <2 
12.  Why  does  the  series  in  Prob.  1 1 have  no  cosine  terms? 
f0  if  — 1 < x < 0 
tx  if  0 < x < 1 


14.  What  function  does  the  series  of  the  cosine  terms  in 
Prob.  13  represent?  The  series  of  the  sine  terms? 

15.  What  function  do  the  series  of  the  cosine  terms  and  the 
series  of  the  sine  terms  in  the  Fourier  series  of 
ex  (—  5 < x < 5)  represent? 

16.  f{x)  = |x|  (—71  < x < 77) 


17.  Find  a Fourier  series  from  which  you  can  conclude  that 

1 - 1/3  + 1/5  - 1/7  + = tt/4. 

18.  What  function  and  series  do  you  obtain  in  Prob.  16  by 
(termwise)  differentiation? 

19.  Find  the  half-range  expansions  of  f{x)  = x 

(0  < x < 1). 

20.  fix)  = 3x2  (-77  < x < 77) 


21-22 


GENERAL  SOLUTION 


Solve,  y"  + co2y  = r(t),  where  |tu|  A 0,  1,  2,  ■ • • , r(t)  is 
277-periodic  and 

21.  r(t)  = 3f2  ( — 77  < t < 77) 

22.  r(t)  = |f|  (—77  < t < 77) 


23-25 


MINIMUM  SQUARE  ERROR 


23.  Compute  the  minimum  square  error  for  fix)  = x/77 
( — 77  < x < 77)  and  trigonometric  polynomials  of 
degree  N = 1,  ■ ■ ■ , 5. 


24.  How  does  the  minimum  square  error  change  if  you 
multiply  fix)  by  a constant  kl 

25.  Same  task  as  in  Prob.  23,  for  fix)  = |x|/77 
(—77  < x < 77).  Why  is  E*  now  much  smaller  (by  a 
factor  100,  approximately!)? 


26-30 


FOURIER  INTEGRALS  AND  TRANSFORMS 


Sketch  the  given  function  and  represent  it  as  indicated.  If  you 
have  a CAS,  graph  approximate  curves  obtained  by  replacing 
00  with  finite  limits;  also  look  for  Gibbs  phenomena. 


26.  fix ) = x + 1 if  0 < x < 1 and  0 otherwise;  by  the 
Fourier  sine  transform 


27.  fix ) = xifO  < x < 1 and  0 otherwise;  by  the  Fourier 
integral 

28.  fix)  = kx  if  a < x < b and  0 otherwise;  by  the  Fourier 
transform 


29.  fix)  = x if  1 < x < a and  0 otherwise;  by  the  Fourier 
cosine  transform 

30.  fix)  = e~2x  if  x > 0 and  0 otherwise;  by  the  Fourier 
transform 
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SUMMARY  OF  CH  APTER  1 1 ~ ■ 

Fourier  Analysis.  Partial  Differential  Equations  (PDEs) 


Fourier  series  concern  periodic  functions  fix)  of  period  p = 2 L,  that  is,  by 
definition  fix  + p ) = fix)  for  all  x and  some  fixed  p > 0;  thus,  fix  + np ) = fix) 
for  any  integer  n.  These  series  are  of  the  form 

oo  / \ 

/ Tlf^T  \ 

(1)  fix)  = a0  + 2 I ancos—— x + bn  sin—— x]  (Sec.  11.2) 

with  coefficients,  called  the  Fourier  coefficients  of fix),  given  by  the  Euler  formulas 
(Sec.  11.2) 


ao 


1 

2 L J 


fix)  dx,  an  = 


(2) 


-L 


L J 


\ n7TX  j 
fix)  cos  dx 


bn. 


L J 


t(  s • nTTx 
fix)  sin  dx 


where  n = 1,  2,  • ■ ■ . For  period  277  we  simply  have  (Sec.  11.1) 


oo 

(1*)  fix)  = a0  +2  iancosnx  + bn  sin  nx) 

n= 1 


with  the  Fourier  coefficients  of  fix)  (Sec.  11.1) 


«o 


277 


1 

. , 1 

fix)  dx,  an  = — 

fix)  cos  nx  dx,  bn  = — 

77  J 

— 7 T 

77  J 

— 7T 

/(x)  sin  nx  dx. 


Fourier  series  are  fundamental  in  connection  with  periodic  phenomena,  particularly 
in  models  involving  differential  equations  (Sec.  11.3,  Chap,  12).  If  fix)  is  even 
[/(— x)  = fix)]  or  odd  [/(— x)  = —fix)],  they  reduce  to  Fourier  cosine  or  Fourier 
sine  series,  respectively  (Sec.  11.2).  If  fix)  is  given  for  0 only,  it  has  two 

half-range  expansions  of  period  2 L,  namely,  a cosine  and  a sine  series  (Sec.  11.2). 

The  set  of  cosine  and  sine  functions  in  (1)  is  called  the  trigonometric  system. 
Its  most  basic  property  is  its  orthogonality  on  an  interval  of  length  2 L;  that  is,  for 
all  integers  m and  n =£  m we  have 


,L 

COS 

-L 


mTTx 

cos 

L 


J17TX 

L 


dx  = 0, 


,L 

sin 

-L 


imrx 

L 


sin 


nTTx 

L 


dx  = 0 


and  for  all  integers  m and  n. 


nnrx  . nTTx  , 
cos sin dx  = 0. 


This  orthogonality  was  crucial  in  deriving  the  Euler  formulas  (2). 


Summary  of  Chapter  11 
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Partial  sums  of  Fourier  series  minimize  the  square  error  (Sec.  1 1 .4). 

Replacing  the  trigonometric  system  in  (1)  by  other  orthogonal  systems  first  leads 
to  Sturm-Liouville  problems  (Sec.  11.5),  which  are  boundary  value  problems  for 
ODEs.  These  problems  are  eigenvalue  problems  and  as  such  involve  a parameter 
A that  is  often  related  to  frequencies  and  energies.  The  solutions  to  Sturm-Liouville 
problems  are  called  eigenfunctions . Similar  considerations  lead  to  other  orthogonal 
series  such  as  F ourier-Legendre  series  and  Fourier-Bessel  series  classified  as 
generalized  Fourier  series  (Sec.  11.6). 

Ideas  and  techniques  of  Fourier  series  extend  to  nonperiodic  functions /(T)  defined 
on  the  entire  real  line;  this  leads  to  the  Fourier  integral 


(3) 


m 


[A(w)  cos  wx  + B(w ) sin  wx]  dw 
■'o 


(Sec.  11.7) 


where 


(4) 


A(w)  = — 


f(v)  cos  wv  dv,  B(w)  = ^ 


f(v)  sin  wv  dv 


or,  in  complex  form  (Sec.  11.9), 


(5) 

where 

(6) 


f(x)  = 


V27 T . 


fiw)elwxdw 


a = v^i) 


f(w)  = 


V2tt  . 


f(x)e~lwxdx. 


Formula  (6)  transforms /(x)  into  its  Fourier  transform /(w),  and  (5)  is  the  inverse 
transform. 

Related  to  this  are  the  Fourier  cosine  transform  (Sec.  11.8) 


(7) 


/c(w)  = J — 


f(x ) cos  wx  dx 


and  the  Fourier  sine  transform  (Sec.  11.8) 

^2 


(8) 


fsiw)  = J — 


f(x)  sin  wx  dx . 


The  discrete  Fourier  transform  (DFT)  and  a practical  method  of  computing  it, 
called  the  fast  Fourier  transform  (FFT),  are  discussed  in  Sec.  11.9. 


CHAPTER 


1 2 


Partial  Differential 
Equations  (PDEs) 


A PDE  is  an  equation  that  contains  one  or  more  partial  derivatives  of  an  unknown  function 
that  depends  on  at  least  two  variables.  Usually  one  of  these  deals  with  time  t and  the 
remaining  with  space  (spatial  variable(s)).  The  most  important  PDEs  are  the  wave 
equations  that  can  model  the  vibrating  string  (Secs.  12.2,  12.3,  12.4,  12.12)  and  the 
vibrating  membrane  (Secs.  12.8,  12.9,  12.10),  the  heat  equation  for  temperature  in  a bar 
or  wire  (Secs.  12.5,  12.6),  and  the  Laplace  equation  for  electrostatic  potentials  (Secs. 
12.6,  12.10,  12.11).  PDEs  are  very  important  in  dynamics,  elasticity,  heat  transfer, 
electromagnetic  theory,  and  quantum  mechanics.  They  have  a much  wider  range  of 
applications  than  ODEs,  which  can  model  only  the  simplest  physical  systems.  Thus  PDEs 
are  subjects  of  many  ongoing  research  and  development  projects. 

Realizing  that  modeling  with  PDEs  is  more  involved  than  modeling  with  ODEs,  we 
take  a gradual,  well-planned  approach  to  modeling  with  PDEs.  To  do  this  we  carefully 
derive  the  PDE  that  models  the  phenomena,  such  as  the  one-dimensional  wave  equation 
for  a vibrating  elastic  string  (say  a violin  string)  in  Sec.  12.2,  and  then  solve  the  PDE 
in  a separate  section,  that  is.  Sec.  12.3.  In  a similar  vein,  we  derive  the  heat  equation  in 
Sec.  12.5  and  then  solve  and  generalize  it  in  Sec.  12.6. 

We  derive  these  PDEs  from  physics  and  consider  methods  for  solving  initial  and 
boundary  value  problems,  that  is,  methods  of  obtaining  solutions  which  satisfy  the 
conditions  required  by  the  physical  situations.  In  Secs.  12.7  and  12.12  we  show  how  PDEs 
can  also  be  solved  by  Fourier  and  Laplace  transform  methods. 

COMMENT.  Numerics  for  PDEs  is  explained  in  Secs.  21.4-21.7,  which,  for  greater 
teaching  flexibility,  is  designed  to  be  independent  of  the  other  sections  on  numerics  in 
Part  E. 

Prerequisites:  Linear  ODEs  (Chap.  2),  Fourier  series  (Chap.  11). 

Sections  that  may  be  omitted  in  a shorter  course:  12.7,  12.10-12.12. 

References  and  Answers  to  Problems:  App.  1 Part  C,  App.  2. 

12.  Basic  Concepts  of  PDEs 

A partial  differential  equation  (PDE)  is  an  equation  involving  one  or  more  partial 
derivatives  of  an  (unknown)  function,  call  it  u,  that  depends  on  two  or  more  variables, 
often  time  t and  one  or  several  variables  in  space.  The  order  of  the  highest  derivative  is 
called  the  order  of  the  PDE.  Just  as  was  the  case  for  ODEs,  second-order  PDEs  will  be 
the  most  important  ones  in  applications. 
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EXAMPLE  1 


THEOREM  1 


Just  as  for  ordinary  differential  equations  (ODEs)  we  say  that  a PDE  is  linear  if  it  is 
of  the  first  degree  in  the  unknown  function  u and  its  partial  derivatives.  Otherwise  we 
call  it  nonlinear.  Thus,  all  the  equations  in  Example  1 are  linear.  We  call  a linear  PDE 
homogeneous  if  each  of  its  terms  contains  either  u or  one  of  its  partial  derivatives. 
Otherwise  we  call  the  equation  nonhomogeneous.  Thus,  (4)  in  Example  1 (with  / not 
identically  zero)  is  nonhomogeneous,  whereas  the  other  equations  are  homogeneous. 


Important  Second-Order  PDEs 


(1) 

d2lt  2 d2u 

~as~c  te2 

One-dimensional  wave  equation 

(2) 

du  2 d2u 

at  ~ c ax2 

One-dimensional  heat  equation 

(3) 

a2u  a2u 

+ = 0 

ax2  ay2 

Two-dimensional  Laplace  equation 

(4) 

a2u  a\ 

— + — = /(*,  y) 
ax  ay 2 

Two-dimensional  Poisson  equation 

(5) 

a2u  2 / d2w  a2u\ 

at2  W ay2) 

Two-dimensional  wave  equation 

(6) 

a2u  a2u  a2u 

— + — + =0 

ax2  ay2  az2 

Three-dimensional  Laplace  equation 

Here  c is  a positive  constant,  t is  time,  x , y,  z are  Cartesian  coordinates,  and  dimension  is  the  number  of  these 
coordinates  in  the  equation. 


A solution  of  a PDE  in  some  region  R of  the  space  of  the  independent  variables  is  a 
function  that  has  all  the  partial  derivatives  appearing  in  the  PDE  in  some  domain  D 
(definition  in  Sec.  9.6)  containing  R,  and  satisfies  the  PDE  everywhere  in  R. 

Often  one  merely  requires  that  the  function  is  continuous  on  the  boundary  of  R,  has 
those  derivatives  in  the  interior  of  R , and  satisfies  the  PDE  in  the  interior  of  R.  Letting  R 
lie  in  D simplifies  the  situation  regarding  derivatives  on  the  boundary  of  R,  which  is  then 
the  same  on  the  boundary  as  it  is  in  the  interior  of  R. 

In  general,  the  totality  of  solutions  of  a PDE  is  very  large.  For  example,  the  functions 

(7)  u = x — y , u = e cos  y,  u = sin  x cosh  y,  u = In  (.r  + y ) 

which  are  entirely  different  from  each  other,  are  solutions  of  (3),  as  you  may  verify.  We 
shall  see  later  that  the  unique  solution  of  a PDE  corresponding  to  a given  physical  problem 
will  be  obtained  by  the  use  of  additional  conditions  arising  from  the  problem.  For  instance, 
this  may  be  the  condition  that  the  solution  u assume  given  values  on  the  boundary  of  the 
region  R (“boundary  conditions”).  Or,  when  time  t is  one  of  the  variables,  u (or  ut  = du/dt 
or  both)  may  be  prescribed  at  t = 0 (“initial  conditions”). 

We  know  that  if  an  ODE  is  linear  and  homogeneous,  then  from  known  solutions  we 
can  obtain  further  solutions  by  superposition.  For  PDEs  the  situation  is  quite  similar: 


Fundamental  Theorem  on  Superposition 

If  u ] and  u2  are  solutions  of  a homogeneous  linear  PDE  in  some  region  R,  then 

U = C\Uy  + C2U2 

with  any  constants  C\  and  c2  is  also  a solution  of  that  PDE  in  the  region  R. 
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The  simple  proof  of  this  important  theorem  is  quite  similar  to  that  of  Theorem  1 in  Sec.  2. 1 
and  is  left  to  the  student. 

Verification  of  solutions  in  Probs.  2-13  proceeds  as  for  ODEs.  Problems  16-23  concern 
PDEs  solvable  like  ODEs.  To  help  the  student  with  them,  we  consider  two  typical  examples. 

Solving  uxx  — u — 0 Like  an  ODE 

Find  solutions  u of  the  PDE  uxx  — u = 0 depending  on  x and  y. 

Solution.  Since  no  y-derivatives  occur,  we  can  solve  this  PDE  like  u"  — u = 0.  In  Sec.  2.2  we  would  have 
obtained  u = Aex  + Be~x  with  constant  A and  B.  Here  A and  B may  be  functions  of  y,  so  that  the  answer  is 

u(x,y)  = A(y)ex  + B{y)e~x 

with  arbitrary  functions  A and  B.  We  thus  have  a great  variety  of  solutions.  Check  the  result  by  differentiation. 

EXAMPLE  3 Solving  uxy  = -ux  Like  an  ODE 

Find  solutions  u = u(x,  y)  of  this  PDE. 

Solution.  Setting  ux  = p,  we  have  py  = —p,  py/p  = — 1,  In  \p\  = —y  + c(x),  p = c{x)e~y  and  by 
integration  with  respect  to  x, 

u (a,  y)  = f{x)e~y  + g(y)  where  f(x ) = ^c(x)dx, 

here,  fix)  and  g(y)  are  arbitrary. 


1.  Fundamental  theorem.  Prove  it  for  second-order 
PDEs  in  two  and  three  independent  variables.  Hint. 
Prove  it  by  substitution. 


2-13  VERIFICATION  OF  SOLUTIONS 

Verifiy  (by  substitution)  that  the  given  function  is  a solution 
of  the  PDE.  Sketch  or  graph  the  solution  as  a surface  in  space. 


2-5  Wave  Equation  (1)  with  suitable  c 

2.  u — x + t 

3 . u — cos  At  sin  2x 

4.  u = sin  kct  cos  kx 

5.  u — sin  at  sin  bx 


6-9 


Heat  Equation  (2)  with  suitable  c 


7.  u = e~ 

8.  u = e~ 


10-13  Laplace  Equation  (3) 

10.  u = ex  cos  y,  ex  sin  y 

11.  u = arctan  {y/x) 

12.  u = cos  y sinh  x,  sin  y cosh  x 


13.  u = x/{x 2 + y2),  yj(x 2 + y2) 

14.  TEAM  PROJECT.  Verification  of  Solutions 

(a)  Wave  equation.  Verify  that  u (x,  t)  = v(x  + cl)  + 
w (x  — ct ) with  any  twice  differentiable  functions  v and 
w satisfies  (1). 

(b)  Poisson  equation.  Verify  that  each  u satisfies  (4) 
with/(;t,  y)  as  indicated. 

u=y/x  f = 2y/x3 

u = sin  xy  / = (x2  + y2)  sin  xy 

u = ex2~y2  / = 4(jc2  + y2)ex2~y2 

u = l/Vx2  + y2  f=(x2  + y2)“3/2 

(c)  Laplace  equation.  Verify  that 

u = 1/Vjc2  + y2  + z2  satisfies  (6)  and 
n = In  (x2  + y2)  satisfies  (3).  Is  u = 1/Vi2  + y2  a 
solution  of  (3)?  Of  what  Poisson  equation? 

(d)  Verify  that  u with  any  (sufficiently  often  differ- 
entiable) v and  w satisfies  the  given  PDE. 

u — v( x)  + w(y)  uxy  = 0 

u = v(x)w(y)  tiuXy  = uxuy 

u = v(x  + 2r)  + w(x  — 2 1)  utt  = 4 uxx 

15.  Boundary  value  problem.  Verify  that  the  function 
u(x,  y)  = a In  (x2  + y2)  + b satisfies  Laplace's  equation 
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(3)  and  determine  a and  b so  that  u satisfies  the 
boundary  conditions  u = 110  on  the  circle 
x2  + y2  = 1 and  « = 0on  the  circle  x2  + y2  = 100. 

PDEs  SOLVABLE  AS  ODEs 

This  happens  if  a PDE  involves  derivatives  with  respect  to 
one  variable  only  (or  can  be  transformed  to  such  a form), 
so  that  the  other  variable(s)  can  be  treated  as  parameter(s). 
Solve  for  u = u(x,  y): 

16.  UyV  = 0 17.  uxx  + 167 t2u  = 0 


18.  25 uyy  — Au  = 0 19.  uy  + y2u  = 0 

20.  2 uxx  + 9 ux  + 4 u = —3  cosx  — 29  sinx 

21.  Uyy  + 6Uy  +13  u — 4e:>‘y 

22.  uxy  = ux  23.  x2uxx  + 2 xux  — 2u  = 0 

24.  Surface  of  revolution.  Show  that  the  solutions  z = 
z(x,  y)  of  yzx  — xzy  represent  surfaces  of  revolution.  Give 
examples.  Hint.  Use  polar  coordinates  r,  0 and  show  that 
the  equation  becomes  z$  = 0. 

25.  System  of  PDEs.  Solve  uxx  = 0,  uyv  = 0 


Modeling:  Vibrating  String,  Wave  Equation 

In  this  section  we  model  a vibrating  string,  which  will  lead  to  our  first  important  PDE, 
that  is,  equation  (3)  which  will  then  be  solved  in  Sec.  12.3.  The  student  should  pay  very 
close  attention  to  this  delicate  modeling  process  and  detailed  derivation  starting  from 
scratch,  as  the  skills  learned  can  be  applied  to  modeling  other  phenomena  in  general  and 
in  particular  to  modeling  a vibrating  membrane  (Sec.  12.7). 

We  want  to  derive  the  PDE  modeling  small  transverse  vibrations  of  an  elastic  string,  such 
as  a violin  string.  We  place  the  string  along  the  x-axis,  stretch  it  to  length  L,  and  fasten  it 
at  the  ends  x = 0 and  x = L.  We  then  distort  the  string,  and  at  some  instant,  call  it  t = 0, 
we  release  it  and  allow  it  to  vibrate.  The  problem  is  to  determine  the  vibrations  of  the  string, 
that  is,  to  find  its  deflection  u(x,  t)  at  any  point  x and  at  any  time  t > 0;  see  Fig.  286. 

u (x,  t)  will  be  the  solution  of  a PDE  that  is  the  model  of  our  physical  system  to  be 
derived.  This  PDE  should  not  be  too  complicated,  so  that  we  can  solve  it.  Reasonable 
simplifying  assumptions  (just  as  for  ODEs  modeling  vibrations  in  Chap.  2)  are  as  follows. 

Physical  Assumptions 

1.  The  mass  of  the  string  per  unit  length  is  constant  (“homogeneous  string”).  The  string 
is  perfectly  elastic  and  does  not  offer  any  resistance  to  bending. 

2.  The  tension  caused  by  stretching  the  string  before  fastening  it  at  the  ends  is  so  large 
that  the  action  of  the  gravitational  force  on  the  string  (trying  to  pull  the  string  down 
a little)  can  be  neglected. 

3.  The  string  performs  small  transverse  motions  in  a vertical  plane;  that  is,  every 
particle  of  the  string  moves  strictly  vertically  and  so  that  the  deflection  and  the  slope 
at  every  point  of  the  string  always  remain  small  in  absolute  value. 

Under  these  assumptions  we  may  expect  solutions  u (x,  t ) that  describe  the  physical 
reality  sufficiently  well. 


Fig.  286.  Deflected  string  at  fixed  time  t.  Explanation  on  p.  544 
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Derivation  of  the  PDE  of  the  Model 
(“Wave  Equation")  from  Forces 

The  model  of  the  vibrating  string  will  consist  of  a PDE  (“wave  equation”)  and  additional 
conditions.  To  obtain  the  PDE,  we  consider  the  forces  acting  on  a small  portion  of  the 
string  (Fig.  286).  This  method  is  typical  of  modeling  in  mechanics  and  elsewhere. 

Since  the  string  offers  no  resistance  to  bending,  the  tension  is  tangential  to  the  curve 
of  the  string  at  each  point.  Let  7)  and  T2  be  the  tension  at  the  endpoints  P and  Q of  that 
portion.  Since  the  points  of  the  string  move  vertically,  there  is  no  motion  in  the  horizontal 
direction.  Hence  the  horizontal  components  of  the  tension  must  be  constant.  Using  the 
notation  shown  in  Fig.  286,  we  thus  obtain 

( 1 ) Ti  cos  a = Tz  cos  [3  = T = const. 

In  the  vertical  direction  we  have  two  forces,  namely,  the  vertical  components  — 7}  sin  a 
and  72  sin  /3  of  7i  and  72;  here  the  minus  sign  appears  because  the  component  at  P is 
directed  downward.  By  Newton’s  second  law  (Sec.  2.4)  the  resultant  of  these  two  forces 
is  equal  to  the  mass  p Ax  of  the  portion  times  the  acceleration  d2u/dt2,  evaluated  at  some 
point  between  x and  x + Ax;  here  p is  the  mass  of  the  undeflected  string  per  unit  length, 
and  Ax  is  the  length  of  the  portion  of  the  undeflected  string.  (A  is  generally  used  to  denote 
small  quantities;  this  has  nothing  to  do  with  the  Laplacian  V2,  which  is  sometimes  also 
denoted  by  A.)  Hence 


T2  sin  (3  — 7i  sin  a = p Ax 


•>2 
d ll 


dt' 


Using  (1),  we  can  divide  this  by  T2  cos  f3  = T\  cos  a = T,  obtaining 

(2) 


T2  sin  B 7i  sin  a pAx  d2U 

T2  cos  (3  Tl  cos  a T dtz 


Now  tan  a and  tan  /3  are  the  slopes  of  the  string  at  x and  x + Ax: 

tan  /3  = 


tan  a = 


and 


x + Ax 


Here  we  have  to  write  partial  derivatives  because  11  also  depends  on  time  t.  Dividing  (2) 
by  Ax,  we  thus  have 


Ax 


x + Ax 


P d\ 
T dt2' 


If  we  let  Ax  approach  zero,  we  obtain  the  linear  PDE 


(3) 


■>2 

O U _ 2 o U 

dt2  C dx2? 


2 


C 


T 

P' 


This  is  called  the  one-dimensional  wave  equation.  We  see  that  it  is  homogeneous  and 
of  the  second  order.  The  physical  constant  T/p  is  denoted  by  c2  (instead  of  c)  to  indicate 
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that  this  constant  is  positive,  a fact  that  will  be  essential  to  the  form  of  the  solutions.  “One- 
dimensional”  means  that  the  equation  involves  only  one  space  variable,  x.  In  the  next 
section  we  shall  complete  setting  up  the  model  and  then  show  how  to  solve  it  by  a general 
method  that  is  probably  the  most  important  one  for  PDEs  in  engineering  mathematics. 

12.3  Solution  by  Separating  Variables. 

Use  of  Fourier  Series 


We  continue  our  work  from  Sec.  12.2,  where  we  modeled  a vibrating  string  and  obtained 
the  one-dimensional  wave  equation.  We  now  have  to  complete  the  model  by  adding 
additional  conditions  and  then  solving  the  resulting  model. 

The  model  of  a vibrating  elastic  string  (a  violin  string,  for  instance)  consists  of  the  one- 
dimensional wave  equation 


(1) 


^2 

a u _ 2 d u 
dt2  C dx2 


2 


C = 


T 

P 


for  the  unknown  deflection  u{x,  t)  of  the  string,  a PDE  that  we  have  just  obtained,  and 
some  additional  conditions,  which  we  shall  now  derive. 

Since  the  string  is  fastened  at  the  ends  x = 0 and  x = L (see  Sec.  12.2),  we  have  the 

two  boundary  conditions 


(2)  (a)  m(0,  t)  = 0,  (b)  u(L,  t)  = 0,  for  all  t g 0. 


Furthermore,  the  form  of  the  motion  of  the  string  will  depend  on  its  initial  deflection 
(deflection  at  time  t = 0),  call  it  f(x),  and  on  its  initial  velocity  (velocity  at  t = 0),  call  it 
g(x).  We  thus  have  the  two  initial  conditions 

(3)  (a)  u(x,  0)  = f(x),  (b)  ut(x,  0)  = g(x)  (0  S x S L) 

where  ut  = du/dt.  We  now  have  to  find  a solution  of  the  PDE  (1)  satisfying  the  conditions 
(2)  and  (3).  This  will  be  the  solution  of  our  problem.  We  shall  do  this  in  three  steps,  as 
follows. 

Step  1.  By  the  “method  of  separating  variables”  or  product  method,  setting 
u{x,t)  = F(x)G(t),  we  obtain  from  (1)  two  ODEs,  one  for  F(x)  and  the  other  one 
for  G(t). 

Step  2.  We  determine  solutions  of  these  ODEs  that  satisfy  the  boundary  conditions  (2). 

Step  3.  Finally,  using  Fourier  series,  we  compose  the  solutions  found  in  Step  2 to  obtain 
a solution  of  (1)  satisfying  both  (2)  and  (3),  that  is,  the  solution  of  our  model  of  the 
vibrating  string. 

Step  1.  Two  ODEs  from  the  Wave  Equation  (1) 

In  the  method  of  separating  variables,  or  product  method,  we  determine  solutions  of  the 
wave  equation  (1)  of  the  form 


(4) 


u(x,  t)  = F{x)G(t) 
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which  are  a product  of  two  functions,  each  depending  on  only  one  of  the  variables  x and  t. 
This  is  a powerful  general  method  that  has  various  applications  in  engineering  mathematics, 
as  we  shall  see  in  this  chapter.  Differentiating  (4),  we  obtain 


d2u 

dt2 


= FG 


and 


•>2 
d U 


dx * 


= f"g 


where  dots  denote  derivatives  with  respect  to  t and  primes  derivatives  with  respect  to  x. 
By  inserting  this  into  the  wave  equation  (1)  we  have 

FG  = c2F"G. 

Dividing  by  c FG  and  simplifying  gives 

G _ F" 
czG  ~ F 


The  variables  are  now  separated,  the  left  side  depending  only  on  t and  the  right  side  only 
on  x.  Hence  both  sides  must  be  constant  because,  if  they  were  variable,  then  changing  t 
or  x would  affect  only  one  side,  leaving  the  other  unaltered.  Thus,  say, 


G 

c G 


k. 


Multiplying  by  the  denominators  gives  immediately  two  ordinary  DEs 

(5)  F"  - kF  = 0 
and 

(6)  G - czkG  = 0. 


Here,  the  separation  constant  k is  still  arbitrary. 


Step  2.  Satisfying  the  Boundary  Conditions  (2) 

We  now  determine  solutions  F and  G of  (5)  and  (6)  so  that  u = FG  satisfies  the  boundary 
conditions  (2),  that  is, 

(7)  u{ 0,  t)  = F(0)G(f)  = 0,  u(L,  t)  = F(L)G(t ) = 0 for  all  t. 

We  first  solve  (5).  If  G = 0,  then  u = FG  = 0,  which  is  of  no  interest.  Hence  G ^ 0 
and  then  by  (7), 

(8)  (a)  F(0)  = 0,  (b)  F(L)  = 0. 

We  show  that  k must  be  negative.  For  k = 0 the  general  solution  of  (5)  is  F = ax  + b, 
and  from  (8)  we  obtain  a = b = 0,  so  that  F = 0 and  u = FG  = 0,  which  is  of  no  interest. 
For  positive  k = /jl2  a general  solution  of  (5)  is 


F = Ae^x  + Be~ ^ 
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and  from  (8)  we  obtain  F = 0 as  before  (verify!).  Hence  we  are  left  with  the  possibility 
of  choosing  k negative,  say,  k = —p2.  Then  (5)  becomes  F"  + p2F  = 0 and  has  as  a 
general  solution 

F(x)  = A cos  px  + B sin  px. 


From  this  and  (8)  we  have 

F( 0)  = A = 0 and  then  F(L)  = B sin  pL  = 0. 

We  must  take  B ¥=  0 since  otherwise  F = 0.  Hence  sin  pL  = 0.  Thus 

n7T 

(9)  pL  = mr,  so  that  p = — (n  integer). 

Setting  B = 1,  we  thus  obtain  infinitely  many  solutions  F(x)  = Fn  (x),  where 

(10)  Fn (jc)  = sin (n  = 1,2,  •■■). 

These  solutions  satisfy  (8).  [For  negative  integer  n we  obtain  essentially  the  same  solutions, 
except  for  a minus  sign,  because  sin  (— a ) = —sin  a.] 

2/0  r 

We  now  solve  (6)  with  k = —p  = — ( mr/L ) resulting  from  (9),  that  is, 

(11*)  G+  A 2G  = 0 where  \n  = cp  = 

A general  solution  is 


Gn(t ) = Bn  cos  A nt  + B%  sin  A nt. 

Hence  solutions  of  (1)  satisfying  (2)  are  un(x,  t)  = Fn(x)Gn(t)  = Gn(t)Fn(x),  written  out 

nTT 

(11)  un(x,  t)  = ( Bn  cos  \nt  + B%  sin  A nt)  sin  (n  = 1,  2,  • • • ). 

These  functions  are  called  the  eigenfunctions,  or  characteristic  functions,  and  the  values 
\n  = cmr/L  are  called  the  eigenvalues,  or  characteristic  values,  of  the  vibrating  string. 
The  set  { A1;  A2,  • • • } is  called  the  spectrum. 

Discussion  of  Eigenfunctions.  W e see  that  each  un  represents  a harmonic  motion  having 
the  frequency  An/27T  = cn/2L  cycles  per  unit  time.  This  motion  is  called  the  nth  normal 
mode  of  the  string.  The  first  normal  mode  is  known  as  the  fundamental  mode  (n  = 1), 
and  the  others  are  known  as  overtones;  musically  they  give  the  octave,  octave  plus  fifth, 
etc.  Since  in  (11) 


. mtx 
sin  —j—  = 0 


at 


L 2L 
n ’ n ’ 


the  nth  normal  mode  has  n — 1 nodes,  that  is,  points  of  the  string  that  do  not  move  (in 
addition  to  the  fixed  endpoints);  see  Fig.  287. 
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72  = 1 72  = 2 72  = 3 72  = 4 

Fig.  287.  Normal  modes  of  the  vibrating  string 

Figure  288  shows  the  second  normal  mode  for  various  values  of  t.  At  any  instant  the 
string  has  the  form  of  a sine  wave.  When  the  left  part  of  the  string  is  moving  down,  the 
other  half  is  moving  up,  and  conversely.  For  the  other  modes  the  situation  is  similar. 

Tuning  is  done  by  changing  the  tension  T.  Our  formula  for  the  frequency  \n/2ir  = cn/lL 
of  un  with  c = VT/p  [see  (3),  Sec.  12.2]  confirms  that  effect  because  it  shows  that  the 
frequency  is  proportional  to  the  tension.  T cannot  be  increased  indefinitely,  but  can  you 
see  what  to  do  to  get  a string  with  a high  fundamental  mode?  (Think  of  both  L and  p.) 
Why  is  a violin  smaller  than  a double-bass? 


Fig.  288.  Second  normal  mode  for  various  values  of  t 


Step  3.  Solution  of  the  Entire  Problem.  Fourier  Series 

The  eigenfunctions  (11)  satisfy  the  wave  equation  (1)  and  the  boundary  conditions  (2) 
(string  fixed  at  the  ends).  A single  un  will  generally  not  satisfy  the  initial  conditions  (3). 
But  since  the  wave  equation  (1)  is  linear  and  homogeneous,  it  follows  from  Fundamental 
Theorem  1 in  Sec.  12.1  that  the  sum  of  finitely  many  solutions  un  is  a solution  of  (1).  To 
obtain  a solution  that  also  satisfies  the  initial  conditions  (3),  we  consider  the  infinite  series 
(with  \n  = cmr/L  as  before) 

oo  oo  HJ  l 

(12)  u (x,  t)  = ^ un  ( x , t)  = ^ (Bn  cos  A,,/  + B%  sin  A nt)  sin  — x. 

22=1  21=1 

Satisfying  Initial  Condition  (3a)  (Given  Initial  Displacement).  From  (12)  and  (3a) 
we  obtain 

oo 

(13)  u(x,  0)  = 2 Bn  sin  —x  = /(x).  (0  g x g L). 

n=  1 B 

Hence  we  must  choose  the  Bn’s  so  that  u(x,  0)  becomes  the  Fourier  sine  series  of/(x). 
Thus,  by  (4)  in  Sec.  11.3, 


(14) 


,L 


Br,  = 


L 


ft  s • ni TX 

/(x)  sin  — — dx. 


n — 1,2, 


o 


L 
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Satisfying  Initial  Condition  (3b)  (Given  Initial  Velocity).  Similarly,  by  differentiating 
(12)  with  respect  to  t and  using  (3b),  we  obtain 


du 


dt 


t= o 


2 (~BnK  sin  A nt  + B*\n  cos  A nt)  sin 

-n= 1 


Jt= 0 


^ , x . nil  A , x 

= ^B*\n  sin  -j-  = g (x). 
n= 1 


Hence  we  must  choose  the  B* ’ s so  that  for  t = 0 the  derivative  du/dt  becomes  the  Fourier 
sine  series  of  g(x).  Thus,  again  by  (4)  in  Sec.  11.3, 


B*\n 


2 

L 


,L 

g(x)  sin 

. 

o 


7777X 

L 


dx. 


Since  \n  = cmr/L , we  obtain  by  division 


(15) 


B*  = 


2 

cmr 


,L 

g{x)  sin 


IITTX 

L c/x’ 


n = 1,  2,  • • • . 


Result.  Our  discussion  shows  that  u(x,  t)  given  by  (12)  with  coefficients  (14)  and  (15) 
is  a solution  of  (1)  that  satisfies  all  the  conditions  in  (2)  and  (3),  provided  the  series  (12) 
converges  and  so  do  the  series  obtained  by  differentiating  (12)  twice  termwise  with  respect 
to  x and  t and  have  the  sums  d2u/dx2  and  d2u/dt2,  respectively,  which  are  continuous. 


Solution  (12)  Established.  According  to  our  derivation,  the  solution  (12)  is  at  first  a 
purely  formal  expression,  but  we  shall  now  establish  it.  For  the  sake  of  simplicity  we 
consider  only  the  case  when  the  initial  velocity  g(x)  is  identically  zero.  Then  the  /!);  are 
zero,  and  (12)  reduces  to 


(16) 


i{x,t)  — cos  A nt  sin 


IITTX 


A 


cmr 


n= 1 


It  is  possible  to  sum  this  series,  that  is,  to  write  the  result  in  a closed  or  finite  form.  For 
this  purpose  we  use  the  formula  [see  (11),  App.  A3.1] 


C/777  7777  1 

cos 1 sin x = — 

L L 2 


j 7777  i f n7T  } 

sin  •!  —j~  (x  — ct)  > + sin  < —j—  {x  + ct)  1 


Consequently,  we  may  write  (16)  in  the  form 


1 “ fnTT  \ 1 ^ f 7777  1 

(x,  t)  = - 2jBn  Sin  < —(x  - ct)  > + - 2jBn  sm  < — (x  + ct)  > . 

L n= 1 ^ ) 1 n=1  f J 


These  two  series  are  those  obtained  by  substituting  x — ct  and  x + ct,  respectively,  for 
the  variable  x in  the  Fourier  sine  series  (13)  for/(x).  Thus 


(17) 


u(x,  t)  = \ [f*(x  - ct)  +f*(x  + ct)] 
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EXAMPLE  1 


where/*  is  the  odd  periodic  extension  of/with  the  period  2 L (Fig.  289).  Since  the  initial 
deflection  fix)  is  continuous  on  the  interval  0 and  zero  at  the  endpoints,  it  follows 

from  (17)  that  u(x,  t)  is  a continuous  function  of  both  variables  x and  t for  all  values  of 
the  variables.  By  differentiating  (17)  we  see  that  u{x,  t ) is  a solution  of  (1),  provided /(jc) 
is  twice  differentiable  on  the  interval  0 < x < L,  and  has  one-sided  second  derivatives  at 
x = 0 and  x = L,  which  are  zero.  Under  these  conditions  u (x,  t ) is  established  as  a solution 
of  (1),  satisfying  (2)  and  (3)  with  g(x)  = 0. 


Fig.  289.  Odd  periodic  extension  of  f(x) 


Generalized  Solution.  If f'(x)  and/”(x)  are  merely  piecewise  continuous  (see  Sec.  6.1), 
or  if  those  one-sided  derivatives  are  not  zero,  then  for  each  t there  will  be  finitely  many 
values  of  x at  which  the  second  derivatives  of  u appearing  in  (1)  do  not  exist.  Except  at 
these  points  the  wave  equation  will  still  be  satisfied.  We  may  then  regard  u(x,  t ) as  a 
“generalized  solution,”  as  it  is  called,  that  is,  as  a solution  in  a broader  sense.  For  instance, 
a triangular  initial  deflection  as  in  Example  1 (below)  leads  to  a generalized  solution. 

Physical  Interpretation  of  the  Solution  (17).  The  graph  of /*  ( x — ct ) is  obtained  from 
the  graph  of  f*(x)  by  shifting  the  latter  ct  units  to  the  right  (Fig.  290).  This  means  that 
/*  ( x — ct)(c  > 0)  represents  a wave  that  is  traveling  to  the  right  as  1 increases.  Similarly, 
f*(x  + ct)  represents  a wave  that  is  traveling  to  the  left,  and  u(x,  t)  is  the  superposition 
of  these  two  waves. 


Vibrating  String  if  the  Initial  Deflection  Is  Triangular 

Find  the  solution  of  the  wave  equation  (1)  satisfying  (2)  and  corresponding  to  the  triangular  initial  deflection 

L 

if  0 < x < - 

2 

L 

if  —<  x < L 

2 

and  initial  velocity  zero.  (Figure  291  shows /(x)  = u(x,  0)  at  the  top.) 

Solution.  Since  g(x)  = 0,  we  have  itf,  = 0 in  (12),  and  from  Example  4 in  Sec.  1 1.3  we  see  that  the  Bn  are 
given  by  (5),  Sec.  1 1.3.  Thus  (12)  takes  the  form 


fix)  = 


2k 

~l(L  ~ x) 


8 k 

u(x,  f)  = — £ 


3tt 


3ttc 


r sin  — x cos  — 

L L 
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For  graphing  the  solution  we  may  use  u (x,  0)  = f(x)  and  the  above  interpretation  of  the  two  functions  in  the 
representation  (17).  This  leads  to  the  graph  shown  in  Fig.  291. 


\ f*(x-L) 

= \f*(x  + L) 

Fig.  291.  Solution  u(x,  t)  in  Example  1 for  various  values  of  t (right  part 
of  the  figure)  obtained  as  the  superposition  of  a wave  traveling  to  the 
right  (dashed)  and  a wave  traveling  to  the  left  (left  part  of  the  figure) 


FRGBLE-W^SET— 1-^1 


1.  Frequency.  How  does  the  frequency  of  the  fundamental 
mode  of  the  vibrating  string  depend  on  the  length  of  the 
string?  On  the  mass  per  unit  length?  What  happens  if 
we  double  the  tension?  Why  is  a contrabass  larger  than 
a violin? 

2.  Physical  Assumptions.  How  would  the  motion  of 
the  string  change  if  Assumption  3 were  violated? 
Assumption  2?  The  second  part  of  Assumption  1?  The 
first  part?  Do  we  really  need  all  these  assumptions? 

3.  String  of  length  tt.  Write  down  the  derivation  in  this 
section  for  length  L — tt,  to  see  the  very  substantial 
simplification  of  formulas  in  this  case  that  may  show 
ideas  more  clearly. 


4.  CAS  PROJECT.  Graphing  Normal  Modes.  Write  a 
program  for  graphing  un  with  L = n and  c2  of  your 
choice  similarly  as  in  Fig.  287.  Apply  the  program  to 
u2,  «3,  n4.  Also  graph  these  solutions  as  surfaces  over 
the  xr-plane.  Explain  the  connection  between  these  two 
kinds  of  graphs. 


5-13 


DEFLECTION  OF  THE  STRING 


Find  u ( x , t)  for  the  string  of  length  L—  1 and  c2  = 1 when 
the  initial  velocity  is  zero  and  the  initial  deflection  with  small 
k (say,  0.01)  is  as  follows.  Sketch  or  graph  u(x,  t)  as  in 
Fig.  291  in  the  text. 


5.  k sin  37 tx 

6.  k (sin  7 tx  — g sin  27 tx) 
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7.  kx (1  - x)  8.  kx2(  1 - x) 

9. 

0.1  - 


0.5  1 


11.  1 

4 


113  1 

4 2 4 


1 3 1 

4 4 


13.  2.x  - 4x2  if  0 < x < g,  0 if  g < x < 1 

14.  Nonzero  initial  velocity.  Find  the  deflection  u(x,  t ) of 
the  string  of  length  L = 77  and  c2  = 1 for  zero  initial  dis- 
placement and  “triangular”  initial  velocity  ut(x,  0)  = O.Olx 
if  0 S x S g77,  ut(x,  0)  = 0.01  (77  — x)  if  §77  S 
x S 77.  (Initial  conditions  with  ut  (x,  0)  F 0 are  hard 
to  realize  experimentally.) 


Fig.  292.  Elastic  beam 


y-axis  in  the  figure,  p = density,  A = cross-sectional 
area).  ( Bending  of  a beam  under  a load  is  discussed  in 
Sec.  3.3.) 

15.  Substituting  u = F(x)G(0  into  (21),  show  that 

F<a)/F  = -G/c2  G = / 34  = const, 

F(x)  = A cos  fix  + B sin  fix 

+ C cosh  fix  + D sinh  fix, 

G (f)  = a cos  c/32 1 + b sin  c/32 1. 


x = 0 


i 

i 


x = L 


(C)  Clamped  at  the  left 
end,  free  at  the 
right  end 


Fig.  293.  Supports  of  a beam 


16.  Simply  supported  beam  in  Fig.  293A.  Find  solutions 
un  = Fn(x)Gn(t)  of  (21)  corresponding  to  zero  initial 
velocity  and  satisfying  the  boundary  conditions  (see 
Fig.  293A) 

u (0,  t)  = 0,  u (L,  t)  — 0 
(ends  simply  supported  for  all  times  t), 

tt'xx  (0.  t)  0,  Uxx  (F,  t)  0 

(zero  moments,  hence  zero  curvature,  at  the  ends). 

17.  Find  the  solution  of  (21)  that  satisfies  the  conditions  in 
Prob.  16  as  well  as  the  initial  condition 

u(x,  0)  = /(x)  — x(L  — x). 


15-20 


SEPARATION  OF  A FOURTH-ORDER 
PDE.  VIBRATING  BEAM 


By  the  principles  used  in  modeling  the  string  it  can  be 
shown  that  small  free  vertical  vibrations  of  a uniform  elastic 
beam  (Fig.  292)  are  modeled  by  the  fourth-order  PDE 


i2  -4 

^ O U 9 O U 

(21)  —=  -c2—  (Ref.  [Cl  1]) 

dt2  3x4 

where  c2  = EI/pA  ( E = Young’s  modulus  of  elasticity, 
l = moment  of  intertia  of  the  cross  section  with  respect  to  the 


18.  Compare  the  results  of  Probs.  17  and  7.  What  is  the 
basic  difference  between  the  frequencies  of  the  normal 
modes  of  the  vibrating  string  and  the  vibrating  beam? 

19.  Clamped  beam  in  Fig.  293B.  What  are  the  boundary 
conditions  for  the  clamped  beam  in  Fig.  293B?  Show 
that  F in  Prob.  15  satisfies  these  conditions  if  fiL  is  a 
solution  of  the  equation 

(22)  cosh  fiL  cos  fiL  = I . 

Determine  approximate  solutions  of  (22),  for  instance, 
graphically  from  the  intersections  of  the  curves  of 
cos  fiL  and  1/cosh  fiL. 


SEC.  12.4  D’Alembert’s  Solution  of  the  Wave  Equation.  Characteristics 


553 


20.  Clamped-free  beam  in  Fig.  293C.  If  the  beam  is 
clamped  at  the  left  and  free  at  the  right  (Fig.  293C), 
the  boundary  conditions  are 

u (0,  t)  = 0,  ux  (0,  t)  = 0, 

^ xx  d'  0 0,  ^ xxx  (A  0 11- 


Show  that  F in  Prob.  15  satisfies  these  conditions  if  f3L 
is  a solution  of  the  equation 

(23)  cosh  /3L  cos  f5L  = —1. 

Find  approximate  solutions  of  (23). 


Characteristics 

12.3,  of  the  wave  equation 
2 d2U 

( 1 ) = c , 

dt2  dx2 

can  be  immediately  obtained  by  transforming  (1)  in  a suitable  way,  namely,  by  introducing 
the  new  independent  variables 

(2)  v = x + ct,  w = x — ct. 

Then  u becomes  a function  of  v and  w.  The  derivatives  in  (1)  can  now  be  expressed  in  terms 
of  derivatives  with  respect  to  v and  w by  the  use  of  the  chain  rule  in  Sec.  9.6.  Denoting 
partial  derivatives  by  subscripts,  we  see  from  (2)  that  vx  = 1 and  wx  = 1.  For  simplicity 
let  us  denote  u (x,  t),  as  a function  of  v and  vv,  by  the  same  letter  u.  Then 

mvux  T uwwx  uv  T uw. 

We  now  apply  the  chain  rule  to  the  right  side  of  this  equation.  We  assume  that  all  the 
partial  derivatives  involved  are  continuous,  so  that  uwv  = uvw.  Since  vx  = 1 and  wx=  1, 
we  obtain 


12/  D'Alemberts  Solution 
of  the  Wave  Equation. 

It  is  interesting  that  the  solution  (17),  Sec. 
...  d u 


M xx  (Mv  "f~  Uyo)x  *F  Uw)vUx  "1"  (Mv  T f^u^w^x  Uvv  "F  2 Uuw  4“  ll ^ 

Transforming  the  other  derivative  in  (1)  by  the  same  procedure,  we  find 

Utt  C {Uvu  ^ww)- 

By  inserting  these  two  results  in  (1)  we  get  (see  footnote  2 in  App.  A3. 2) 


(3) 


-,2 

d U 

dw  dv 


= 0. 


The  point  of  the  present  method  is  that  (3)  can  be  readily  solved  by  two  successive 
integrations,  first  with  respect  to  w and  then  with  respect  to  v.  This  gives 


dlt  i i \ 

— = h(v) 
dv 


and 


h(v)dv  + i jj{w). 
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Here  h(y ) and  ip  (w)  are  arbitrary  functions  of  v and  w,  respectively.  Since  the  integral  is 
a function  of  v,  say,  <p(v),  the  solution  is  of  the  form  u = <p(v)  + ip(w).  In  terms  of  x 
and  t , by  (2),  we  thus  have 


(4)  u (x,  t)  = 4>(x  + ct)  + ip(x  — ct). 

This  is  known  as  d’Alembert’s  solution1  of  the  wave  equation  (1). 

Its  derivation  was  much  more  elegant  than  the  method  in  Sec.  12.3,  but  d’Alembert’s  method 
is  special,  whereas  the  use  of  Fourier  series  applies  to  various  equations,  as  we  shall  see. 


D’Alembert’s  Solution  Satisfying  the  Initial  Conditions 

(5)  (a)  u(x,0)=f(x),  (b)  ut(x,  0)  = g(x). 

These  are  the  same  as  (3)  in  Sec.  12.3.  By  differentiating  (4)  we  have 

(6)  Utix,  t)  = ccp'(x  + ct)  — cip'(x  — ct) 

where  primes  denote  derivatives  with  respect  to  the  entire  arguments  x + ct  and  x — ct, 
respectively,  and  the  minus  sign  comes  from  the  chain  rule.  From  (4)-(6)  we  have 

(7)  u(x,  0)  = 4>(x)  + ijj  (x)  = f(x), 

(8)  ut(x,  0)  = c4>\x)  + ci//’(x)  = g{x). 

Dividing  (8)  by  c and  integrating  with  respect  to  x,  we  obtain 


(9)  <£(■*)  ~ <lt(x)  = k (a'o)  + - 


g(s)  ds,  k(x0)  = <f>(x0)  - (/f(x0). 


If  we  add  this  to  (7),  then  ip  drops  out  and  division  by  2 gives 


(10) 


(b  (x)  = \ fix)  + -1- 
2 2c  J 


g(s)  ds  + - k(x0). 


Similarly,  subtraction  of  (9)  from  (7)  and  division  by  2 gives 


(ID 


<Ha) 


1 

2 


fix)  - 


1 

2c 


rX 

g(s)  ds  - 


1 

2 


k(x  0). 


In  (10)  we  replace  x by  x + ct;  we  then  get  an  integral  from  xq  to  x + ct.  In  (11)  we 
replace  x by  x — ct  and  get  minus  an  integral  from  xq  to  x — ct  or  plus  an  integral  from 
x — ct  to  xq.  Hence  addition  of  cp  (x  + ct)  and  t p(x  — ct)  gives  u (x,  t)  [see  (4)]  in  the  form 


(12) 


u (x,  t) 


\ 

2 


[fix  + Ct)  + fix  - ct)} 


+ 


2c 


rX+Ct 

g(s)ds. 

Jx-ct 


1JEAN  LE  ROND  D’ALEMBERT  (1717-1783),  French  mathematician,  also  known  for  his  important  work 
in  mechanics. 

We  mention  that  the  general  theory  of  PDEs  provides  a systematic  way  for  finding  the  transformation  (2) 
that  simplifies  (1).  See  Ref.  [C8]  in  App.  1. 
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If  the  initial  velocity  is  zero,  we  see  that  this  reduces  to 
(13)  u(x,  t)  = 2 [fix  + ct)  + f{x  - ct )], 

in  agreement  with  (17)  in  Sec.  12.3.  You  may  show  that  because  of  the  boundary  conditions 
(2)  in  that  section  the  function  / must  be  odd  and  must  have  the  period  2 L. 

Our  result  shows  that  the  two  initial  conditions  [the  functions /(x)  and  g(x)  in  (5)] 
determine  the  solution  uniquely. 

The  solution  of  the  wave  equation  by  the  Laplace  transform  method  will  be  shown  in 
Sec.  12.11. 


Characteristics.  Types  and  Normal  Forms  of  PDEs 

The  idea  of  d’Alembert’s  solution  is  just  a special  instance  of  the  method  of  characteristics. 
This  concerns  PDEs  of  the  form 

(14)  ^BUy^y  + CUyy  F (X,  y,  l/,  llxi  My) 

(as  well  as  PDEs  in  more  than  two  variables).  Equation  (14)  is  called  quasilinear  because 
it  is  linear  in  the  highest  derivatives  (but  may  be  arbitrary  otherwise).  There  are  three 
types  of  PDEs  (14),  depending  on  the  discriminant  AC  — B2,  as  follows. 


Type 

Defining  Condition 

Example  in  Sec.  12.1 

Hyperbolic 

AC  - B2  < 0 

Wave  equation  (1) 

Parabolic 

AC  - B2  = 0 

Heat  equation  (2) 

Elliptic 

AC  - Bz>  0 

Laplace  equation  (3) 

Note  that  (1)  and  (2)  in  Sec.  12.1  involve  t,  but  to  have  y as  in  (14),  we  set  y = ct  in 

(1),  obtaining  utt  — c2uxx  = cziityy  — uxx)  = 0.  And  in  (2)  we  set  y = c2t,  so  that 

2 2 / \ 

C 14 gQjQ  C \Uy  14 xx ) * 

A,  B , C may  be  functions  of  x,  y,  so  that  a PDE  may  be  of  mixed  type,  that  is,  of  different 
type  in  different  regions  of  the  xy-plane.  An  important  mixed-type  PDE  is  the  Tricomi 
equation  (see  Prob.  10). 

Transformation  of  (14)  to  Normal  Form.  The  normal  forms  of  (14)  and  the  correspond- 
ing transformations  depend  on  the  type  of  the  PDE.  They  are  obtained  by  solving  the 
characteristic  equation  of  (14),  which  is  the  ODE 

(15)  Ay'2  - 2 By'  + C = 0 

where  y = dy/dx  (note  — 2/7,  not  +2/1).  The  solutions  of  (15)  are  called  the  characteristics 
of  (14),  and  we  write  them  in  the  form  <l>(x,  y)  = const  and  "'I'  (x,  y)  = const.  Then  the 
transformations  giving  new  variables  v,  w instead  of  x,  y and  the  normal  forms  of  (14)  are 
as  follows. 
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Type 

New  Variables 

Normal  Form 

Hyperbolic 

Parabolic 

Elliptic 

v = 5>  w = 

v = x w=(I>=''P 

v = t-(3>  + 'P)  w = 

2 2 1 

Mvw 

uww  ^2 

UVv  MWw  ^3 

Here,  <!>  = <I>(x,  y),  'P  = ^(x,  y),  F\  = Fi(v,  w,  u,  uv,  uw),  etc.,  and  we  denote  u as 
function  of  v,  w again  by  u.  for  simplicity.  We  see  that  the  normal  form  of  a hyperbolic 
PDE  is  as  in  d'Alembert’s  solution.  In  the  parabolic  case  we  get  just  one  family  of  solutions 
<D  = Mf.  In  the  elliptic  case,  i = V—  1,  and  the  characteristics  are  complex  and  are  of 
minor  interest.  For  derivation,  see  Ref.  [GenRef3]  in  App.  1. 

D’Alembert’s  Solution  Obtained  Systematically 

The  theory  of  characteristics  gives  d’Alembert’s  solution  in  a systematic  fashion.  To  see  this,  we  write  the  wave 
equation  utt  — czuxx  = 0 in  the  form  (14)  by  setting  y — ct.  By  the  chain  rule,  ut  = uyyt  = cuy  and  utt  = czuyy. 
Division  by  c gives  uxx  — uyy  — 0,  as  stated  before.  Hence  the  characteristic  equation  is  y'  — 1 = (yr  + 1) 
(y  — 1)  = 0.  The  two  families  of  solutions  (characteristics)  are  <!>(*,  y)  = y + x = const  and  T' (jc,  y)  = y — x = 
const.  This  gives  the  new  variables  v = <f>=y  + x = ct  + x and  w — = y — x = ct  — x and  d’Alembert’s 

solution  u = fi(x  + ct)  + fzix  — ct). 


gRdQ-BcEE^M—SE  T 12V4 


1.  Show  that  c is  the  speed  of  each  of  the  two  waves  given 
by  (4). 

2.  Show  that,  because  of  the  boundary  conditions  (2),  Sec. 
12.3,  the  function /in  (13)  of  this  section  must  be  odd 
and  of  period  2 L. 

3.  If  a steel  wire  2 m in  length  weighs  0.9  nt  (about  0.20 
lb)  and  is  stretched  by  a tensile  force  of  300  nt  (about 
67.4  lb),  what  is  the  corresponding  speed  of  transverse 
waves? 

4.  What  are  the  frequencies  of  the  eigenfunctions  in 
Prob.  3? 

GRAPHING  SOLUTIONS 

Using  (13)  sketch  or  graph  a figure  (similar  to  Fig.  291  in 
Sec.  12.3)  of  the  deflection  u(x,t)  of  a vibrating  string 
(length  L = 1,  ends  fixed,  c = 1)  starting  with  initial 
velocity  0 and  initial  deflection  ( k small,  say,  k = 0.01). 

5.  f(x)  = k sin  ttx  6.  /(x)  = k{\  — cos  7rx) 

7.  /(x)  = k sin  2ttx  8.  f(x)  = kx{  1 — x) 

NORMAL  FORMS 
Find  the  type,  transform  to  normal  form,  and  solve.  Show 
your  work  in  detail. 

9*  l^XX  0 10.  Uxx  16 Uyy  0 


11.  UXX  ' —UXy  “h  Uyy  0 12.  U XX  2^X7/  3“  Uyy  0 

13.  UXX  + 5Uxy  + = 0 14.  XUXy  — yUyy  — 0 

15.  xuxx  — yuXy  — 0 16.  uxx  + 2 uxy  + 10 uyy  = 0 

19.  Longitudinal  Vibrations  of  an  Elastic  Bar  or  Rod. 

These  vibrations  in  the  direction  of  the  x-axis  are 
modeled  by  the  wave  equation  utt  = c2uxx,  c2  = Ej p 
(see  Tolstov  [C9],  p.  275).  If  the  rod  is  fastened  at  one 
end,  x = 0,  and  free  at  the  other,  x = L,  we  have 
u( 0,  f)  = 0 and  ux(L , t)  = 0.  Show  that  the  motion 
corresponding  to  initial  displacement  u (x,  0)  = /(x) 
and  initial  velocity  zero  is 

u = ^ An  sin  pnx  cos  pnct, 
n= 0 

2 fL  (2 n + 1)77 

= - J fix)  sin  pnx  dx,  pn  = — . 

20.  Tricomi  and  Airy  equations.2  Show  that  the  Tricomi 
equation  yuxx  + uyy  = 0 is  of  mixed  type.  Obtain  the 
Airy  equation  G"  ~ yG  = 0 from  the  Tricomi 
equation  by  separation.  (For  solutions,  see  p.  446  of 
Ref.  [GenRefl]  listed  in  App.  1.) 


2Sir  GEORGE  BIDELL  AIRY  (1801-1892),  English  mathematician,  known  for  his  work  in  elasticity.  FRANCESCO 
TRICOMI  (1897-1978),  Italian  mathematician,  who  worked  in  integral  equations  and  functional  analysis. 
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12.!  Modeling:  Heat  Flow  from  a Body 
in  Space.  Heat  Equation 

After  the  wave  equation  (Sec.  12.2)  we  now  derive  and  discuss  the  next  “big”  PDE,  the 
heat  equation,  which  governs  the  temperature  u in  a body  in  space.  We  obtain  this  model 
of  temperature  distribution  under  the  following. 


Physical  Assumptions 

1.  The  specific  heat  cr  and  the  density  p of  the  material  of  the  body  are  constant.  No 
heat  is  produced  or  disappears  in  the  body. 

2.  Experiments  show  that,  in  a body,  heat  flows  in  the  direction  of  decreasing 
temperature,  and  the  rate  of  flow  is  proportional  to  the  gradient  (cf.  Sec.  9.7)  of  the 
temperature;  that  is,  the  velocity  v of  the  heat  flow  in  the  body  is  of  the  form 

(1)  v = —K  grad  u 

where  u ( x , y,  z,  t)  is  the  temperature  at  a point  (x,  y,  z ) and  time  t. 

3.  The  thermal  conductivity  K is  constant,  as  is  the  case  for  homogeneous  material  and 
nonextreme  temperatures. 

Under  these  assumptions  we  can  model  heat  flow  as  follows. 

Let  T be  a region  in  the  body  bounded  by  a surface  S with  outer  unit  normal  vector  n 
such  that  the  divergence  theorem  (Sec.  10.7)  applies.  Then 

v • n 

is  the  component  of  v in  the  direction  of  n.  Hence  | v • n A ,4 1 is  the  amount  of  heat  leaving 
T (if  v • n > 0 at  some  point  P)  or  entering  T (if  v • n < 0 at  P)  per  unit  time  at  some 
point  P of  S through  a small  portion  A S of  S of  area  A4.  Hence  the  total  amount  of  heat 
that  flows  across  S from  T is  given  by  the  surface  integral 

v • n dA. 

J . 
s 

Note  that,  so  far,  this  parallels  the  derivation  on  fluid  flow  in  Example  1 of  Sec.  10.8. 
Using  Gauss’s  theorem  (Sec.  10.7),  we  now  convert  our  surface  integral  into  a volume 
integral  over  the  region  T.  Because  of  (1)  this  gives  [use  (3)  in  Sec.  9.8] 


(2) 


v • n dA  = —K 


(grad  u)  • n dA  = —K 


div  (grad  u)  dx  dy  dz 


s 


= -K 


\u  dx  dy  dz. 


Here, 


d2ll 

a u 

d U 

— 

+ 

+ 

dx2 

ay2 

dz2 

is  the  Laplacian  of  u. 
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On  the  other  hand,  the  total  amount  of  heat  in  T is 


H = 


crpu  dx  dy  dz 


with  cr  and  p as  before.  Hence  the  time  rate  of  decrease  of  H is 


dH 

dt 


cr p — dx  dy  dz. 
dt 


This  must  be  equal  to  the  amount  of  heat  leaving  T because  no  heat  is  produced  or 
disappears  in  the  body.  From  (2)  we  thus  obtain 


du 

crp  — dx  dy  dz  = ~K 
dt 


V m dx  dy  dz 


or  (divide  by  —crp) 


— — c2V2w  ) dx  dy  dz  = 0 


c2  = — 
crp' 


Since  this  holds  for  any  region  T in  the  body,  the  integrand  (if  continuous)  must  be  zero 
everywhere.  That  is, 


(3)  l«r  = c2V2m.  c2  = K/pcr 

dt  1 

This  is  the  heat  equation,  the  fundamental  PDE  modeling  heat  flow.  It  gives  the 
temperature  u (x,  y,  z,  t)  in  a body  of  homogeneous  material  in  space.  The  constant  c2  is 
the  thermal  dijfusivity.  K is  the  thermal  conductivity,  cr  the  specific  heat,  and  p the  density 
of  the  material  of  the  body.  V2u  is  the  Laplacian  of  u and,  with  respect  to  the  Cartesian 
coordinates  x,  y,  z,  is 


d2u 

d U 

n2 
d U 

+ 

+ . 

dx2 

T 2 

dy 

T 2 

dz 

The  heat  equation  is  also  called  the  diffusion  equation  because  it  also  models  chemical 
diffusion  processes  of  one  substance  or  gas  into  another. 


12.6  Heat  Equation:  Solution  by  Fourier  Series. 
Steady  Two-Dimensional  Heat  Problems. 
Dirichlet  Problem 


We  want  to  solve  the  (one-dimensional)  heat  equation  just  developed  in  Sec.  12.5  and 
give  several  applications.  This  is  followed  much  later  in  this  section  by  an  extension  of 
the  heat  equation  to  two  dimensions. 
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0 x = L 

Fig.  294.  Bar  under  consideration 

As  an  important  application  of  the  heat  equation,  let  us  first  consider  the  temperature 
in  a long  thin  metal  bar  or  wire  of  constant  cross  section  and  homogeneous  material,  which 
is  oriented  along  the  jc-axis  (Fig.  294)  and  is  perfectly  insulated  laterally,  so  that  heat  flows 
in  the  ^-direction  only.  Then  besides  time,  it  depends  only  on  x,  so  that  the  Laplacian 
reduces  to  uxx  = d u/dx  , and  the  heat  equation  becomes  the  one-dimensional  heat 
equation 


(1) 


du  p d2u 
= CZ p . 

dt  dxZ 


This  PDE  seems  to  differ  only  very  little  from  the  wave  equation,  which  has  a term  utt 
instead  of  ut,  but  we  shall  see  that  this  will  make  the  solutions  of  (1)  behave  quite 
differently  from  those  of  the  wave  equation. 

We  shall  solve  (1)  for  some  important  types  of  boundary  and  initial  conditions.  We 
begin  with  the  case  in  which  the  ends  x = 0 and  x = L of  the  bar  are  kept  at  temperature 
zero,  so  that  we  have  the  boundary  conditions 

(2)  u (0,  f)  = 0,  u ( L , 0 = 0 for  all  (§0. 


Furthermore,  the  initial  temperature  in  the  bar  at  time  / = 0 is  given,  say,  /Of),  so  that  we 
have  the  initial  condition 

(3)  u (x,  0)  = fix)  [f(x)  given]. 


Flere  we  must  have/(0)  = 0 and  f(L)  = 0 because  of  (2). 

We  shall  determine  a solution  u (x,  t)  of  ( 1 ) satisfying  (2)  and  (3) — one  initial  condition 
will  be  enough,  as  opposed  to  two  initial  conditions  for  the  wave  equation.  Technically, 
our  method  will  parallel  that  for  the  wave  equation  in  Sec.  12.3:  a separation  of  variables, 
followed  by  the  use  of  Fourier  series.  You  may  find  a step-by-step  comparison 
worthwhile. 


Step  1.  Two  ODEs  from  the  heat  equation  (1).  Substitution  of  a product  u (x,  t ) = 
F(x)G(t ) into  (1)  gives  FG  = c2F" G with  G = dG/dt  and  F"  = d2F/dx2.  To  separate 
the  variables,  we  divide  by  c2FG,  obtaining 


(4) 


G 

2/~> 
c G 


F”_ 

F ' 


The  left  side  depends  only  on  t and  the  right  side  only  on  x,  so  that  both  sides  must  equal 
a constant  k (as  in  Sec.  12.3).  You  may  show  that  for  k = 0 or  k > 0 the  only  solution 
u = FG  satisfying  (2)  is  u = 0.  For  negative  k = —pz  we  have  from  (4) 

G F"  o 
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Multiplication  by  the  denominators  immediately  gives  the  two  ODEs 


(5)  F"  + p2F  = 0 
and 

(6)  G + c2p2G  = 0. 


Step  2.  Satisfying  the  boundary  conditions  (2).  We  first  solve  (5).  A general  solution  is 

(7)  F(x)  = A cos  px  + B sin  px. 

From  the  boundary  conditions  (2)  it  follows  that 

w(0,  t)  = F(0)G(t)  = 0 and  u(L , t)  = F(L)G(t)  = 0. 

Since  G = 0 would  give  u = 0,  we  require  F(0)  = 0,  F(L)  = 0 and  get  F(0)  = A = 0 
by  (7)  and  then  F(L)  = B sin  pL  = 0,  with  B A 0 (to  avoid  F = 0);  thus, 

H7T 

sin  pL  = 0,  hence  p = ^ , n = 1,  2,  • • • . 

Setting  B = 1,  we  thus  obtain  the  following  solutions  of  (5)  satisfying  (2): 

Fn(x)  = sin  , n = 1,  2,  ■ ■ • . 

(As  in  Sec.  12.3,  we  need  not  consider  negative  integer  values  of  n.) 

All  this  was  literally  the  same  as  in  Sec.  12.3.  From  now  on  it  differs  since  (6)  differs 
from  (6)  in  Sec.  12.3.  We  now  solve  (6).  For  p = mr/L , as  just  obtained,  (6)  becomes 

G + A 2G  = 0 where  \n  = — ^ . 

It  has  the  general  solution 

Gn{t ) = Z?ne_A“t,  n = 1,  2,  • • • 

where  Bn  is  a constant.  Hence  the  functions 


(8) 


nvx  ,2 1 

un(x,  t)  = Fn{x)Gn{t)  = Bn  sin  — — e n (n  = 1,  2,  ■ ■ • ) 


are  solutions  of  the  heat  equation  (1),  satisfying  (2).  These  are  the  eigenfunctions  of  the 
problem,  corresponding  to  the  eigenvalues  \n  = cmr/L. 

Step  3.  Solution  of  the  entire  problem.  Fourier  series.  So  far  we  have  solutions  (8) 
satisfying  the  boundary  conditions  (2).  To  obtain  a solution  that  also  satisfies  the  initial 
condition  (3),  we  consider  a series  of  these  eigenfunctions, 


(9) 


u(x,  t)  = 2 Unix , t ) = 25nsin~ 7^  e~Kt 


n= 1 


n= 1 


cn 77 


L 


L 
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EXAMPLE  1 


EXAMPLE  2 


From  this  and  (3)  we  have 

oo 

v-,  YlTTX 

u(x,  0)  = sin  — — = fix). 

n=l 

Hence  for  (9)  to  satisfy  (3),  the  Bn’s  must  be  the  coefficients  of  the  Fourier  sine  series, 
as  given  by  (4)  in  Sec.  11.3;  thus 


(10) 


B, 


2 

L 


,L 

f(x)  sin 


I17TX 

ax 


L 


in  = 1,2,---.) 


The  solution  of  our  problem  can  be  established,  assuming  that/(x)  is  piecewise  continuous 
(see  Sec.  6.1)  on  the  interval  and  has  one-sided  derivatives  (see  Sec.  1 1.1)  at  all 

interior  points  of  that  interval;  that  is,  under  these  assumptions  the  series  (9)  with  coefficients 
(10)  is  the  solution  of  our  physical  problem.  A proof  requires  knowledge  of  uniform 
convergence  and  will  be  given  at  a later  occasion  (Probs.  19,  20  in  Problem  Set  15.5). 

Because  of  the  exponential  factor,  all  the  terms  in  (9)  approach  zero  as  t approaches 
infinity.  The  rate  of  decay  increases  with  n. 

Sinusoidal  Initial  Temperature 

Find  the  temperature  u (x,  t)  in  a laterally  insulated  copper  bar  80  cm  long  if  the  initial  temperature  is 
100  sin  (ttx/ 80)  °C  and  the  ends  are  kept  at  0°C.  How  long  will  it  take  for  the  maximum  temperature  in  the  bar 
to  drop  to  50°C?  First  guess,  then  calculate.  Physical  data  for  copper:  density  8.92  g/cm  , specific  heat 
0.092  cal/(g  °C),  thermal  conductivity  0.95  cal/(cm  sec  °C). 

Solution.  The  initial  condition  gives 


, n il  a.  ii  a. 

u (x,  0)  = ^ Bn  sin = f{x)  =100  sin  — . 

n=l  80  80 

Hence,  by  inspection  or  from  (9),  we  get  B1  = 100,  B2  = B3  = ■ ■ ■ = 0.  In  (9)  we  need  A 2 = c27T2/L2,  where 
c2  = K) itrf) ) = 0.95/(0.092  ■ 8.92)  = 1.158  [cm2/sec].  Hence  we  obtain 

A?  = 1.158  • 9.870/802  = 0.001785  [sec-1]. 

The  solution  (9)  is 


TTX 

u (x,  t)  = 100  sin  — e 
80 


— 0.001785t 


Also,  \00e  0 0017851  = 50  when  t = (In  0.5)/ ( — 0.001785)  = 388  [sec]  » 6.5  [min].  Does  your  guess,  or  at 
least  its  order  of  magnitude,  agree  with  this  result? 


Speed  of  Decay 

Solve  the  problem  in  Example  1 when  the  initial  temperature  is  100  sin  (37rx/80)  °C  and  the  other  data  are  as 
before. 

Solution.  In  (9),  instead  of  n = 1 we  now  have  n — 3,  and  A§  = 32A?  = 9 • 0.001785  = 0.01607,  so  that 
the  solution  now  is 


u(x,  t)  = 100  sin 


-0.01 607t 

80 


Hence  the  maximum  temperature  drops  to  50°C  in  / = (In  0.5)/(— 0.01607)  ~ 43  [sec],  which  is  much  faster 
(9  times  as  fast  as  in  Example  1;  why?). 
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EXAMPLE  3 


Had  we  chosen  a bigger  n,  the  decay  would  have  been  still  faster,  and  in  a sum  or  series  of  such  terms,  each 
term  has  its  own  rate  of  decay,  and  terms  with  large  n are  practically  0 after  a very  short  time.  Our  next  example 
is  of  this  type,  and  the  curve  in  Fig.  295  corresponding  to  t = 0.5  looks  almost  like  a sine  curve;  that  is,  it  is 
practically  the  graph  of  the  first  term  of  the  solution. 


Fig.  295.  Example  3.  Decrease  of  temperature 
with  time  t for  L — tt  and  c = 1 


“Triangular”  Initial  Temperature  in  a Bar 

Find  the  temperature  in  a laterally  insulated  bar  of  length  L whose  ends  are  kept  at  temperature  0,  assuming  that 
the  initial  temperature  is 


m 


( x if  0 < x < L/2, 

1 L-  x if  L/2  < x < L. 


(The  uppermost  part  of  Fig.  295  shows  this  function  for  the  special  L — tt.) 
Solution.  From  (10)  we  get 


(10*) 


Bn  = 


L/2 


dx  + (L  — x)  sin  - 
L/2 


-dx 


Integration  gives  Bn  = 0 if  n is  even, 


4 L 


Bn  = -i-z  (n  = 1,5,  9,  •••)  and  Bn  = — 

nZ7TZ 

(see  also  Example  4 in  Sec.  11.3  with  k = L/2).  Hence  the  solution  is 
u(x,  t)  = 


4 L 


(n  = 3,  7,  11,---)- 


4 L 

TTX 

( c7r Y 

i 

3t7X 

o 

sin  — exp 

H— ) t 

sin 

——exp 

7 TZ 

L 

\ L / _ 

9 

L 

f 3c7t\2 

+ - ••• 

L V l J J 

_ 

Figure  295  shows  that  the  temperature  decreases  with  increasing  t,  because  of  the  heat  loss  due  to  the  cooling 
of  the  ends. 

Compare  Fig.  295  and  Fig.  291  in  Sec.  12.3  and  comment. 
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EXAMPLE  4 


EXAMPLE  5 


Bar  with  Insulated  Ends.  Eigenvalue  0 

Find  a solution  formula  of  (1),  (3)  with  (2)  replaced  by  the  condition  that  both  ends  of  the  bar  are  insulated. 

Solution.  Physical  experiments  show  that  the  rate  of  heat  flow  is  proportional  to  the  gradient  of  the 
temperature.  Hence  if  the  ends  * = 0 and  x = L of  the  bar  are  insulated,  so  that  no  heat  can  flow  through  the 
ends,  we  have  grad  u = ux  = du / dx  and  the  boundary  conditions 

(2*)  wx(0,  t)  = 0,  ux(L,  t)  = 0 for  all  t. 

Since  u(x,  t)  = F(x)G(t),  this  gives  ux{ 0,  t)  = Ff(0)G(t)  = 0 and  ux(L,  t)  = Fr(L)G(t ) = 0.  Differentiating 
(7),  we  have  F (x)  = — Ap  sin  px  + Bp  cos  px , so  that 

F\ 0)  = Bp  — 0 and  then  F'{L)  — —Ap  sin  pL  = 0. 

The  second  of  these  conditions  gives  p = pn  — mr/L,  (n  = 0,  1,  2,  • • • ).  From  this  and  (7)  with  A = 1 and 
B = 0 we  get  Fn(x ) = cos  ( nirx/L ),  (n  = 0,  1,  2,  • • • )•  With  Gn  as  before,  this  yields  the  eigenfunctions 

mrx  _x2, 

(11)  un(x,  t)  = Fn(x)Gn(t)  = Ancos— — e A“  (n  = 0,  1,  ••■) 

corresponding  to  the  eigenvalues  An  = cmr/L.  The  latter  are  as  before,  but  we  now  have  the  additional  eigenvalue 
Aq  = 0 and  eigenfunction  uq  = const,  which  is  the  solution  of  the  problem  if  the  initial  temperature  f{x)  is 
constant.  This  shows  the  remarkable  fact  that  a separation  constant  can  very  well  be  zero,  and  zero  can  be  an 
eigenvalue. 

Furthermore,  whereas  (8)  gave  a Fourier  sine  series,  we  now  get  from  (11)  a Fourier  cosine  series 


(12) 


A 77 


Its  coefficients  result  from  the  initial  condition  (3), 


^2,  H7TX 

u(x,  0)  = 2j  An  COS  J = /(*)» 

n=  0 ^ 


in  the  form  (2),  Sec.  11.3,  that  is, 


(13) 


A0  = “ fix)  dx, 
L JQ 


2 [ mrx 

An  = — f{x ) cos dx,  n — 1 , 2,  • • • . 

L J0  L 


“Triangular”  Initial  Temperature  in  a Bar  with  Insulated  Ends 

Find  the  temperature  in  the  bar  in  Example  3,  assuming  that  the  ends  are  insulated  (instead  of  being  kept  at 
temperature  0). 

Solution.  For  the  triangular  initial  temperature,  (13)  gives  Aq  = L/ 4 and  (see  also  Example  4 in  Sec.  11.3 
with  k = L/2) 


r L/2 

mrx 

fL 

mrx 

2 L 

^ mr  \ 

x cos dx  + 

(L 

— *)  cos dx 

2 cos cos  mr  — 1 

U0  L 

J 

JL/2 

L 

nz7Tz 

V 2 J 

Hence  the  solution  (12)  is 


L 8L  f 1 277* 

u(x,t)  = o i ~o  cos exp 

4 TT2  l22  L 


1 677* 

H — o cos exp 

6 2 L 


+ ••• 


}■ 


We  see  that  the  terms  decrease  with  increasing  t,  and  u — >L/4  as  t —■ * this  is  the  mean  value  of  the  initial 

temperature.  This  is  plausible  because  no  heat  can  escape  from  this  totally  insulated  bar.  In  contrast,  the  cooling 
of  the  ends  in  Example  3 led  to  heat  loss  and  u — > 0,  the  temperature  at  which  the  ends  were  kept. 
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Steady  Two-Dimensional  Heat  Problems. 

Laplace’s  Equation 

We  shall  now  extend  our  discussion  from  one  to  two  space  dimensions  and  consider  the 
two-dimensional  heat  equation 

dli  n n n f 5 U d U 

— = czVzu  = cz\  — ^ ^ 

dt  \dxz  dyz 

for  steady  (that  is,  time-independent)  problems.  Then  du/dt  = 0 and  the  heat  equation 
reduces  to  Laplace’s  equation 


(14) 


d2M  d2u 

+ 

-.2  -s  2 

ox  ay 


(which  has  already  occurred  in  Sec.  10.8  and  will  be  considered  further  in  Secs. 
12.8-12.11).  A heat  problem  then  consists  of  this  PDE  to  be  considered  in  some  region 
R of  the  Ay-plane  and  a given  boundary  condition  on  the  boundary  curve  C of  R.  This  is 

a boundary  value  problem  (BYP).  One  calls  it: 


First  BVP  or  Dirichlet  Problem  if  u is  prescribed  on  C (“Dirichlet  boundary 
condition”) 

Second  BVP  or  Neumann  Problem  if  the  normal  derivative  un  = du/dn  is 
prescribed  on  C (“Neumann  boundary  condition”) 

Third  BVP,  Mixed  BVP,  or  Robin  Problem  if  u is  prescribed  on  a portion  of  C 
and  un  on  the  rest  of  C (“Mixed  boundary  condition”). 


y 

u = fix) 

0 

R 

u = 0 

u = 0 

X 

0 a 

Fig.  296  Rectangle  R and  given  boundary  values 


Dirichlet  Problem  in  a Rectangle  R (Fig.  296).  We  consider  a Dirichlet  problem  for 
Laplace’s  equation  (14)  in  a rectangle  R,  assuming  that  the  temperature  u(x,y)  equals  a 
given  function  f(x)  on  the  upper  side  and  0 on  the  other  three  sides  of  the  rectangle. 

We  solve  this  problem  by  separating  variables.  Substituting  u(x,y)  = F(x)G(y)  into 
(14)  written  as  uxx  = ~uyy,  dividing  by  FG,  and  equating  both  sides  to  a negative 
constant,  we  obtain 
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I - __L  d*G 

F dx2  G dy2 


From  this  we  get 


d2F 
dx 2 


+ kF  = 0, 


and  the  left  and  right  boundary  conditions  imply 

F(  0)  = 0,  and  F(a)  = 0. 
This  gives  k = (hit/ a)2  and  corresponding  nonzero  solutions 


(15) 


F(x)  = Fn(x)  = sin  x. 


n = 1.2, 


The  ODE  for  G with  k = (mr/a)2  then  becomes 


Solutions  are 

G(y)  = Gn(y)  = Anen^a  + Bne~n^a. 

Now  the  boundary  condition  « = 0 on  the  lower  side  of  R implies  that  Gn( 0)  = 0;  that 
is,  Gn( 0)  = An  + Bn  = 0 or  Bn  = —An.  This  gives 

Gn(y)  = An(en^a  - e~n^a)  = 2An  sinh 
From  this  and  (15),  writing  2 An  = A:/,  we  obtain  as  the  eigenfunctions  of  our  problem 

yi'TTX  YlTTy 

(16)  un(x,  y)  = Fn(x)Gn(y)  = A*  sin  sinh  a . 

These  solutions  satisfy  the  boundary  condition  « = 0 on  the  left,  right,  and  lower  sides. 

To  get  a solution  also  satisfying  the  boundary  condition  u(x,  b ) = f(x)  on  the  upper 
side,  we  consider  the  infinite  series 

oo 

u(x,y)  = y/un(x,  y). 

n=l 

From  this  and  (16)  with  y = b we  obtain 

u(x,  b ) = fix)  = 2 An  sin  sinh 

n= 1 

We  can  write  this  in  the  form 

u(x,  b)  = 2 (a*  sinh  sin 

n= 1 ' ' 
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This  shows  that  the  expressions  in  the  parentheses  must  be  the  Fourier  coefficients  bn  of 
/(x);  that  is,  by  (4)  in  Sec.  11.3, 


bn  = sinh 


mrb  _ 2 
a a 


, ■ nlTx 
f(x)  sin ax. 


From  this  and  (16)  we  see  that  the  solution  of  our  problem  is 

(17) 


*(x,y)  = 2 un(x,y)  = 2 An 

n= 1 n= 1 


. nirx  . mry 

sin sinh 

a a 


where 


(18) 


A * 


2 

a sinh  (mrb/ a) 


a 

. ■ nirx 
f(x)  sin ax. 

■’o 


We  have  obtained  this  solution  formally,  neither  considering  convergence  nor  showing 
that  the  series  for  u,  uxx,  and  uyy  have  the  right  sums.  This  can  be  proved  if  one  assumes 
that /and/  are  continuous  and  f"  is  piecewise  continuous  on  the  interval  0 SxSa. 
The  proof  is  somewhat  involved  and  relies  on  uniform  convergence.  It  can  be  found  in 
[C4]  listed  in  App.  1. 

Unifying  Power  of  Methods.  Electrostatics,  Elasticity 

The  Laplace  equation  (14)  also  governs  the  electrostatic  potential  of  electrical  charges  in  any 
region  that  is  free  of  these  charges.  Thus  our  steady-state  heat  problem  can  also  be  interpreted 
as  an  electrostatic  potential  problem.  Then  (17),  (18)  is  the  potential  in  the  rectangle  R when 
the  upper  side  of  R is  at  potential /(x)  and  the  other  three  sides  are  grounded. 

Actually,  in  the  steady-state  case,  the  two-dimensional  wave  equation  (to  be  considered 
in  Secs.  12.8, 12.9)  also  reduces  to  (14).  Then  (17),  (18)  is  the  displacement  of  a rectangular 
elastic  membrane  (rubber  sheet,  drumhead)  that  is  fixed  along  its  boundary,  with  three 
sides  lying  in  the  xy- plane  and  the  fourth  side  given  the  displacement /(x). 

This  is  another  impressive  demonstration  of  the  unifying  power  of  mathematics.  It 
illustrates  that  entirely  different  physical  systems  may  have  the  same  mathematical  model 
and  can  thus  be  treated  by  the  same  mathematical  methods. 


PRQBL  E^M~  S E-T  ~1^  6 


1.  Decay.  How  does  the  rate  of  decay  of  (8)  with  fixed 
n depend  on  the  specific  heat,  the  density,  and  the 
thermal  conductivity  of  the  material? 

2.  Decay.  If  the  first  eigenfunction  (8)  of  the  bar 
decreases  to  half  its  value  within  20  sec,  what  is  the 
value  of  the  diffusivity? 


3.  Eigenfunctions.  Sketch  or  graph  and  compare  the  first 
three  eigenfunctions  (8)  with  Bn  = 1,  c = 1,  and 
L = Trfort  = 0,  0.1,  0.2,  • ■ ■ , 1.0. 

4.  WRITING  PROJECT.  Wave  and  Heat  Equations. 
Compare  these  PDEs  with  respect  to  general  behavior 
of  eigenfunctions  and  kind  of  boundary  and  initial 
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conditions.  State  the  difference  between  Fig.  291  in 
Sec.  12.3  and  Fig.  295. 


5-7 


LATERALLY  INSULATED  BAR 


Find  the  temperature  u ( x , t)  in  a bar  of  silver  of  length 
10  cm  and  constant  cross  section  of  area  1 cm2  (density 
10.6  g/cm3,  thermal  conductivity  1.04  cal/(cm  sec  °C), 
specific  heat  0.056  cal/(g  °C)  that  is  perfectly  insulated 
laterally,  with  ends  kept  at  temperature  0°C  and  initial 
temperature /(a)  °C,  where 


5.  /( a)  = sin  0.1  ttx 

6.  f(x)  = 4 - 0.8 1 a-  - 5 1 


7.  f{x)  = a(10  — a) 


8.  Arbitrary  temperatures  at  ends.  If  the  ends  x = 0 
and  x = L of  the  bar  in  the  text  are  kept  at  constant 
temperatures  Ui  and  f/2,  respectively,  what  is  the  tem- 
perature u i(x)  in  the  bar  after  a long  time  (theoretically, 
as  t —*  »)?  First  guess,  then  calculate. 


9.  In  Prob.  8 find  the  temperature  at  any  time. 


10.  Change  of  end  temperatures.  Assume  that  the  ends 
of  the  bar  in  Probs.  5-7  have  been  kept  at  100°C  for  a 
long  time.  Then  at  some  instant,  call  it  t — 0,  the 
temperature  at  x = L is  suddenly  changed  to  0°C  and 
kept  at  0°C,  whereas  the  temperature  at  x = 0 is  kept 
at  100°C.  Find  the  temperature  in  the  middle  of  the  bar 
at  t = 1,  2,  3,  10,  50  sec.  First  guess,  then  calculate. 


BAR  UNDER  ADIABATIC  CONDITIONS 

“Adiabatic”  means  no  heat  exchange  with  the  neigh- 
borhood, because  the  bar  is  completely  insulated,  also  at 
the  ends.  Physical  Information:  The  heat  flux  at  the  ends 
is  proportional  to  the  value  of  du/dx  there. 

11.  Show  that  for  the  completely  insulated  bar,  nx(0,  t)  = 0, 
ux(L,  t ) = 0,  u{x,  t)  — fix)  and  separation  of  variables 
gives  the  following  solution,  with  An  given  by  (2)  in 
Sec.  11.3. 


u(x,  t)  = A„  + 2 An 
n= 1 


tlTTX 

COS e 

L 


—(cmr/LSyH 


12-15 


Find  the  temperature  in  Prob.  11  with  L = tt, 


c = 1,  and 


12.  fix)  = x 13.  fix)  = 1 

14.  fix)  — cos  2x  15.  fix)  = 1 — a/7 t 

16.  A bar  with  heat  generation  of  constant  rate  H ( > 0) 

is  modeled  by  ut  = c2uxx  + H.  Solve  this  problem  if 

L = tt  and  the  ends  of  the  bar  are  kept  at  0°C.  Hint. 
Set  u = v — Hxix  — tt )/ (2c2). 

17.  Heat  flux.  The  heat  flux  of  a solution  u (a,  t)  across  x = 0 
is  defined  by  <pit)  = —Kuxi0,t)-  Find  r/j it)  for  the 
solution  (9).  Explain  the  name.  Is  it  physically  under- 
standable that  (/>  goes  to  0 as  t —*  °°? 


18-25 


TWO-DIMENSIONAL  PROBLEMS 


18.  Laplace  equation.  Find  the  potential  in  the  rec- 
tangle 0 S a S 20,  0 S y = 40  whose  upper  side  is 
kept  at  potential  110  V and  whose  other  sides  are 
grounded. 


19.  Find  the  potential  in  the  square  0 S a S 2,  0 S y S 2 
if  the  upper  side  is  kept  at  the  potential  1000  sin  \ttx 
and  the  other  sides  are  grounded. 


20.  CAS  PROJECT.  Isotherms.  Find  the  steady-state 
solutions  (temperatures)  in  the  square  plate  in  Fig.  297 
with  a = 2 satisfying  the  following  boundary  condi- 
tions. Graph  isotherms. 


(a)  u = 80  sin  ttx  on  the  upper  side,  0 on  the  others. 

(b)  u = 0 on  the  vertical  sides,  assuming  that  the  other 
sides  are  perfectly  insulated. 


(c)  Boundary  conditions  of  your  choice  (such  that  the 
solution  is  not  identically  zero). 


y 

a 


a 


x 


Fig.  297.  Square  plate 


21.  Heat  flow  In  a plate.  The  faces  of  the  thin  square  plate 
in  Fig.  297  with  side  a = 24  are  perfectly  insulated. 
The  upper  side  is  kept  at  25  °C  and  the  other  sides  are 
kept  at  0°C.  Find  the  steady-state  temperature  uix,y) 
in  the  plate. 

22.  Find  the  steady-state  temperature  in  the  plate  in  Prob. 
21  if  the  lower  side  is  kept  at  Uq°C,  the  upper  side  at 
t/i°C,  and  the  other  sides  are  kept  at  0°C.  Hint:  Split 
into  two  problems  in  which  the  boundary  temperature 
is  0 on  three  sides  for  each  problem. 

23.  Mixed  boundary  value  problem.  Find  the  steady- 
state  temperature  in  the  plate  in  Prob.  2 1 with  the  upper 
and  lower  sides  perfectly  insulated,  the  left  side  kept 
at  0°C,  and  the  right  side  kept  at/(y)°C. 

24.  Radiation.  Find  steady-state  temperatures  in  the 
rectangle  in  Fig.  296  with  the  upper  and  left  sides 
perfectly  insulated  and  the  right  side  radiating  into  a 
medium  at  0°C  according  to  uxia,y)  + huia,y)  = 0, 
h > 0 constant.  (You  will  get  many  solutions  since  no 
condition  on  the  lower  side  is  given.) 

25.  Find  formulas  similar  to  (17),  (18)  for  the  temperature 
in  the  rectangle  R of  the  text  when  the  lower  side  of  R 
is  kept  at  temperature  /(a)  and  the  other  sides  are  kept 
at  0°C. 
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12.  Heat  Equation:  Modeling  Very  Long  Bars. 
Solution  by  Fourier  Integrals  and 
Transforms 

Our  discussion  of  the  heat  equation 


in  the  last  section  extends  to  bars  of  infinite  length,  which  are  good  models  of  very  long 
bars  or  wires  (such  as  a wire  of  length,  say,  300  ft).  Then  the  role  of  Fourier  series  in  the 
solution  process  will  be  taken  by  Fourier  integrals  (Sec.  11.7). 

Let  us  illustrate  the  method  by  solving  (1)  for  a bar  that  extends  to  infinity  on  both 
sides  (and  is  laterally  insulated  as  before).  Then  we  do  not  have  boundary  conditions,  but 
only  the  initial  condition 

(2)  u(x,  0)  = f(x)  (-00  < X < °o) 

where  fix)  is  the  given  initial  temperature  of  the  bar. 

To  solve  this  problem,  we  start  as  in  the  last  section,  substituting  u{x,  t)  = F(x)Git) 
into  (1).  This  gives  the  two  ODEs 

(3)  F"  + p2F  = 0 [see  (5),  Sec.  12.6] 

and 

(4)  G + czp2G  = 0 [see  (6),  Sec.  12.6], 

Solutions  are 


Fix)  = A cos  px  + B sin  px  and  Git)  = e 


respectively,  where  A and  B are  any  constants.  Hence  a solution  of  (1)  is 

(5)  m(x,  t;  p)  = FG  = (A  cos  px  + B sin  px)  e~c  p 1 . 

Here  we  had  to  choose  the  separation  constant  k negative,  k = — p2,  because  positive 
values  of  k would  lead  to  an  increasing  exponential  function  in  (5),  which  has  no  physical 
meaning. 


Use  of  Fourier  Integrals 

Any  series  of  functions  (5),  found  in  the  usual  manner  by  taking  p as  multiples  of  a fixed 
number,  would  lead  to  a function  that  is  periodic  in  x when  t = 0.  However,  since  f(x) 
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in  (2)  is  not  assumed  to  be  periodic,  it  is  natural  to  use  Fourier  integrals  instead  of  Fourier 
series.  Also,  A and  B in  (5)  are  arbitrary  and  we  may  regard  them  as  functions  of  p,  writing 
A = A(p)  and  B = B(P).  Now,  since  the  heat  equation  (1)  is  linear  and  homogeneous, 
the  function 


(6)  n(x,  t)  = 


u (x,  t\  p)  dp  = 


[A(p)  cos  px  + B(p)  sin  px]  e c p t dp 


is  then  a solution  of  (1),  provided  this  integral  exists  and  can  be  differentiated  twice  with 
respect  to  x and  once  with  respect  to  t. 

Determination  of  A(p)  and  B( p)  from  the  Initial  Condition.  From  (6)  and  (2)  we  get 


(7) 


n(x,  0)  = 


[A(p)  cos  px  + B(p)  sin  px]  dp  = /(x). 


Jo 


This  gives  A(p)  and  B(p)  in  terms  of /(x);  indeed,  from  (4)  in  Sec.  11.7  we  have 


(8) 


A(p)  = 


77 


f(v)  cos  pv  dv,  B(p)  = 


77 


f(v ) sin  pv  dv. 


According  to  (1*),  Sec.  11.9,  our  Fourier  integral  (7)  with  these  A(p)  and  B ( p)  can  be 
written 


u(x,  0)  = — 

77 


Similarly,  (6)  in  this  section  becomes 

1 


f(v)  cos  ( px  — pv)  dv 


dp. 


u(x,  t)  = 


77 


"0  L 


f{v)  cos  (px  — pv)e  c p 1 dv 


dp. 


Assuming  that  we  may  reverse  the  order  of  integration,  we  obtain 

1 


(9) 


m(x,  t)  = 


77 


m 

CO 

e~c  p f cos  (px  — pv)  dp 

X 

^0 

dv. 


Then  we  can  evaluate  the  inner  integral  by  using  the  formula 

(10) 


„2  V77 

e cos  2 bs  ds  = e 

2 


[A  derivation  of  (10)  is  given  in  Problem  Set  16.4  (Team  Project  24).]  This  takes  the  form 
of  our  inner  integral  if  we  choose  p = s/{c\/~t ) as  a new  variable  of  integration  and  set 


b = 


x — v 
2 cVt' 
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Then  2 bs  = (x  — v)p  and  ds  = c\Zt  dp,  so  that  (10)  becomes 


Jo 


-c'Vt  / \ i v 77  f (x  ~ V)Z  \ 

e p cos  (px  — pv)  dp  = exp  s > . 

‘ " I 4c2f  J 


2cVt 


By  inserting  this  result  into  (9)  we  obtain  the  representation 


(ID 


u( x,  t)  = 


1 


2 c\Zrrt 


dv. 


Taking  z = (v  — x)/(2c\rt)  as  a variable  of  integration,  we  get  the  alternative  form 

- oo 

1 


(12) 


u(x,  t ) = 


V77  . 


f(x  + 2 czVt)  e " dz. 


If  f(x)  is  bounded  for  all  values  of  x and  integrable  in  every  finite  interval,  it  can  be 
shown  (see  Ref.  [CIO])  that  the  function  (11)  or  (12)  satisfies  (1)  and  (2).  Hence  this 
function  is  the  required  solution  in  the  present  case. 

Temperature  in  an  Infinite  Bar 

Find  the  temperature  in  the  infinite  bar  if  the  initial  temperature  is  (Fig.  298) 

f Uq  = const  if  \x\  < 1, 

/W=i  0 if  \x\  > t. 


fix) 

Uo 

-1  1 * 


Fig.  298.  Initial  temperature  in  Example  1 


Solution.  From  (11)  we  have 


U0  [l  \ (x-vf\ 

u( x,  t)  = exp  s > dv. 

2cVtt7  J_i  l 4c2t  ) 

If  we  introduce  the  above  variable  of  integration  z,  then  the  integration  over  v from  —1  to  1 corresponds  to  the 
integration  over  z from  (—  1 — x)/(2cV?)  to  (1  — x)/(2 cVf),  and 


(13) 


u(x,  t)  = 


Up 
Vt t 


(l-i)/(2cVt) 


e dz 


(l+l)/(2cVt) 


(t  > 0). 


We  mention  that  this  integral  is  not  an  elementary  function,  but  can  be  expressed  in  terms  of  the  error 
function,  whose  values  have  been  tabulated.  (Table  A4  in  App.  5 contains  a few  values;  larger  tables  are 
listed  in  Ref.  [GenRefl]  in  App.  1.  See  also  CAS  Project  1,  p.  574.)  Figure  299  shows  u(x,  t ) for  I/0  = 100°C, 
r2  = 1 cm2/sec,  and  several  values  of  t. 
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EXAMPLE  2 


Fig.  299  Solution  u(x,  t)  in  Example  1 for  U0  = 100°C, 
c2  = 1 cmVsec,  and  several  values  of  t 

Use  of  Fourier  Transforms 

The  Fourier  transform  is  closely  related  to  the  Fourier  integral,  from  which  we  obtained  the 
transform  in  Sec.  11.9.  And  the  transition  to  the  Fourier  cosine  and  sine  transform  in  Sec. 
11.8  was  even  simpler.  (You  may  perhaps  wish  to  review  this  before  going  on.)  Hence  it 
should  not  surprise  you  that  we  can  use  these  transforms  for  solving  our  present  or  similar 
problems.  The  Fourier  transform  applies  to  problems  concerning  the  entire  axis,  and  the 
Fourier  cosine  and  sine  transforms  to  problems  involving  the  positive  half-axis.  Let  us  explain 
these  transform  methods  by  typical  applications  that  fit  our  present  discussion. 

Temperature  in  the  Infinite  Bar  in  Example  1 

Solve  Example  1 using  the  Fourier  transform. 

Solution.  The  problem  consists  of  the  heat  equation  (1)  and  the  initial  condition  (2),  which  in  this  example  is 
f(x ) = Uq  = const  if  \x\  < 1 and  0 otherwise. 

Our  strategy  is  to  take  the  Fourier  transform  with  respect  to  x and  then  to  solve  the  resulting  ordinary  DE  in  t. 
The  details  are  as  follows. 

Let  u — cF( u ) denote  the  Fourier  transform  of  u,  regarded  as  a function  of  x.  From  (10)  in  Sec.  1 1.9  we  see 
that  the  heat  equation  (1)  gives 

s '(ut)  = cz<$(uxx)  = c\-w2mu)  = -czw2u. 

On  the  left,  assuming  that  we  may  interchange  the  order  of  differentiation  and  integration,  we  have 

*■■>  - vfe ['■*-*'*  - vfe  5 - f ■ 


Thus 


du 

dt 


Since  this  equation  involves  only  a derivative  with  respect  to  t but  none  with  respect  to  w,  this  is  a first-order 
ordinary  DE,  with  t as  the  independent  variable  and  w as  a parameter.  By  separating  variables  (Sec.  1.3)  we 
get  the  general  solution 


u (w,  t)  = C ( w)e  c w f 
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with  the  arbitrary  “constant”  C (w)  depending  on  the  parameter  w.  The  initial  condition  (2)  yields  the  relationship 
u(w,  0)  = C(w)  = /(w)  = S'(f).  Our  intermediate  result  is 


u(w,  t)  = f(w)e 


— C2W2t 


The  inversion  formula  (7),  Sec.  11.9,  now  gives  the  solution 
(14) 


u (x,  t ) = — — I /(w)  e~cVt  eiwx  dw. 

V2tt  J__ 


In  this  solution  we  may  insert  the  Fourier  transform 


/(w)  = 


1 


f(v)ewwdv. 


V2tt 

Assuming  that  we  may  invert  the  order  of  integration,  we  then  obtain 


u(x,  t)  = -1-  I f(v) 
2tt 


e-c*wH  eKwx-wv)dw 


dv. 


By  the  Euler  formula  (3).  Sec.  11.9,  the  integrand  of  the  inner  integral  equals 

e~c  w 1 cos  (wx  — wv)  + ie~c  w f sin  (wx  — wv). 

We  see  that  its  imaginary  part  is  an  odd  function  of  w,  so  that  its  integral  is  0.  (More  precisely,  this  is  the 
principal  part  of  the  integral;  see  Sec.  16.4.)  The  real  part  is  an  even  function  of  w,  so  that  its  integral  from  — °o 
to  oo  equals  twice  the  integral  from  0 to  °o; 


u(x,t)  = 2 I f(v) 


e c w 4 cos  (wx  — wv)  dw 


dv. 


This  agrees  with  (9)  (with  p = w)  and  leads  to  the  further  formulas  (11)  and  (13). 

Solution  in  Example  1 by  the  Method  of  Convolution 

Solve  the  heat  problem  in  Example  1 by  the  method  of  convolution. 

Solution.  The  beginning  is  as  in  Example  2 and  leads  to  (14),  that  is. 


(15) 


u (x,  I)  = ^2=  j f(w)e~crw^eTWX  dw. 


Now  comes  the  crucial  idea.  We  recognize  that  this  is  of  the  form  (13)  in  Sec.  11.9,  that  is, 
(16)  u(x,t)  = (f*g)(x)  = [ f(w)g(w)elwx  dw 


where 


(17) 


~ 1 —c2m‘2t 

g(w)  = ^—e 

V2tt 


Since,  by  the  definition  of  convolution  [(11),  Sec.  1 1.9], 


(IB) 


(f*g)(x)=  f(p)g(x  - p)dp. 
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as  our  next  and  last  step  we  must  determine  the  inverse  Fourier  transform  g of  g.  For  this  we  can  use  formula 
9 in  Table  III  of  Sec.  11.10, 


me-™?)  = — 1 — c-^2/<4«) 

V2 a 

with  a suitable  a.  With  c2t  = 1/(4 a)  or  a = 1/(4 c2t),  using  (17)  we  obtain 

= V2 

Hence  g has  the  inverse 


1 £-*■/( AcH) 

Vl^tVlTT 


Replacing  x with  x — p and  substituting  this  into  (18)  we  finally  have 


(19) 


u(x,  t)  = ( f*g)(x ) = 1 

2c\Zrrt 


| /(P)exp 


{ 


U-pf] 

4c2,  > 


dp. 


This  solution  formula  of  our  problem  agrees  with  (11).  We  wrote  (/  *g)( x),  without  indicating  the  parameter  t 
with  respect  to  which  we  did  not  integrate. 

Fourier  Sine  Transform  Applied  to  the  Heat  Equation 

If  a laterally  insulated  bar  extends  from  x = 0 to  infinity,  we  can  use  the  Fourier  sine  transform.  We  let  the 
initial  temperature  be  u (x,  0)  = f(x)  and  impose  the  boundary  condition  u (0,  t ) = 0.  Then  from  the  heat  equation 
and  (9b)  in  Sec.  11.8,  since  /(0)  = w(0,  0)  = 0,  we  obtain 

= -r^  = c2<$s(uxx)  = -c2w2^s(u)  = -c2w2us(w,t). 
at 

This  is  a first-order  ODE  dus/ dt  + c2w2us  = 0.  Its  solution  is 

us(w,  t ) = C(w)e~c  w t. 

From  the  initial  condition  u(x,  0)  = f(x)  we  have  us(w,  0)  =^(w)  = C(w).  Hence 

«sO,  f)  =fs(w)e~c  w ‘. 

Taking  the  inverse  Fourier  sine  transform  and  substituting 


is  (w)  = A / — f(p)  sin  wp  dp 


on  the  right,  we  obtain  the  solution  formula 


(20) 


u (x,  t) 


2_ 

77 


f(p)  sin  wp  e c w f sin  wx  dp  dw. 
■* o 


Figure  300  shows  (20)  with  c = 1 for  f(x)  = 1 if  0 ^ x = 1 and  0 otherwise,  graphed  over  the  xf-plane  for 
0 ^ x ^ 2,  0.01  ^ t ^ 1.5.  Note  that  the  curves  of  u(x,  t ) for  constant  t resemble  those  in  Fig.  299. 
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Partial  Differential  Equations  (PDEs) 


Fig.  300.  Solution  (20)  in  Example  4 


FRhQB  t^Tfl-SET- 1-2^7 


1.  CAS  PROJECT.  Heat  Flow,  (a)  Graph  the  basic 
Fig.  299. 

(b)  In  (a)  apply  animation  to  “see”  the  heat  flow  in 
terms  of  the  decrease  of  temperature. 

(c)  Graph  u ( x , t)  with  c = 1 as  a surface  over  a 
rectangle  of  the  form  —a  < x < a,  0 < y < b. 


2-8 


SOLUTION 
IN  INTEGRAL  FORM 


Using  (6),  obtain  the  solution  of  (1)  in  integral  form 
satisfying  the  initial  condition  u ( x , 0)  = fix),  where 

2.  fix)  =1  if  \x\  < a and  0 otherwise 

3.  f(x)  = 1/(1  + x2). 

Hint.  Use  (15)  in  Sec.  11.7. 

4.  fix)  = e~u 

5.  fix)  = \x\  if  \x\  < 1 and  0 otherwise 

6.  fix)  = x if  |jc|  < 1 and  0 otherwise 

7.  fix)  = (sin  x)/x. 

Hint.  Use  Prob.  4 in  Sec.  11.7. 


8.  Verify  that  u in  the  solution  of  Prob.  7 satisfies  the 
initial  condition. 


9-12 


CAS  PROJECT.  Error  Function. 


(21) 


erf  x = 


Vtt  J0 


dw 


9.  Graph  the  bell-shaped  curve  [the  curve  of  the  inte- 
grand in  (21)].  Show  that  erf  x is  odd.  Show  that 

,.,2  Vtt 

~w  dw  = — (erf  6 - erf  a). 


e w dw  = Vtt  erf  b. 


10.  Obtain  the  Maclaurin  series  of  erf  x from  that  of  the 
integrand.  Use  that  series  to  compute  a table  of  erf  x 
for  x — 0(0.01)3  (meaning  x = 0,  0.01,  0.02,  • ■ ■ , 3). 

11.  Obtain  the  values  required  in  Prob.  10  by  an  integration 
command  of  your  CAS.  Compare  accuracy. 

12.  It  can  be  shown  that  erf  (°°)  = 1.  Confirm  this  experi- 
mentally by  computing  erf  x for  large  x. 

13.  Let  fix)  = 1 when  x > 0 and  0 when  x < 0.  Using 
erf  (°o)  = 1,  show  that  (12)  then  gives 


u (x,  t) 


1 

Vtt 


-x2 


dz 


-x/(2  cVt) 


it  > 0). 


14.  Express  the  temperature  (13)  in  terms  of  the  error 
function. 


15.  Show  that  tbfx) 


1 

VTt t 


This  function  is  important  in  applied  mathematics  and 
physics  (probability  theory  and  statistics,  thermodynamics, 
etc.)  and  fits  our  present  discussion.  Regarding  it  as  a typical 
case  of  a special  function  defined  by  an  integral  that  cannot 
be  evaluated  as  in  elementary  calculus,  do  the  following. 


Here,  the  integral  is  the  definition  of  the  “distribution 
function  of  the  normal  probability  distribution”  to  be 
discussed  in  Sec.  24.8. 
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12.8  Modeling:  Membrane, 

Two-Dimensional  Wave  Equation 

Since  the  modeling  here  will  be  similar  to  that  of  Sec.  12.2,  you  may  want  to  take  another 
look  at  Sec.  12.2. 

The  vibrating  string  in  Sec.  12.2  is  a basic  one-dimensional  vibrational  problem.  Equally 
important  is  its  two-dimensional  analog,  namely,  the  motion  of  an  elastic  membrane,  such 
as  a drumhead,  that  is  stretched  and  then  fixed  along  its  edge.  Indeed,  setting  up  the  model 
will  proceed  almost  as  in  Sec.  12.2. 

Physical  Assumptions 

1.  The  mass  of  the  membrane  per  unit  area  is  constant  (“homogeneous  membrane”). 
The  membrane  is  perfectly  flexible  and  offers  no  resistance  to  bending. 

2.  The  membrane  is  stretched  and  then  fixed  along  its  entire  boundary  in  the  xy-plane. 
The  tension  per  unit  length  T caused  by  stretching  the  membrane  is  the  same  at  all 
points  and  in  all  directions  and  does  not  change  during  the  motion. 

3.  The  deflection  u (x,  y,  t)  of  the  membrane  during  the  motion  is  small  compared  to 
the  size  of  the  membrane,  and  all  angles  of  inclination  are  small. 

Although  these  assumptions  cannot  be  realized  exactly,  they  hold  relatively  accurately  for 
small  transverse  vibrations  of  a thin  elastic  membrane,  so  that  we  shall  obtain  a good 
model,  for  instance,  of  a drumhead. 

Derivation  of  the  PDE  of  the  Model  (“Two-Dimensional  Wave  Equation”)  from  Forces. 

As  in  Sec.  12.2  the  model  will  consist  of  a PDE  and  additional  conditions.  The  PDE  will  be 
obtained  by  the  same  method  as  in  Sec.  12.2,  namely,  by  considering  the  forces  acting  on  a 
small  portion  of  the  physical  system,  the  membrane  in  Fig.  301  on  the  next  page,  as  it  is 
moving  up  and  down. 

Since  the  deflections  of  the  membrane  and  the  angles  of  inclination  are  small,  the  sides 
of  the  portion  are  approximately  equal  to  Ax  and  Ay.  The  tension  T is  the  force  per  unit 
length.  Hence  the  forces  acting  on  the  sides  of  the  portion  are  approximately  T Ax  and 
T Ay.  Since  the  membrane  is  perfectly  flexible,  these  forces  are  tangent  to  the  moving 
membrane  at  every  instant. 

Horizontal  Components  of  the  Forces.  We  first  consider  the  horizontal  components 
of  the  forces.  These  components  are  obtained  by  multiplying  the  forces  by  the  cosines  of 
the  angles  of  inclination.  Since  these  angles  are  small,  their  cosines  are  close  to  1 . Hence 
the  horizontal  components  of  the  forces  at  opposite  sides  are  approximately  equal. 
Therefore,  the  motion  of  the  particles  of  the  membrane  in  a horizontal  direction  will  be 
negligibly  small.  From  this  we  conclude  that  we  may  regard  the  motion  of  the  membrane 
as  transversal;  that  is,  each  particle  moves  vertically. 

Vertical  Components  of  the  Forces.  These  components  along  the  right  side  and  the 
left  side  are  (Fig.  301),  respectively, 

T Ay  sin  [3  and  — T Ay  sin  a. 

Here  a and  (3  are  the  values  of  the  angle  of  inclination  (which  varies  slightly  along  the 
edges)  in  the  middle  of  the  edges,  and  the  minus  sign  appears  because  the  force  on  the 
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Membrane 


y + Ay  -- 


y -- 


X 


x + Ax 


TAx 


TAy 


u 


Fig.  301.  Vibrating  membrane 


left  side  is  directed  downward.  Since  the  angles  are  small,  we  may  replace  their  sines  by 
their  tangents.  Hence  the  resultant  of  those  two  vertical  components  is 


where  subscripts  x denote  partial  derivatives  and  y1  and  y2  are  values  between  y and 
y + Ay.  Similarly,  the  resultant  of  the  vertical  components  of  the  forces  acting  on  the 
other  two  sides  of  the  portion  is 


where  x\  and  x2  are  values  between  x and  x + Ax. 

Newton’s  Second  Law  Gives  the  PDE  of  the  Model.  By  Newton’s  second  law  (see 
Sec.  2.4)  the  sum  of  the  forces  given  by  (1)  and  (2)  is  equal  to  the  mass  pAA  of  that 
small  portion  times  the  acceleration  d2u/dt2\  here  p is  the  mass  of  the  undeflected 
membrane  per  unit  area,  and  A A = Ax  Ay  is  the  area  of  that  portion  when  it  is  unde- 
flected.  Thus 


where  the  derivative  on  the  left  is  evaluated  at  some  suitable  point  (x,  y)  corresponding 
to  that  portion.  Division  by  pAxAy  gives 


(1) 


T Ay  (sin  ^ — sin  a)  ~ T Ay  (tan  /3  — tan  a) 

= T Ay  [mx(x  + Ax,  y^  - ux(x,  y2)] 


(2) 


TAx [uy(xi,  y + Ay)  - uy(x2,y)] 
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d2u 

at2 


T Ux(x  + Ax,  V!)  - ux(x,y2)  + uy(x  1-  y + Ay)  uy(x2,y) 
P Ax  A y 


If  we  let  Ax  and  Ay  approach  zero,  we  obtain  the  PDE  of  the  model 


(3) 


d2U  o 1 

/ ^2 
' d u 

d U 

= C 

— + — 

dt2 

Vdx2 

T 2 

dy 

T 

P' 


This  PDE  is  called  the  two-dimensional  wave  equation.  The  expression  in  parentheses 
is  the  Laplacian  A2h  of  u (Sec.  10.8).  Hence  (3)  can  be  written 


(3') 


Solutions  of  the  wave  equation  (3)  will  be  obtained  and  discussed  in  the  next  section. 


12/  Rectangular  Membrane. 
Double  Fourier  Series 


Now  we  develop  a solution  for  the  PDE  obtained  in  Sec.  12.8.  Details  are  as  follows. 

The  model  of  the  vibrating  membrane  for  obtaining  the  displacement  u{x,  y,  t)  of  a point 
(x,  y)  of  the  membrane  from  rest  ( u = 0)  at  time  t is 

d2u  o ( d2w  d2u\ 

(1) 

(2)  u = 0 on  the  boundary 

(3a)  u (x,  y,  0)  = /(x,  y) 

(3b)  «t(x,  y,  0)  = g(x,  y). 


y 


R 


a 

Fig.  302. 

Rectangular 

membrane 


Here  (1)  is  the  two-dimensional  wave  equation  with  c2  = T/p  just  derived,  (2)  is 
the  boundary  condition  (membrane  fixed  along  the  boundary  in  the  xv-plane  for 
all  times  t g 0),  and  (3)  are  the  initial  conditions  at  t = 0,  consisting  of  the  given 
initial  displacement  (initial  shape)  /(x,  y)  and  the  given  initial  velocity  g(x,  y),  where 
ut  = du/dt.  We  see  that  these  conditions  are  quite  similar  to  those  for  the  string  in 
Sec.  12.2. 

Let  us  consider  the  rectangular  membrane  R in  Fig.  302.  This  is  our  first  important 
model.  It  is  much  simpler  than  the  circular  drumhead,  which  will  follow  later.  First  we 
note  that  the  boundary  in  equation  (2)  is  the  rectangle  in  Fig.  302.  We  shall  solve  this 
problem  in  three  steps: 
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Step  1.  By  separating  variables,  first  setting  u(x,  y,  t)  = Fix,  y)G(t)  and  later 
F(x,  y)  = H{x)Q{y)  we  obtain  from  (1)  an  ODE  (4)  for  G and  later  from  a PDE  (5)  for  F 
two  ODEs  (6)  and  (7)  for  FI  and  Q. 

Step  2.  From  the  solutions  of  those  ODEs  we  determine  solutions  (13)  of  (1) 
(“eigenfunctions”  umn)  that  satisfy  the  boundary  condition  (2). 

Step  3.  We  compose  the  umn  into  a double  series  (14)  solving  the  whole  model  (1), 
(2),  (3). 


Step  1.  Three  ODEs  From  the  Wave  Equation  (1) 

To  obtain  ODEs  from  (1),  we  apply  two  successive  separations  of  variables.  In  the  first 
separation  we  set  u(x,  y,  t)  = F(x,  y)G(t).  Substitution  into  (1)  gives 


FG  = c\FxxG  + FyyG ) 


where  subscripts  denote  partial  derivatives  and  dots  denote  derivatives  with  respect  to  t. 
To  separate  the  variables,  we  divide  both  sides  by  c2FG: 


G 

2/^ 
c G 


, (y xx  T fv/7/i- 


Since  the  left  side  depends  only  on  t,  whereas  the  right  side  is  independent  of  t,  both  sides 
must  equal  a constant.  By  a simple  investigation  we  see  that  only  negative  values  of  that 
constant  will  lead  to  solutions  that  satisfy  (2)  without  being  identically  zero;  this  is  similar 

o 

to  Sec.  12.3.  Denoting  that  negative  constant  by  — v , we  have 

^ = —(F  + F 1 = —v2 

'n1  1 yyJ  v ■ 

C G r 


This  gives  two  equations:  for  the  “time  function”  G(t ) we  have  the  ODE 

(4)  G + A2G  = 0 where  A = cv, 

and  for  the  “amplitude  function”  F(x,  y)  a PDE,  called  the  two-dimensional  Helmholtz3 
equation 

(5)  Fxx  + Fyy  + v2F  = 0. 


3HERMANN  VON  HELMHOLTZ  (1821-1894),  German  physicist,  known  for  his  fundamental  work  in 
thermodynamics,  fluid  flow,  and  acoustics. 


SEC.  12.9  Rectangular  Membrane.  Double  Fourier  Series 


579 


Separation  of  the  Helmholtz  equation  is  achieved  if  we  set  F(x,y ) = H(x)Qiy).  By 
substitution  of  this  into  (5)  we  obtain 


Both  sides  must  equal  a constant,  by  the  usual  argument.  This  constant  must  be  negative, 
say,  — k2,  because  only  negative  values  will  lead  to  solutions  that  satisfy  (2)  without  being 
identically  zero.  Thus 


General  solutions  of  (6)  and  (7)  are 

I fix)  = A cos  kx  + B sin  kx  and  Q(y)  = C cos  py  + D sin  py 

with  constant  A,  B,  C,  D.  From  u = FG  and  (2)  it  follows  that  F = HQ  must  be  zero  on 
the  boundary,  that  is,  on  the  edges  x = 0,  x = a,  y = 0,  y = b\  see  Fig.  302.  This  gives 
the  conditions 


Hence  H{ 0)  = A = 0 and  then  H(a)  = B sin  ka  = 0.  Here  we  must  take  B F 0 since 
otherwise  H(x)  = 0 and  Fix,  y)  = 0.  Hence  sin  ka  = 0 or  ka  = imr,  that  is. 


To  separate  the  variables,  we  divide  both  sides  by  HQ,  finding 


This  yields  two  ODEs  for  H and  Q,  namely, 


(6) 


and 


(7) 


where  p2  = v2  — k2. 


Step  2.  Satisfying  the  Boundary  Condition 


H(  0)  = 0,  H(a)  = 0,  g(0)  = 0,  Q(b)  = 0. 


(m  integer). 
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In  precisely  the  same  fashion  we  conclude  that  C = 0 and  p must  be  restricted  to  the 
values  p = mr/b  where  n is  an  integer.  We  thus  obtain  the  solutions  H = Hm,  Q = Qn, 
where 


Hm[x)  = sin 


mTTX 


and 


^ • wiry 

Qn(y ) = Sin 


m — 1,  2, 


b n = 1,2, 


As  in  the  case  of  the  vibrating  string,  it  is  not  necessary  to  consider  m,  n = — 1,  —2,  • • • 
since  the  corresponding  solutions  are  essentially  the  same  as  for  positive  m and  n,  expect 
for  a factor  — 1 . Hence  the  functions 


(8) 


nnrx  mry 

Fmn(x,y ) = Hm(x)Qn(y)  = sin sin——, 

a b 


m = 1,2, 
n = 1,2, 


are  solutions  of  the  Helmholtz  equation  (5)  that  are  zero  on  the  boundary  of  our  membrane. 

Eigenfunctions  and  Eigenvalues.  Having  taken  care  of  (5),  we  turn  to  (4).  Since 
2 2 2 

p = v — k in  (7)  and  A = cv  in  (4),  we  have 


A = cVk2  + p2. 

Hence  to  k = rmr/a  and  p = mr/b  there  corresponds  the  value 


(9) 


I m2  n2 

A A rn/n.  c77a  ' + 72  ’ 

a b 


m = 1,2, 
n = 1,2,  •• 


in  the  ODE  (4).  A corresponding  general  solution  of  (4)  is 

d )>l:r l (k)  l>mn  COS  Amr,/  y B 7,n  ^ ^ krn  r!  t. 

It  follows  that  the  functions  umn(x,  y,  t)  = Fmn(x,  y)  Gmn{t),  written  out 


mTTx  . mry 

(10)  urarL{x,  y.  t)  (Bmn  cos  Amn?  + B fnn  sin  A mnt)  sin  sin 

a b 

with  Amr,  according  to  (9),  are  solutions  of  the  wave  equation  (1)  that  are  zero  on 
the  boundary  of  the  rectangular  membrane  in  Fig.  302.  These  functions  are  called  the 
eigenfunctions  or  characteristic  functions,  and  the  numbers  Amm  are  called  the 
eigenvalues  or  characteristic  values  of  the  vibrating  membrane.  The  frequency  of  umn 
is  Amn/277. 

Discussion  of  Eigenfunctions.  It  is  very  interesting  that,  depending  on  a and  b,  several 
functions  Fmn  may  correspond  to  the  same  eigenvalue.  Physically  this  means  that  there 
may  exists  vibrations  having  the  same  frequency  but  entirely  different  nodal  lines  (curves 
of  points  on  the  membrane  that  do  not  move).  Let  us  illustrate  this  with  the  following 
example. 
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EXAMPLE  1 


Eigenvalues  and  Eigenfunctions  of  the  Square  Membrane 

Consider  the  square  membrane  with  a = b = 1 . From  (9)  we  obtain  its  eigenvalues 

(11)  Amn  = c7rVm2  + n2. 

Hence  Amn  = Anm,  but  for  m A n the  corresponding  functions 

Fmn  — sin  nnrx  sin  firry  and  Fnm  — sin  nrrx  sin  miry 

are  certainly  different.  For  example,  to  A12  — A21  = C77  V5  there  correspond  the  two  functions 

F12  = sin  77 A'  sin  27Ty  and  F21  = sin  277 a sin  77 y. 

Hence  the  corresponding  solutions 

U12  = (#12  cos  c77  V5r  + B12  sin  c77 V5r)Fi2  and  W21  — (#21  cos  c77  V5r  + B21  sin  c77  V5f)^2i 

have  the  nodal  lines  y = | and  x = respectively  (see  Fig.  303).  Taking  B12  = 1 and  5*2  = #21  = 0,  we 
obtain 

(12)  M12  + u2i  = cos  c7rV5t(F12  + 52iF2i) 

which  represents  another  vibration  corresponding  to  the  eigenvalue  C77  V5.  The  nodal  line  of  this  function  is  the 
solution  of  the  equation 

F12  + B21F21  = sin  77a  sin  277y  + B21  sin  2irx  sin  77y  = 0 
or,  since  sin  2a  = 2 sin  a cos  a, 

(13)  sin  7Tx  sin  77y(cos  Try  + B21  cos  77x)  = 0. 

This  solution  depends  on  the  value  of  B21  (see  Fig.  304). 

From  (11)  we  see  that  even  more  than  two  functions  may  correspond  to  the  same  numerical  value  of  A mri. 
For  example,  the  four  functions  Fig,  Fgi,  F47,  and  F74  correspond  to  the  value 

Am  = A81  = A47  = A74  = C77  V65,  because  l2  + 82  = 42  + 72  = 65. 

This  happens  because  65  can  be  expressed  as  the  sum  of  two  squares  of  positive  integers  in  several  ways. 
According  to  a theorem  by  Gauss,  this  is  the  case  for  every  sum  of  two  squares  among  whose  prime  factors 
there  are  at  least  two  different  ones  of  the  form  4 n + 1 where  n is  a positive  integer.  In  our  case  we  have 
65  = 5 ■ 13  = (4  + 1)(  12  + 1).  ■ 


Fig.  303.  Nodal  lines  of  the  solutions 
M11,  m12,  u2V  u22,  un,  u31  in  the  case  of 
the  square  membrane 


Fig.  304.  Nodal  lines 
of  the  solution  (12)  for 
some  values  of  B21 
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Step  3.  Solution  of  the  Model  (1),  (2),  (3). 

Double  Fourier  Series 

So  far  we  have  solutions  (10)  satisfying  (1)  and  (2)  only.  To  obtain  the  solutions  that  also 
satisfies  (3),  we  proceed  as  in  Sec.  12.3.  We  consider  the  double  series 

oo  oo 

y>  0 

m = 1 n= 1 

^ ^ / r)  x , x • m7Tx  ■ n7Ty 

x 7 / 7 \Bmn  cos  Amnt  4-  B rnn  sin  Amrz/)  sin  sin 

i ! a b 

m=  1 n=l 

(without  discussing  convergence  and  uniqueness).  From  (14)  and  (3a),  setting  t = 0,  we 
have 

4°^  77177  V YlTTy 

(15)  M (x,  y,  0)  = 2)  2 sin  sin  TT  = y)- 

m=  1 n=l 


n (x,  y,  t)  = 

(14) 


Suppose  that/(x,  y)  can  be  represented  by  (15).  (Sufficient  for  this  is  the  continuity  of 
f df/dx,  df/dy,  d2f/dx  dy  in  R.)  Then  (15)  is  called  the  double  Fourier  series  of  fix,  y). 
Its  coefficients  can  be  determined  as  follows.  Setting 

oc  7177 

(16)  Km(y)  ^mn  sin  ^ 

n=l 

we  can  write  (15)  in  the  form 

fix , y)  = 2 Km(y)  sin 

m=  1 


For  fixed  y this  is  the  Fourier  sine  series  of/(x,  y),  considered  as  a function  of  x.  From 
(4)  in  Sec.  11.3  we  see  that  the  coefficients  of  this  expansion  are 


(17) 


K'm(y) 


2 

a 


a 

fix,  y)  sin 


mTTx 
ax. 


Furthermore,  (16)  is  the  Fourier  sine  series  of  Km(y),  and  from  (4)  in  Sec.  1 1.3  it  follows 
that  the  coefficients  are 


B 


mn 


2 

b 


rb 

mry 

Kmiy ) sin  — dy. 
b 


From  this  and  (17)  we  obtain  the  generalized  Euler  formula 


(18) 


By. 


_4_ 

ab 


rb  a 

fix,  y)  sin 

J 0 0 


mTTx 

a 


sin 


niry 

dx  dy 

b 


m = 1,2, 

7i  = 1,2,  ••• 


for  the  Fourier  coefficients  of  f(x,  y)  in  the  double  Fourier  series  (15). 
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EXAMPLE  2 


The  Bmn  in  (14)  are  now  determined  in  terms  affix,  y).  To  determine  the  B* 
differentiate  (14)  termwise  with  respect  to  f;  using  (3b),  we  obtain 


we 


du 

dt 


= 22* 

t 0 m=l n= 1 


win  ^ run 


. lmrx  . mry 
sm sin  — - = g(x,  y). 


Suppose  that  g(x,  y ) can  be  developed  in  this  double  Fourier  series.  Then,  proceeding  as 
before,  we  find  that  the  coefficients  are 


(19) 


D y 


ab\n 


CL 

g(x,  y)  sin 

^0 


mTTx  . mry 

sm  — — ax  ay 

a a 


m = 1,  2,  • • • 
n = 1,  2,  • • • . 


Result.  If  f and  g in  (3)  are  such  that  u can  be  represented  by  (14),  then  (14)  with 
coefficients  (18)  and  (19)  is  the  solution  of  the  model  (1),  (2),  (3). 

Vibration  of  a Rectangular  Membrane 

Find  the  vibrations  of  a rectangular  membrane  of  sides  a = 4 ft  and  b = 2 ft  (Fig.  305)  if  the  tension  is  12.5  lb/ft, 
the  density  is  2.5  slugs/ft2  (as  for  light  rubber),  the  initial  velocity  is  0,  and  the  initial  displacement  is 

(20)  f(x,  y)  = 0. 1 (4x  - x2)(2y  - y 2)  ft. 


Membrane 


Initial  displacement 


Fig.  305.  Example  2 


Solution,  c2  = T/p  = 12.5/2.5  = 5 [ft2/ sec2] . Also  B^n  = 0 from  (19).  From  (18)  and  (20), 


^ f2  f4 

4 ,,  0 mux  mry 

l> mn  ~ 0.1(4*  — xz)(2 y — y ) sin sin  — — dx  dy 

4 • 2 J0  J0  ■ • 4 2 


1 


= — (4*  — x ) sin dx  (2 y — y ) sin dy. 


mry 


Two  integrations  by  parts  give  for  the  first  integral  on  the  right 


128  m 256 

[1  - (-l)m]  = 


(m  odd) 


and  for  the  second  integral 


16 


;[1  - (-If]  = 


32 


(n  odd). 
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For  even  m or  ti  we  get  0.  Together  with  the  factor  1/20  we  thus  have  Bmn  = 0 if  m or  n is  even  and 


B 


ran 


256  ■ 32 
20m  Vir6 


0.426050 


(m  and  n both  odd). 


From  this,  (9),  and  (14)  we  obtain  the  answer 


u(x,y,  t)  = 0.426050  2 2 3 3 cos 

m,n  odd  HI  fl 


'\/n 


Air 


mirx  . niry 

t sin sin  — — 

4 2 


/ V5ttV5  . t tx  . vy  i V5ttV37  7 TX  . 37ry 

(21)  = 0.426050  cos 1 sin  — sin 1 cos t sin  — sin 

V 4 A 2 21  A 42 

1 V577VT3  . 377.1'  . 77 1 V577V45  _ 37 tx  . 377 y 

H cos 1 sin sin 1 cos 1 sin sin 1- 

27  4 4 2 729  4 4 2 


To  discuss  this  solution,  we  note  that  the  first  term  is  very  similar  to  the  initial  shape  of  the  membrane,  has  no 
nodal  lines,  and  is  by  far  the  dominating  term  because  the  coefficients  of  the  next  terms  are  much  smaller.  The 
second  term  has  two  horizontal  nodal  lines  (y  = §,  f),  the  third  term  two  vertical  ones  (x  = §,  §),  the  fourth 
term  two  horizontal  and  two  vertical  ones,  and  so  on. 


FR~OB£=E^F=SET-^l-2=9 


1.  Frequency.  How  does  the  frequency  of  the  eigen- 
functions of  the  rectangular  membrane  change  (a)  If 
we  double  the  tension?  (b)  If  we  take  a membrane  of 
half  the  density  of  the  original  one?  (c)  If  we  double 
the  sides  of  the  membrane?  Give  reasons. 

2.  Assumptions.  Which  part  of  Assumption  2 cannot  be 
satisfied  exactly?  Why  did  we  also  assume  that  the 
angles  of  inclination  are  small? 

3.  Determine  and  sketch  the  nodal  lines  of  the  square 
membrane  for  m = 1,  2,  3,  4 and  n = 1,  2,  3,  4. 


4-8 


DOUBLE  FOURIER  SERIES 


Represent /(x,  y)  by  a series  (15),  where 


4.  /( x,y)  =1,  a = b = 1 

5.  f{x,  y)  =y,  a = b = 1 

6.  f(x,  y)  = x,  a = b = 1 

7.  f(x,  y)  = xy,  a and  b arbitrary 

8.  f(x,  y)  = xy(a  — x)(b  — y),  a and  b arbitrary 

9.  CAS  PROJECT.  Double  Fourier  Series,  (a)  Write 
a program  that  gives  and  graphs  partial  sums  of  (15). 
Apply  it  to  Probs.  5 and  6.  Do  the  graphs  show  that 
those  partial  sums  satisfy  the  boundary  condition  (3a)? 
Explain  why.  Why  is  the  convergence  rapid? 

(b)  Do  the  tasks  in  (a)  for  Prob.  4.  Graph  a portion, 

say,  0 < x < 0 < y < g,  of  several  partial  sums  on 

common  axes,  so  that  you  can  see  how  they  differ.  (See 
Fig.  306.) 

(c)  Do  the  tasks  in  (b)  for  functions  of  your  choice. 


Fig.  306  Partial  sums  S22  and  S1010 
in  CAS  Project  9b 


10.  CAS  EXPERIMENT.  Quadruples  of  Fmn.  Write  a 
program  that  gives  you  four  numerically  equal  \mn  in 
Example  1,  so  that  four  different  Fmn  correspond  to  it. 
Sketch  the  nodal  lines  of  F18,  Fsi.  E47,  F74  in  Example 
1 and  similarly  for  further  Fmn  that  you  will  find. 
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SQUARE  MEMBRANE 


Find  the  deflection  u (x,  y,  t)  of  the  square  membrane  of  side 
7 t and  r2  = 1 for  initial  velocity  0 and  initial  deflection 


11.  0.1  sin  2x  sin  4y 


12.  0.01  sin  x sin  y 

13.  0.1xy(7t  — x)(tt  — y) 
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14-19 


RECTANGULAR  MEMBRANE 


14.  Verify  the  discussion  of  (21)  in  Example  2. 

15.  Do  Prob.  3 for  the  membrane  with  a = 4 and  b = 2. 

16.  Verify  Bmn  in  Example  2 by  integration  by  parts. 

17.  Find  eigenvalues  of  the  rectangular  membrane  of  sides 
fi  = 2 and  b = 1 to  which  there  correspond  two  or 
more  different  (independent)  eigenfunctions. 

18.  Minimum  property.  Show  that  among  all  rectangular 
membranes  of  the  same  area  A = ab  and  the  same  c 
the  square  membrane  is  that  for  which  Un  [see  (10)] 
has  the  lowest  frequency. 


19.  Deflection.  Find  the  deflection  of  the  membrane  of 
sides  a and  b with  c2  = 1 for  the  initial  deflection 

677-jc  27 rv 

f(x,  y)  = sin  a sin  ' and  initial  velocity  0. 


20.  Forced  vibrations.  Show  that  forced  vibrations  of  a 
membrane  are  modeled  by  the  PDE  u tt  = c2V2u  + P/p, 
where  P(x,  y,  t)  is  the  external  force  per  unit  area  acting 
perpendicular  to  the  xy-plane. 


12.1  Laplacian  in  Polar  Coordinates. 
Circular  Membrane. 
Fourier-Bessel  Series 


It  is  a general  principle  in  boundary  value  problems  for  PDEs  to  choose  coordinates  that 
make  the  formula  for  the  boundary  as  simple  as  possible.  Here  polar  coordinates  are  used 
for  this  purpose  as  follows.  Since  we  want  to  discuss  circular  membranes  (drumheads), 
we  first  transform  the  Laplacian  in  the  wave  equation  (1),  Sec.  12.9, 

(1)  Uft  C V U C (UXX  4“  Uyy ) 

(subscripts  denoting  partial  derivatives)  into  polar  coordinates  r,  9 defined  by  x = r cos  6, 
y = rsin  0\  thus. 


r = V x2  + y 2,  tan  9 = 


By  the  chain  rule  (Sec.  9.6)  we  obtain 

ux  urrx  I it  q9 x . 

Differentiating  once  more  with  respect  to  x and  using  the  product  rule  and  then  again  the 
chain  rule  gives 


UXX  ( U "f  ( U()9x)x 

(2)  (iir)xrx  ttTrXx  “f  fi(/)x0x  “P  UqQxx 

( urrrx  4*  ur09x)rx  + uTrxx  4“  ( Uyrrx  4-  u ni)9x)9x  4~  Uq9xx. 
Also,  by  differentiation  of  r and  9 we  find 


+ y 


ox  - 


1 + (y/x) 


2' 


r 
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Fig.  307.  Circular 
membrane 


Differentiating  these  two  formulas  again,  we  obtain 


rxx 


We  substitute  all  these  expressions  into  (2).  Assuming  continuity  of  the  first  and  second 
partial  derivatives,  we  have  urff  = uBr,  and  by  simplifying, 


(3) 


_x_  _ ,y_  ,y_  , 

*xx  o Urr  ^ o ' 4 ^00  ' o ^ w d • 

r r r r r 


In  a similar  fashion  it  follows  that 

2 2 2 

— y , ■*  , * ixy 

(4)  Myy  2 UTr  ' 2 “ UrQ  ~T  ~ UqQ  + ~ Ur  2 ~ U.0. 

r r r r r 


By  adding  (3)  and  (4)  we  see  that  the  Laplacian  of  u in  polar  coordinates  is 
(5)  V2m  = 


-,2 

d U 


dr' 


1 du  1 d u 

H 1 p — p. 

r dr  rz  dOz 


Circular  Membrane 

Circular  membranes  are  important  parts  of  drums,  pumps,  microphones,  telephones,  and 
other  devices.  This  accounts  for  their  great  importance  in  engineering.  Whenever  a circular 
membrane  is  plane  and  its  material  is  elastic,  but  offers  no  resistance  to  bending  (this 
excludes  thin  metallic  membranes!),  its  vibrations  are  modeled  by  the  two-dimensional 
wave  equation  in  polar  coordinates  obtained  from  (1)  with  V2«  given  by  (5),  that  is. 


(6) 


d2u  2 ( d2u  Idu  J_d\\  2 _T 

dt2  Ur2  + r dr  + r2  dO2 ) C ~ P' 


We  shall  consider  a membrane  of  radius  R (Fig.  307)  and  determine  solutions  u(r,  t) 
that  are  radially  symmetric.  (Solutions  also  depending  on  the  angle  6 will  be  discussed  in 
the  problem  set.)  Then  ugg  = 0 in  (6)  and  the  model  of  the  problem  (the  analog  of  (1), 
(2),  (3)  in  Sec.  12.9)  is 


(7) 

d2 ll  2 ( d2 11  1 du  \ 

dt2  ° \5r2  r dr ) 

(8) 

u (R,  t)  = 0 for  all  f g 0 

(9a) 

u(r,  0)  =f(r) 

(9b) 

ut(r,  0)  = g(r). 

Here  (8)  means  that  the  membrane  is  fixed  along  the  boundary  circle  r = R.  The  initial 
deflection  f(r)  and  the  initial  velocity  g(r)  depend  only  on  r,  not  on  0,  so  that  we  can 
expect  radially  symmetric  solutions  u(r,  t). 
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Step  1.  Two  ODEs  From  the  Wave  Equation  (7). 

Bessel’s  Equation 

Using  the  method  of  separation  of  variables,  we  first  determine  solutions  u(r,  t ) = 
W(r)G(t).  (We  write  W,  not  F because  W depends  on  r,  whereas  F,  used  before,  depended 
on  x.)  Substituting  u = WG  and  its  derivatives  into  (7)  and  dividing  the  result  by  c2WG, 
we  get 


G 

c G 


W 


w"  + 1 w' 


where  dots  denote  derivatives  with  respect  to  t and  primes  denote  derivatives  with  respect 
to  r.  The  expressions  on  both  sides  must  equal  a constant.  This  constant  must  be  negative, 
say,  —k2,  in  order  to  obtain  solutions  that  satisfy  the  boundary  condition  without  being 
identically  zero.  Thus, 


G 

2/“- 
c G 


W 


W"  + - W'  ) = —k2. 


This  gives  the  two  linear  ODEs 


(10) 


G + A2G  = 0 


where  A = ck 


and 


(ID 


W 


+ ~ W'  + k2W  = 0. 
r 


We  can  reduce  (11)  to  Bessel’s  equation  (Sec.  5.4)  if  we  set  v = kr.  Then  1/r  = k/s  and, 
retaining  the  notation  W for  simplicity,  we  obtain  by  the  chain  rule 


dW  dWds  dW  dt2W 

W = ——  = = k and  W = 


dr  ds  dr  ds 


ds 


k2. 


By  substituting  this  into  (11)  and  omitting  the  common  factor  kz  we  have 


(12) 


dzW  . 1 dW 


+ + W = 0. 


dsz  s ds 


This  is  Bessel’s  equation  (1),  Sec.  5.4,  with  parameter  v = 0. 

Step  2.  Satisfying  the  Boundary  Condition  (8) 

Solutions  of  (12)  are  the  Bessel  functions  J0  and  Y0  of  the  first  and  second  kind  (see  Secs. 
5.4,  5.5).  But  hi  becomes  infinite  at  0,  so  that  we  cannot  use  it  because  the  deflection  of 
the  membrane  must  always  remain  finite.  This  leaves  us  with 


(13) 


W(r)  = J0(s)  = J0{kr ) 


(s  = kr). 
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On  the  boundary  r = R we  get  W (R)  = J0  ( kR ) = 0 from  (8)  (because  G = 0 would  imply 
u = 0).  We  can  satisfy  this  condition  because  J0  has  (infinitely  many)  positive  zeros, 
s = aq,  a2,  ' ' ' (see  Fig-  308),  with  numerical  values 

oq  = 2.4048,  a2  = 5.5201,  a3  = 8.6537,  a4  = 11.7915,  a5  = 14.9309 

and  so  on.  (For  further  values,  consult  your  CAS  or  Ref.  [GenRefl]  in  App.  1.)  These 
zeros  are  slightly  irregularly  spaced,  as  we  see.  Equation  (13)  now  implies 

am 

(14)  kR  = am  thus  k = km  = m = 1,2,  • • • . 

Hence  the  functions 

(15)  Wm(r ) = J0(kmr)  = J0  rj,  m=  ]’2’ 

are  solutions  of  (11)  that  are  zero  on  the  boundary  circle  r = R. 

Eigenfunctions  and  Eigenvalues.  For  Wm  in  (15),  a corresponding  general  solution  of 
(10)  with  A = Am  = ckm  = cam/R  is 

^’rnit)  A m cos  A mt  + R)ri  sin  kmt. 


Hence  the  functions 

(16)  nm(r,  i)  Wrn( r)Gm(t)  (Amcos  A mt  T /im  s i n A,fnt)J()( kmr) 

with  m = 1,  2,  ■ • • are  solutions  of  the  wave  equation  (7)  satisfying  the  boundary  condition 
(8).  These  are  the  eigenfunctions  of  our  problem.  The  corresponding  eigenvalues  are  Am. 

The  vibration  of  the  membrane  corresponding  to  um  is  called  the  ;nth  normal  mode; 
it  has  the  frequency  Am/277 ■ cycles  per  unit  time.  Since  the  zeros  of  the  Bessel  function 
Jo  are  not  regularly  spaced  on  the  axis  (in  contrast  to  the  zeros  of  the  sine  functions 
appearing  in  the  case  of  the  vibrating  string),  the  sound  of  a drum  is  entirely  different 
from  that  of  a violin.  The  forms  of  the  normal  modes  can  easily  be  obtained  from  Fig.  308 
and  are  shown  in  Fig.  309.  For  m = 1,  all  the  points  of  the  membrane  move  up  (or  down) 
at  the  same  time.  For  in  = 2,  the  situation  is  as  follows.  The  function  W2(r)  = Jo(a2r/R) 
is  zero  for  a2r/R  = aq,  thus  r = ix\Rj cx2.  The  circle  r = ot\R/a2  is,  therefore,  nodal  line, 
and  when  at  some  instant  the  central  part  of  the  membrane  moves  up,  the  outer  part 
( r > a\Ri a2)  moves  down,  and  conversely.  The  solution  «m(r,  f)  has  m — 1 nodal  lines, 
which  are  circles  (Fig.  309). 


Fig.  308.  Bessel  function  J0[s) 
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m = 1 


m = 3 


Fig.  309.  Normal  modes  of  the  circular  membrane  in  the  case  of  vibrations 
independent  of  the  angle 


Step  3.  Solution  of  the  Entire  Problem 

To  obtain  a solution  u(r , t)  that  also  satisfies  the  initial  conditions  (9),  we  may  proceed 
as  in  the  case  of  the  string.  That  is,  we  consider  the  series 


oo  oo  / a 

(17)  u (r,  t ) Wm(r)Gm(t)  2)  (Am  cos  Amf  T sin  Amt)/o  ( ^ f 

m=  1 m=l 


(leaving  aside  the  problems  of  convergence  and  uniqueness).  Setting  t = 0 and  using  (9a), 
we  obtain 


(18) 


w (r,  0) 


oo 

V.  2lm/o 

m=l 


/to- 


Thus  for  the  series  (17)  to  satisfy  the  condition  (9a),  the  constants  Am  must  be  the 
coefficients  of  the  Fourier-Bessel  series  (18)  that  represents /(r)  in  terms  of  Jo  (anlr/ Ry, 
that  is  [see  (9)  in  Sec.  1 1.6  with  n = 0,  otQ  m = am,  and  x = r ], 


(19) 


2 

R2j\(am) 


rR 


rf(r)J0 


dr 


(m  = 1,2,  •■■). 


Differentiability  of  fir)  in  the  interval  0 g r g S is  sufficient  for  the  existence  of  the 
development  (18);  see  Ref.  [A13],  The  coefficients  Bm  in  (17)  can  be  determined  from 
(9b)  in  a similar  fashion.  Numeric  values  of  Am  and  Bm  may  be  obtained  from  a CAS  or 
by  a numeric  integration  method,  using  tables  of  J0  and  J\.  However,  numeric  integration 
can  sometimes  be  avoided,  as  the  following  example  shows. 
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EXAMPLE 


Vibrations  of  a Circular  Membrane 

Find  the  vibrations  of  a circular  drumhead  of  radius  1 ft  and  density  2 slugs/ft2  if  the  tension  is  8 lb/ft,  the 
initial  velocity  is  0,  and  the  initial  displacement  is 

m = 1 - r2  [ft]. 

Solution,  c2  = T/p  = 1 = 4 [ft2/  sec2].  Also  Bm  = 0,  since  the  initial  velocity  is  0.  From  (10)  in  Sec.  1 1.6, 
since  R = 1,  we  obtain 

2 r 1 

Am  = — r(l  - rVoM* 

j 1 (“m)  ^0 

472(am) 

8 

«»„,/ 1 (rrm) 

where  the  last  equality  follows  from  (21c),  Sec.  5.4,  with  v = 1,  that  is, 

2 2 
J‘2  (A.„J  — — ,/i  (&m)  — ./))  ( <rm)  *—  ./-]  (Vrm). 

Table  9.5  on  p.  409  of  [GenRefl]  gives  am  and  From  this  we  get  J\{ocm)  = by  (21b),  Sec.  5.4, 

with  v = 0,  and  compute  the  coefficients  Am: 


m 

Oim 

*^2  (^m) 

A 

i 

2.40483 

0.51915 

0.43176 

1.10801 

2 

5.52008 

-0.34026 

-0.12328 

-0.13978 

3 

8.65373 

0.27145 

0.06274 

0.04548 

4 

11.79153 

-0.23246 

-0.03943 

-0.02099 

5 

14.93092 

0.20655 

0.02767 

0.01164 

6 

18.07106 

-0.18773 

-0.02078 

-0.00722 

7 

21.21164 

0.17327 

0.01634 

0.00484 

8 

24.35247 

-0.16170 

-0.01328 

-0.00343 

9 

27.49348 

0.15218 

0.01107 

0.00253 

10 

30.63461 

-0.14417 

-0.00941 

-0.00193 

Thus 


f(r)  = 1.1087o(2.4048r)  - 0.140/o(5.5201r)  + 0.0457o(8.6537r) 

We  see  that  the  coefficients  decrease  relatively  slowly.  The  sum  of  the  explicitly  given  coefficients  in  the  table 
is  0.99915.  The  sum  of  all  the  coefficients  should  be  1.  (Why?)  Hence  by  the  Leibniz  test  in  App.  A3. 3 the 
partial  sum  of  those  terms  gives  about  three  correct  decimals  of  the  amplitude  /(r). 

Since 


Am  ckm  C(xm/R  1cxm, 

from  (17)  we  thus  obtain  the  solution  (with  r measured  in  feet  and  t in  seconds) 

u(r,  t ) = 1.108/o(2.4048r)  cos  4.8097?  - 0.1407o(5.5201r)  cos  11.0402?  + 0.0457o(8.6537r)  cos  17.3075? 

In  Fig.  309,  m = 1 gives  an  idea  of  the  motion  of  the  first  term  of  our  series,  m = 2 of  the  second  term,  and 
m = 3 of  the  third  term,  so  that  we  can  “see”  our  result  about  as  well  as  for  a violin  string  in  Sec.  12.3. 
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FRQBgE^M=S^E-T— 1-2^1-S 


1-3 


RADIAL  SYMMETRY 


1.  Why  did  we  introduce  polar  coordinates  in  this 
section? 


2.  Radial  symmetry  reduces  (5)  to  V2u  = + ur/r. 

Derive  this  directly  from  V2i;  = uxx  + uyy.  Show 
that  the  only  solution  of  V2u  = 0 depending  only  on 
r — V*2  + y2  is  u — a In  r + b with  arbitrary  con- 
stants a and  b. 


3.  Alternative  form  of  (5).  Show  that  (5)  can  be  written 
V2«  = (rur)r/r  + ugg/r2,  a form  that  is  often  practical. 


with  arbitrary  A0  and 


An 


7 mRn 


f(8)  cos  nO  dO, 


B„  — 


7 TnRn 


f(8)  sin  n8  dd. 


(e)  Compatibility  condition.  Show  that  (9),  Sec.  10.4, 
imposes  on  f(6 ) in  (d)  the  “compatibility  condition  ” 


m dd  = 0. 


BOUNDARY  VALUE  PROBLEMS.  SERIES 

4.  TEAM  PROJECT.  Series  for  Dirichlet  and  Neumann 
Problems 

(a)  Show  that  un  = rn cos  nd,  un  = rn  sin  nd,  n — 0, 
1,  - - - , are  solutions  of  Laplace’s  equation  V2zz  = 0 
with  V2«  given  by  (5).  (What  would  un  be  in  Cartesian 
coordinates?  Experiment  with  small  n.) 

(b)  Dirichlet  problem  (See  Sec.  12.6)  Assuming  that 
termwise  differentiation  is  permissible,  show  that  a 
solution  of  the  Laplace  equation  in  the  disk  r < R 
satisfying  the  boundary  condition  u(R,  8)  = f{8)  {R  and 
/given)  is 


(20) 


u(r,  8)  = a0  + 2 

n= 1 


n 

cos  nd 


n 

sin  nd 


where  an,  bn  are  the  Fourier  coefficients  of  / (see 
Sec.  11.1). 

(c)  Dirichlet  problem.  Solve  the  Dirichlet  problem 
using  (20)  if  R = 1 and  the  boundary  values  are 
u(8)  = —100  volts  if  — 7T  < 8 < 0,  u(8)  = 100  volts 
if  0 < 8 < 77.  (Sketch  this  disk,  indicate  the  boundary 
values.) 

(d)  Neumann  problem.  Show  that  the  solution  of  the 
Neumann  problem  V2z<  = 0 if  r < R,  uN(R,  8)  = f{8) 
(where  uN  = du/dN  is  the  directional  derivative  in  the 
direction  of  the  outer  normal)  is 


(f)  Neumann  problem.  Solve  V2zz  = 0 in  the  annulus 
1 < r < 2 if  «r(  1,  8)  = sin  8,  ur( 2,  8)  = 0. 


5-8 


ELECTROSTATIC  POTENTIAL. 
STEADY-STATE  HEAT  PROBLEMS 


The  electrostatic  potential  satisfies  Laplace’s  equation 
V2zz  = 0 in  any  region  free  of  charges.  Also  the  heat 
equation  ut  = c2V2zz  (Sec.  12.5)  reduces  to  Laplace’s 
equation  if  the  temperature  u is  time-independent 
(“steady-state  case”).  Using  (20),  find  the  potential 
(equivalently:  the  steady-state  temperature)  in  the  disk 
r < 1 if  the  boundary  values  are  (sketch  them,  to  see  what 
is  going  on). 

5.  u(  1,  8)  = 220  if  — g7 t < 8 < 1 7T  and  0 otherwise 

6.  u(l,8)  = 400  cos3  8 

7.  u(l,  8)  = 11O|0|  if  —77  < 8 < TT 

8.  u(  1,  8)  = 8 if  — g7r  < 8 < \tt  and  0 otherwise 

9.  CAS  EXPERIMENT.  Equipotential  Lines.  Guess 
what  the  equipotential  lines  u(r , 8)  = const  in  Probs.  5 
and  7 may  look  like.  Then  graph  some  of  them,  using 
partial  sums  of  the  series. 

10.  Semidisk.  Find  the  electrostatic  potential  in  the  semi- 
disk r < 1,  0 < 8 < 77  which  equals  1100(77  — 8) 
on  the  semicircle  r = 1 and  0 on  the  segment 
— 1 < x < 1. 

11.  Semidisk.  Find  the  steady-state  temperature  in  a 
semicircular  thin  plate  r < a,  0 < 8 < tt  with  the 
semicircle  r = a kept  at  constant  temperature  u0  and 
the  segment  —a  < x < a at  0. 


CIRCULAR  MEMBRANE 


w(l  8)  — A0  + ^ rn(An  cos  n8  + Bn  sin  nd) 

n= 1 


12.  CAS  PROJECT.  Normal  Modes,  (a)  Graph  the 
normal  modes  m4,  u5,  u6  as  in  Fig.  306. 
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(b)  Write  a program  for  calculating  the  Am’s  in 
Example  1 and  extend  the  table  to  m = 15.  Verify 
numerically  that  am  ~ (m  — 5)7 r and  compute  the 
error  for  m = 1,  • ■ ■ , 10. 

(c)  Graph  the  initial  deflection /(r)  in  Example  1 as 
well  as  the  first  three  partial  sums  of  the  series. 
Comment  on  accuracy. 

(d)  Compute  the  radii  of  the  nodal  lines  of  u2,  u3,  w4 
when  R = 1 . How  do  these  values  compare  to  those  of 
the  nodes  of  the  vibrating  string  of  length  1 ? Can  you 
establish  any  empirical  laws  by  experimentation  with 
further  uml 

13.  Frequency.  What  happens  to  the  frequency  of  an 
eigenfunction  of  a drum  if  you  double  the  tension? 

14.  Size  of  a drum.  A small  drum  should  have  a higher 
fundamental  frequency  than  a large  one,  tension  and 
density  being  the  same.  How  does  this  follow  from  our 
formulas? 

15.  Tension.  Find  a formula  for  the  tension  required 
to  produce  a desired  fundamental  frequency  of  a 
drum. 

16.  Why  is  A1  + A2  + ■ ■ • = 1 in  Example  1?  Compute 
the  first  few  partial  sums  until  you  get  3-digit 
accuracy.  What  does  this  problem  mean  in  the  field 
of  music? 

17.  Nodal  lines.  Is  it  possible  that  for  fixed  c and  R two 
or  more  um  [see  (16)]  with  different  nodal  lines 
correspond  to  the  same  eigenvalue?  (Give  a reason.) 

18.  Nonzero  initial  velocity  is  more  of  theoretical  interest 
because  it  is  difficult  to  obtain  experimentally.  Show 
that  for  (17)  to  satisfy  (9b)  we  must  have 


(21) 


Bm,  Vi. 


rg(r)J0(amr/R)  dr 


where  Km  = 2 /(ca  mR)J  1 (a  m)- 


(24)  F„  + - Fr  + —2  Fm  + k2F  - 0. 

r r 

Show  that  the  PDE  can  now  be  separated  by  sub- 
stituting F = W(r)Q(6),  giving 

(25)  Q"  + n2Q  = 0, 

(26)  r2W"  + rW'  + ( k2r 2 - n2)W  = 0. 

20  Periodicity.  Show  that  Q(Q)  must  be  periodic  with 
period  2tt  and,  therefore,  n — 0,  1,  2,  ■ • • in  (25)  and 

(26) .  Show  that  this  yields  the  solutions  Qn  = cos  n6 , 
Qn  = sin  nQ,  Wn  = Jn(kr),  n = 0,  1,  ■ ■ ■ . 

21.  Boundary  condition.  Show  that  the  boundary  condition 

(27)  u(R,0,t)  = 0 

leads  to  k = kmn  — amn/R,  where  5 = anm  is  the  mth 
positive  zero  of  Jn(s). 

22.  Solutions  depending  on  both  r and  0.  Show  that 
solutions  of  (22)  satisfying  (27)  are  (see  Fig.  310) 

Unm  = (A  nm,  cos  cknrnt  + Bnrn  sin  cknmt) 

x Jn(knmr)  cos  n0 

<28>  * * 

^ nm  (^nm  cos  cknrnt  H-  Bnm  sin  cknrnt ) 

X (knmf)  sin  nQ 


Fig.  310.  Nodal  lines  of  some  of  the  solutions  (28) 


VIBRATIONS  OF  A CIRCULAR  MEMBRANE 
DEPENDING  ON  BOTH  r AND  d 

19.  (Separations)  Show  that  substitution  of  u = F(r,6)G(t) 
into  the  wave  equation  (6),  that  is, 

J I 1 \ 

(22)  Uft  C I Urr  "i"  ^ Ur  E ^ Uqq  J , 

gives  an  ODE  and  a PDE 

(23)  G + A2G  = 0, 


23.  Initial  condition.  Show  that  ut(r,  0,  0)  = 0 gives 
Bnm  = 0,  Bnm  = 0 in  (28). 

24.  Show  that  Ugm  = 0 and  u0m  is  identical  with  (16)  in 
this  section. 

25.  Semicircular  membrane.  Show  that  represents  the 
fundamental  mode  of  a semicircular  membrane  and 
find  the  corresponding  frequency  when  c2  = 1 and 
R = 1. 


where  A = ck. 
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12.  Laplaces  Equation  in  Cylindrical  and 
Spherical  Coordinates.  Potential 

One  of  the  most  important  PDEs  in  physics  and  engineering  applications  is  Laplace’s 
equation,  given  by 

(1)  V ll  Uxx  ^yy  "f"  Mzz  0- 

Here,  x,  y,  z are  Cartesian  coordinates  in  space  (Fig.  167  in  Sec.  9.1),  uxx  = d2u/dx2,  etc. 
The  expression  V2«  is  called  the  Laplacian  of  w.  The  theory  of  the  solutions  of  (1)  is 
called  potential  theory.  Solutions  of  (1)  that  have  continuous  second  partial  derivatives 
are  known  as  harmonic  functions. 

Laplace’s  equation  occurs  mainly  in  gravitation,  electrostatics  (see  Theorem  3,  Sec.  9.7), 
steady-state  heat  flow  (Sec.  12.5),  and  fluid  flow  (to  be  discussed  in  Sec.  18.4). 

Recall  from  Sec.  9.7  that  the  gravitational  potential  u(x,  y,  z)  at  a point  (x,  y,  z)  resulting 
from  a single  mass  located  at  a point  (X,  Y,  Z)  is 


(2) 


u (x,  y,  z ) 


c 

r 


c 

V(x  - X)2  + (y  - Y)2  + (z~  Z)2 


(r>  0) 


and  u satisfies  (1).  Similarly,  if  mass  is  distributed  in  a region  T in  space  with  density 
p (X,  Y,  Z),  its  potential  at  a point  (x,  y,  z)  not  occupied  by  mass  is 


(3) 


u (x,  y,  z)  = k 


-ff  p(X,  Y,Z) 
r 

J J J 


T 


dXdYdZ. 


It  satisfies  (1)  because  V2  ( I jr)  = 0 (Sec.  9.7)  and  p is  not  a function  of  x,  y,  z. 

Practical  problems  involving  Laplace’s  equation  are  boundary  value  problems  in  a 
region  T in  space  with  boundary  surface  S.  Such  problems  can  be  grouped  into  three  types 
(see  also  Sec.  12.6  for  the  two-dimensional  case): 


(I)  First  boundary  value  problem  or  Dirichlet  problem  if  u is  prescribed  on  S. 

(II)  Second  boundary  value  problem  or  Neumann  problem  if  the  normal 
derivative  un  = du/dn  is  prescribed  on  .S'. 

(Ill)  Third  or  mixed  boundary  value  problem  or  Robin  problem  if  u is  prescribed 
on  a portion  of  S and  un  on  the  remaining  portion  of  .S. 

In  general,  when  we  want  to  solve  a boundary  value  problem,  we  have  to  first  select 
the  appropriate  coordinates  in  which  the  boundary  surface  S has  a simple  representation. 
Here  are  some  examples  followed  by  some  applications. 


Laplacian  in  Cylindrical  Coordinates 

The  first  step  in  solving  a boundary  value  problem  is  generally  the  introduction  of 
coordinates  in  which  the  boundary  surface  S has  a simple  representation.  Cylindrical 
symmetry  (a  cylinder  as  a region  T)  calls  for  cylindrical  coordinates  r,  6,  z related  to 

x,  y,  z by 


(4) 


x = r cos  0, 


y = r sin  6, 


Z = Z 


(Fig.  311). 
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Fig.  311.  Cylindrical  coordinates 
(rSO,OSDS  27t) 


Fig.  312.  Spherical  coordinates 
(rsO,OS0S2ir,OSf  Stt) 


For  these  we  get  V2m  immediately  by  adding  uzz  to  (5)  in  Sec.  12.10;  thus, 
(5)  V2m  = 


2 d2M  1 du  1 d2M  d2M 


+ 


+ 2 -,a2  + -,2  ■ 


dr  r dr  r dO  dz 


Laplacian  in  Spherical  Coordinates 

Spherical  symmetry  (a  ball  as  region  T bounded  by  a sphere  S ) requires  spherical 
coordinates  r,  6 , </;  related  to  x,  y,  z by 

(6)  x = rcos  0 sin  <fi,  y = r sin  0 sin  ip , z = r cos  (Fig.  312). 

Using  the  chain  rule  (as  in  Sec.  12.10),  we  obtain  V2m  in  spherical  coordinates 


(7) 


d2u  2 du  1 d2u  cot  </>  du 


1 


i2 

d U 


V U — h “ h c\  n “h  c\  “h  cy  cy  ey  . 

dr 2 r dr  r d(j)  rz  dip  r sin  (p  d0z 


We  leave  the  details  as  an  exercise.  It  is  sometimes  practical  to  write  (7)  in  the  form 

1 


(7') 


V w = 


— (r2  — ) + — t 

dr  V dr  / sin  <p  dip  \ 


9 ( ■ , dit  , 

sin  ip  — | + 
dip 


1 d2« 

sin2  ip  d62 


Remark  on  Notation.  Equation  (6)  is  used  in  calculus  and  extends  the  familiar  notation 
for  polar  coordinates.  Unfortunately,  some  books  use  6 and  ip  interchanged,  an  extension 
of  the  notation  x = r cos  ip,  y = r sin  ip  for  polar  coordinates  (used  in  some  European 
countries). 

Boundary  Value  Problem  in  Spherical  Coordinates 

We  shall  solve  the  following  Dirichlet  problem  in  spherical  coordinates: 


(8) 

(9) 


V 2 
r 


— ( r2  — | + 
dr  V dr 


1 d . , du 

sin  ip  — 

sin  ip  dip  \ dip  J 


= 0. 


u(R,iP)  =f(<P) 
lim  u (r,  ip)  = 0. 


(10) 
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The  PDE  (8)  follows  from  (7)  or  (7')  by  assuming  that  the  solution  u will  not  depend  on 
6 because  the  Dirichlet  condition  (9)  is  independent  of  6.  This  may  be  an  electrostatic 
potential  (or  a temperature)  /(</>)  at  which  the  sphere  .S':  r = R is  kept.  Condition  (10) 
means  that  the  potential  at  infinity  will  be  zero. 

Separating  Variables  by  substituting  u(r.  <j>)  = G(r)H(4>)  into  (8).  Multiplying  (8)  by 

o 

r , making  the  substitution  and  then  dividing  by  GH,  we  obtain 

L±(r^)= — Tsin  <p  Y 

G dr  \ dr  J H sin  <fr  d <f>  V d<f>  J 


By  the  usual  argument  both  sides  must  be  equal  to  a constant  k.  Thus  we  get  the  two 
ODEs 


(ID 


and 


lA(,^_G\k 

G dr  V dr  J 


or 


2 d2G 

r 7T 


„ dG 

+ 2 r 

dr 


kG 


(12) 


1 d 
sin  <f>  d(f> 


+ kH  = 0. 


The  solutions  of  (11)  will  take  a simple  form  if  we  set  k = n (n  + 1).  Then,  writing 
G'  = dG/dr,  etc.,  we  obtain 

(13)  r2G"  + 2rG'  - n(n  + 1)  G = 0. 


This  is  an  Euler-Cauchy  equation.  From  Sec.  2.5  we  know  that  it  has  solutions  G = ra. 
Substituting  this  and  dropping  the  common  factor  ra  gives 


a (a  — 1)  + 2a  — n(n  + 1)  = 0.  The  roots  are  a = n and  —n  — 1. 


Hence  solutions  are 


(14) 


Gn(r)  = rr 


and 


G*(r ) = 


n+ 1 


We  now  solve  (12).  Setting  cos  cj)  = w,  we  have  sin2  </>=!—  w2  and 


d d dw  . d 

——  = — — = — sin  cp  — . 

dcp  dw  dcp  dw 


Consequently,  (12)  with  k = n(n  + 1)  takes  the  form 


(15) 


dw 


n 2\  dH 
(1  - w ) — 
dw 


+ n(n  + 1 )H  = 0. 


This  is  Legendre’s  equation  (see  Sec.  5.3),  written  out 
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(15')  (1  — w2)——i  — 2 w——  + n (n  + 1 )H  = 0. 

dw  dw 

For  integer  n = 0,  1,  • ■ ■ the  Legendre  polynomials 

H = Pn  (vv)  = Pn  (cos  <f>)  n = 0,  1,  • ■ ■ , 

are  solutions  of  Legendre’s  equation  (15).  We  thus  obtain  the  following  two  sequences 
of  solution  u = GH  of  Laplace’s  equation  (8),  with  constant  An  and  Bn,  where 
n = 0,1, 

Bn 

(16)  (a)  un(r,  <p)  = AnrnPn{ cos  <p),  (b)  u*  (r,  </>)  = ^+1  Pn( cos  $) 


Use  of  Fourier-Legendre  Series 

Interior  Problem:  Potential  Within  the  Sphere  S.  We  consider  a series  of  terms  from 
(16a), 


(17)  u(r,  <f>)  = 2 AnrnPn( cos  </>)  ( r g R). 

n= 0 


Since  S is  given  by  r = R,  for  (17)  to  satisfy  the  Dirichlet  condition  (9)  on  the  sphere  S, 
we  must  have 

oo 

(18)  u(R,  cf>)  = 2 AnRnPn( cos  <f»  = /(</>); 

n= 0 

that  is,  (18)  must  be  the  Fourier-Legendre  series  of /(</>).  From  (7)  in  Sec.  5.8  we  get 
the  coefficients 


(19*) 


AnR 


2 n + 1 
2 


l 

f(w)Pn(w)dw 

J-i 


where  f(w)  denotes/) <p)  as  a function  of  vv  = cos  <j>.  Since  dw  = —sin  </;  dxj),  and  the  limits 
of  integration  — 1 and  1 correspond  to  <f>  = 77  and  <f>  = 0,  respectively,  we  also  obtain 


(19) 


2 n + 1 
2/?n 


f(<p)Pn  (cos  4>)  sin  <p  dip, 
■^o 


n = 0,  I,-”. 


If  and  f ( <f> ) are  piecewise  continuous  on  the  interval  0 S g 77,  then  the  series 
(17)  with  coefficients  (19)  solves  our  problem  for  points  inside  the  sphere  because  it  can 
be  shown  that  under  these  continuity  assumptions  the  series  (17)  with  coefficients  (19) 
gives  the  derivatives  occurring  in  (8)  by  termwise  differentiation,  thus  justifying  our 
derivation. 
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EXAMPLE  1 


Exterior  Problem:  Potential  Outside  the  Sphere  S.  Outside  the  sphere  we  cannot  use 
the  functions  un  in  (16a)  because  they  do  not  satisfy  (10).  But  we  can  use  the  u%  in  (16b), 
which  do  satisfy  (10)  (but  could  not  be  used  inside  S;  why?).  Proceeding  as  before  leads 
to  the  solution  of  the  exterior  problem 


(20) 


u{r,  (f>)  = ^ 

n= 0 


Bn 

rn  + l 


Pn( cos  4>) 


(r^R) 


satisfying  (8),  (9),  (10),  with  coefficients 


(21) 


B, , 


2 n + 1 
2 


Rn  + 1 


f(<p)Pn( cos  4>)  sin  cf>  d(f). 


The  next  example  illustrates  all  this  for  a sphere  of  radius  1 consisting  of  two  hemispheres 
that  are  separated  by  a small  strip  of  insulating  material  along  the  equator,  so  that  these 
hemispheres  can  be  kept  at  different  potentials  (110  V and  0 V). 

Spherical  Capacitor 

Find  the  potential  inside  and  outside  a spherical  capacitor  consisting  of  two  metallic  hemispheres  of  radius  1 ft 
separated  by  a small  slit  for  reasons  of  insulation,  if  the  upper  hemisphere  is  kept  at  110  V and  the  lower  is 
grounded  (Fig.  313). 

Solution.  The  given  boundary  condition  is  (recall  Fig.  312) 


m = 


r no 
l o 


if  0 ^ (f>  < 77/2 
if  77/2  < (f)  77. 


Since  R = 1,  we  thus  obtain  from  (19) 


An 


In  + 1 
2 


ctt/2 

•110  Pn(cos  </>)  sin  (f>  dxj) 
-T) 


2 n + 1 
2 


• 110 


Pn(w)dw 


'o 


where  w = cos  </>.  Hence  Pn(cos  </>)  sin  (f>  d(f)  = — Pn(w)  dw,  we  integrate  from  1 to  0,  and  we  finally  get  rid  of 
the  minus  by  integrating  from  0 to  1.  You  can  evaluate  this  integral  by  your  CAS  or  continue  by  using  (11)  in 
Sec.  5.2,  obtaining 


(2  n — 2m)! 


A„  = 55(2«  + 1)X  (-Dm  „ 

m— o 2 m\(n  - m)\(n  - 2m)!  J0 


where  M = n/2  for  even  n and  M = (n  — l)/2  for  odd  n.  The  integral  equals  !/(;;  — 2m  + 1).  Thus 


Fig,  313.  Spherical  capacitor  in  Example  1 
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(22) 


55 (2m  + 1)  “ 

A„  = 2 (“I)” 


(2  n — 2m)! 


2n  m=o  m\(n  — m)\(n  — 2m  + 1)! 

= 1).  For  n = 

165  2! 

A\  : 

A3  = 

A3  = 1 I = , etc. 

Hence  the  potential  (17)  inside  the  sphere  is  (since  Pq  = 1) 

(23) 


= 1). 

For  n : 

= 1,2,3,  ■ 

• • we  get 

165 

2! 

165 

2 

Oil !2! 

2 ’ 

275  I 

' 4! 

21  \ 

= o, 

4 V 

.01213! 

mm; 

385  I 

' 6! 

4!  > 

385 

8 V 

.01314! 

1121217 

8 

165  385  o 

u(r,  0)  = 55  H r />i( cos  0) r /^(cos  0)  + 

2 8 


(Fig.  314) 


with  Pi,  P3,  • • • given  by  (1 17),  Sec.  5.21.  Since  R = 1,  we  see  from  (19)  and  (21)  in  this  section  that  Bn  = An 
and  (20)  thus  gives  the  potential  outside  the  sphere 


(24) 


55  165  , 385 

“(A  <t>)  = — + — - fi(cos  4>) P3(cos  4>)  + 

r 2 r2  8 r4 


Partial  sums  of  these  series  can  now  be  used  for  computing  approximate  values  of  the  inner  and  outer  potential. 
Also,  it  is  interesting  to  see  that  far  away  from  the  sphere  the  potential  is  approximately  that  of  a point  charge, 
namely,  55/r.  (Compare  with  Theorem  3 in  Sec.  9.7.) 


Fig.  314.  Partial  sums  of  the  first  4,  6,  and  11  nonzero  terms  of  (23)  for  r = R = 1 

EXAMPLE  2 Simpler  Cases.  Help  with  Problems 

The  technicalities  encountered  in  cases  that  are  similar  to  the  one  shown  in  Example  1 can  often  be  avoided. 
For  instance,  find  the  potential  inside  the  sphere  S:  r = R = 1 when  S is  kept  at  the  potential  /(</>)  = cos  20. 
(Can  you  see  the  potential  on  5?  What  is  it  at  the  North  Pole?  The  equator?  The  South  Pole?) 

Solution,  w = cos  0,  cos  20  = 2 cos2  0 — 1 = 2w2  1 = — 5 = f (§w2  — |)  — 3-  Hence  the 

potential  in  the  interior  of  the  sphere  is 

u = I r2P2(w)  - i = |r2P2(cos  <£)-£  = |r2(3  cos2  0 - 1)  - 


PROBLEM  S ET12:il 


1.  Spherical  coordinates.  Derive  (7)  from  V2m  in 
spherical  coordinates. 

2.  Cylindrical  coordinates.  Verify  (5)  by  transforming 
V2m  back  into  Cartesian  coordinates. 


3.  Sketch  Pn{ cos  d),  0 S d S 27 r,  for  n = 0,  1,  2.  (Use 
(11')  in  Sec.  5.2.) 

4.  Zero  surfaces.  Find  the  surfaces  on  which  ii\,  u2,  u3 
in  (16)  are  zero. 
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5.  CAS  PROBLEM.  Partial  Sums.  In  Example  1 in  the 
text  verify  the  values  of  A0,A1,A2,A3  and  compute 
A4,  ■ ■ • , A10.  Try  to  find  out  graphically  how  well  the 
corresponding  partial  sums  of  (23)  approximate  the 
given  boundary  function. 

6.  CAS  EXPERIMENT.  Gibbs  Phenomenon.  Study  the 
Gibbs  phenomenon  in  Example  1 (Fig.  314)  graphically. 

7.  Verify  that  un  and  u„  in  (16)  are  solutions  of  (8). 


8-15 


POTENTIALS  DEPENDING  ONLY  ON  r 


8.  Dimension  3.  Verify  that  the  potential  u = c/r,  r = 
V x2  + y2  + z2  satisfies  Laplace’s  equation  in  spherical 
coordinates. 


9.  Spherical  symmetry.  Show  that  the  only  solution 
of  Laplace’s  equation  depending  only  on  r = 
V x2  + y2  + z2  is  u = c/r  + k with  constant  c and  k. 

10.  Cylindrical  symmetry.  Show  that  the  only  solution  of 
Laplace’s  equation  depending  only  on  r = Vx2  + y2 
is  u = c In  r + k. 

11.  Verification.  Substituting  u (r)  with  r as  in  Prob.  9 into 
uxx  + uyy  + uzz  ~ 0,  verify  that  u"  + 2 u /r  = 0,  in 
agreement  with  (7). 

12.  Dirichlet  problem.  Find  the  electrostatic  potential 
between  coaxial  cylinders  of  radii  r\  = 2 cm  and 
r2  — 4 cm  kept  at  the  potentials  U\  = 220  V and 
U2  — 140  V,  respectively. 

13.  Dirichlet  problem.  Find  the  electrostatic  potential 
between  two  concentric  spheres  of  radii  = 2 cm 
and  r2  — 4 cm  kept  at  the  potentials  (?i  = 220  V and 
U2  = 140  V,  respectively.  Sketch  and  compare  the 
equipotential  lines  in  Probs.  12  and  13.  Comment. 

14.  Heat  problem.  If  the  surface  of  the  ball  r2  = 
x2  + y2  + z2  S R2  is  kept  at  temperature  zero  and  the 
initial  temperature  in  the  ball  is  f{r),  show  that  the 
temperature  u ( r , t)  in  the  ball  is  a solution  of  ut  = 
c2(nIT  + 2ur/r)  satisfying  the  conditions  u(R,t)  = 
0,  u ( r , 0)  = f(r).  Show  that  setting  v = ru  gives 
vt  = c2Vrr,  v (R,  t)  = 0,  v ( r , 0)  = rf(f).  Include  the 
condition  v (0,  t)  = 0 (which  holds  because  u must  be 
bounded  at  r = 0),  and  solve  the  resulting  problem  by 
separating  variables. 

15.  What  are  the  analogs  of  Probs.  12  and  13  in  heat 
conduction? 


16-20 


BOUNDARY  VALUE  PROBLEMS 
IN  SPHERICAL  COORDINATES  r,  0,  </> 


Find  the  potential  in  the  interior  of  the  sphere  r = R = 1 
if  the  interior  is  free  of  charges  and  the  potential  on  the 
sphere  is 

16-  /(</>)  = cos  17.  f(4>)  = 1 

18.  /(</>)  = 1 — cos2  <p  19.  f(4>)  = cos  2<p 

20.  /(</>)  = 10  cos3  (f>  — 3 cos2  <f>  — 5 cos  — 1 


21.  Point  charge.  Show  that  in  Prob.  17  the  potential  exterior 
to  the  sphere  is  the  same  as  that  of  a point  charge  at  the 
origin. 

22.  Exterior  potential.  Find  the  potentials  exterior  to  the 
sphere  in  Probs.  16  and  19. 

23.  Plane  intersections.  Sketch  the  intersections  of  the 
equipotential  surfaces  in  Prob.  16  with  xz-plane. 

24.  TEAM  PROJECT.  Transmission  Line  and  Related 
PDEs.  Consider  a long  cable  or  telephone  wire  (Fig.  315) 
that  is  imperfectly  insulated,  so  that  leaks  occur  along  the 
entire  length  of  the  cable.  The  source  S of  the  current 
i ( x , t ) in  the  cable  is  at  x = 0,  the  receiving  end  T at 
x = l.  The  current  flows  from  S to  T and  through  the 
load,  and  returns  to  the  ground.  Let  the  constants  R,  L, 
C,  and  G denote  the  resistance,  inductance,  capacitance 
to  ground,  and  conductance  to  ground,  respectively,  of 
the  cable  per  unit  length. 


S T 


^ Load 

x = 0 x = l 

Fig.  315.  Transmission  line 


(a)  Show  that  (“first  transmission  line  equation”) 


du 

dx 


di 

= Ri  + L — 
dt 


where  u (x,  t)  is  the  potential  in  the  cable.  Hint:  Apply 
Kirchhoff’s  voltage  law  to  a small  portion  of  the  cable 
between  x and  x + Ax  (difference  of  the  potentials  at 
x and  x + Ax  = resistive  drop  + inductive  drop). 

(b)  Show  that  for  the  cable  in  (a)  ("second  transmis- 
sion line  equation”). 


di_ 

dx 


= Gu  + C 


du 
dt  ' 


Hint:  Use  Kirchhoff’s  current  law  (difference  of  the 
currents  at  x and  x + Ax  = loss  due  to  leakage  to 
ground  + capacitive  loss). 

(c)  Second-order  PDEs.  Show  that  elimination  of  i 
or  u from  the  transmission  line  equations  leads  to 


uXx  — I.Cu  i ; + ( RC  + Gk)iif  + RGu , 
ixx  ~ LCitt  + (RC  + GL)it  + RGi. 


(d)  Telegraph  equations.  For  a submarine  cable,  G 
is  negligible  and  the  frequencies  are  low.  Show  that 
this  leads  to  the  so-called  submarine  cable  equations 

or  telegraph  equations 

u xx  H C i xx  ^ C : 
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Find  the  potential  in  a submarine  cable  with  ends 
(x  = 0,  x = /)  grounded  and  initial  voltage  distribution 
Uq  = const. 

(e)  High-frequency  line  equations.  Show  that  in  the 
case  of  alternating  currents  of  high  frequencies  the 
equations  in  (c)  can  be  approximated  by  the  so-called 

high-frequency  line  equations 


Solve  the  first  of  them,  assuming  that  the  initial 
potential  is 

U0  sin  (7 tx/1), 

and  ut(x,  0)  = 0 and  u = 0 at  the  ends  x = 0 and  x = l 
for  all  t. 

25.  Reflection  in  a sphere.  Let  r,  6 , 4>  be  spherical 
coordinates.  If  u (r,  0,  <f>)  satisfies  V2zr  = 0,  show  that 
v(r,  6,  (f>)  — u(\/r , 6,  4>)/r  satisfies  V2u  = 0. 


12.  Solution  of  PDEs  by  Laplace  Transforms 

Readers  familiar  with  Chap.  6 may  wonder  whether  Laplace  transforms  can  also  be  used 
for  solving  partial  differential  equations.  The  answer  is  yes,  particularly  if  one  of  the 
independent  variables  ranges  over  the  positive  axis.  The  steps  to  obtain  a solution  are 
similar  to  those  in  Chap.  6.  For  a PDE  in  two  variables  they  are  as  follows. 

1.  Take  the  Laplace  transform  with  respect  to  one  of  the  two  variables,  usually  t.  This 
gives  an  ODE  for  the  transform  of  the  unknown  function.  This  is  so  since  the 
derivatives  of  this  function  with  respect  to  the  other  variable  slip  into  the 
transformed  equation.  The  latter  also  incorporates  the  given  boundary  and  initial 
conditions. 

2.  Solving  that  ODE,  obtain  the  transform  of  the  unknown  function. 

3.  Taking  the  inverse  transform,  obtain  the  solution  of  the  given  problem. 

If  the  coefficients  of  the  given  equation  do  not  depend  on  t,  the  use  of  Laplace  transforms 
will  simplify  the  problem. 

We  explain  the  method  in  terms  of  a typical  example. 

Semi-Infinite  String 

Find  the  displacement  w(x,  t)  of  an  elastic  string  subject  to  the  following  conditions.  (We  write  w since  we  need 
u to  denote  the  unit  step  function.) 

(i)  The  string  is  initially  at  rest  on  the  x-axis  from  x = 0 to  o°  (“ semi-infinite  string”). 

(ii)  For  t > 0 the  left  end  of  the  string  (x  = 0)  is  moved  in  a given  fashion,  namely,  according  to  a single 
sine  wave 


(sin  t if  0 ^ t ^ 277 

w(0,r)  =m=\  (Fig.  316). 

I 0 otherwise 


(iii)  Furthermore,  lim  w(x,t ) = 0 for  t S 0. 

X—*oo 


Fig.  316.  Motion  of  the  left  end  of  the  string  in  Example  1 as  a function  of  time  t 
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Of  course  there  is  no  infinite  string,  but  our  model  describes  a long  string  or  rope  (of  negligible  weight)  with 
its  right  end  fixed  far  out  on  the  x-axis. 

Solution.  We  have  to  solve  the  wave  equation  (Sec.  12.2) 


(1) 


3 w 2 3 w 
dt2  dx2 ' 


for  positive  x and  t,  subject  to  the  “boundary  conditions” 

(2)  w(0,  t)  = fit),  Jim  w(x,  t)  = 0 
with  / as  given  above,  and  the  initial  conditions 

(3)  (a)  w (x,  0)  = 0,  (b)  wt  (x,  0)  = 0. 

We  take  the  Laplace  transform  with  respect  to  t.  By  (2)  in  Sec.  6.2, 


= s2Z£{w]  - sw(x,  0)  - wt(x,  0)  = c2S£ 


f cfV  1 
tax2  > 


(tSO) 


The  expression  — sw(x,  0)  — wt(x,  0)  drops  out  because  of  (3).  On  the  right  we  assume  that  we  may  interchange 
integration  and  differentiation.  Then 


FM 

_ f 

1 dx2  ' 

"Jo 

dt  = 


dx  dx 

Writing  W(x,  s)  = -i  { w(x,  / ) ) , we  thus  obtain 


e w(x,t)dt  = f£{w(x, /)}. 

dx2 


thus 


dx* 


dx ' 


W=  0. 

2 2 


Since  this  equation  contains  only  a derivative  with  respect  to  x,  it  may  be  regarded  as  an  ordinary  differential 
equation  for  W (x,  s)  considered  as  a function  of  x.  A general  solution  is 

(4)  W(x,  s)  = A (s)esx/c  + B(s)e~sx/C. 

From  (2)  we  obtain,  writing  F(s ) = ££{f(f)}, 

W( 0,  s)  = ££{w(0,  0}  = #{/(*)}  = F(s). 

Assuming  that  we  can  interchange  integration  and  taking  the  limit,  we  have 


lim  W (jc,  s)  = lim  e stw  (x,  t)  dt  = e sl  lim  w (x,  t ) dt  = 0. 

X—*oo  X — »oo  I I X— »oo 

J0  Jo 

This  implies  A (5)  = 0 in  (4)  because  c > 0,  so  that  for  every  fixed  positive  s the  function  esx^c  increases  as  x 
increases.  Note  that  we  may  assume  s > 0 since  a Laplace  transform  generally  exists  for  all  s greater  than  some 
fixed  k (Sec.  6.2).  Hence  we  have 

W(0,s ) = B(s)  = F(s), 

so  that  (4)  becomes 

W(x,  s ) = F(s)e~sx/C. 

From  the  second  shifting  theorem  (Sec.  6.3)  with  a = xjc  we  obtain  the  inverse  transform 


(5) 


w(x,t)  =f\t-  — 1 Ml  t - 


(Fig.  317) 
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that  is, 


w (x,  t)  = sin  — — J if  — < t < — + 277  or  ct  > x > (t  — 27 t)c 

and  zero  otherwise.  This  is  a single  sine  wave  traveling  to  the  right  with  speed  c.  Note  that  a point  x remains 
at  rest  until  t = x/c,  the  time  needed  to  reach  that  x if  one  starts  at  t = 0 (start  of  the  motion  of  the  left  end) 
and  travels  with  speed  c.  The  result  agrees  with  our  physical  intuition.  Since  we  proceeded  formally,  we  must 
verify  that  (5)  satisfies  the  given  conditions.  We  leave  this  to  the  student. 


Fig.  317.  Traveling  wave  in  Example  1 


We  have  reached  the  end  of  Chapter  12,  in  which  we  concentrated  on  the  most  important 
partial  differential  equations  (PDEs)  in  physics  and  engineering.  We  have  also  reached 
the  end  of  Part  C on  Fourier  Analysis  and  PDEs. 

Outlook 

We  have  seen  that  PDEs  underlie  the  modeling  process  of  various  important  engineering 
application.  Indeed,  PDEs  are  the  subject  of  many  ongoing  research  projects. 

Numerics  for  PDEs  follows  in  Secs.  21.4-21.7,  which,  by  design  for  greater  flexibility 
in  teaching,  are  independent  of  the  other  sections  in  Part  E on  numerics. 

In  the  next  part,  that  is,  Part  D on  complex  analysis,  we  turn  to  an  area  of  a different 
nature  that  is  also  highly  important  to  the  engineer.  The  rich  vein  of  examples  and  problems 
will  signify  this.  It  is  of  note  that  Part  D includes  another  approach  to  the  two-dimensional 
Laplace  equation  with  applications,  as  shown  in  Chap.  18. 


EEBEgEE3BE£EEre=BE 


1.  Verify  the  solution  in  Example  1.  What  traveling  wave 
do  we  obtain  in  Example  1 for  a nonterminating 
sinusoidal  motion  of  the  left  end  starting  at  f = 27 r? 

2.  Sketch  a figure  similar  to  Fig.  317  when  c = 1 and 
/(x)  is  "triangular,”  say,  /( x)  = x if  0 < x < |,/( x)  = 
1— xifg<x<l  and  0 otherwise. 

3.  How  does  the  speed  of  the  wave  in  Example  1 of  the 
text  depend  on  the  tension  and  on  the  mass  of  the  string? 


4-8 


SOLVE  BY  LAPLACE  TRANSFORMS 


. dw  dw 

5.  x — H = xt, 

dx  dt 


w (x,  0)  = 0 if  x £ 0, 
w(0,  t)  = 0 if  t £ 0 


6. 


dw 

dx 


„ dw 

+ 2x — = 2x, 

at 


w (x,  0)  = 1,  w(0,  f)  = l 


7.  Solve  Prob.  5 by  separating  variables. 


•>2 

d w 


-.2 

d w 


8.  — o = 100 

dx2  dt 


dw 

<r  ~b  100 b 25 w, 

2 at 


4. b x — = x,  w(x,  0)  = 1,  w(0,  0=1 

dx  dt 


w (x,  0)  = 0 if  x £ 0,  wt(x,  0)  = 0 if  t £ 0, 
w (0,  t)  = sin  t if  t £ 0 


Chapter  12  Review  Questions  and  Problems 
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9-12 


HEAT  PROBLEM 


Find  the  temperature  w(x,  t)  in  a semi-infinite  laterally 
insulated  bar  extending  from  x = 0 along  the  x-axis  to 
infinity,  assuming  that  the  initial  temperature  is  0 ,w(x,  t)—*  0 
as  x — > oo  for  every  fixed  t £ 0,  and  w( 0,  t ) = f(t).  Proceed 
as  follows. 


9.  Set  up  the  model  and  show  that  the  Laplace  transform 
leads  to 


sW 


(w  = £e{w}) 


and 

W = F{s)e~Vsx/c  (F  = ££{/}). 


10.  Applying  the  convolution  theorem,  show  that  in  Prob.  9, 


11.  Let  w(0,  t)  = f(t)  = u(t)  (Sec.  6.3).  Denote  the  corre- 
sponding w,  VP,  and  F by  w0,  Wq,  and  F0.  Show  that 
then  in  Prob.  10, 


w0(x,  0 = 


2 cV77 


3/2  -x2/(4c2T) 


dr 


with  the  error  function  erf  as  defined  in  Problem  Set 
12.7. 

12.  Duhamel’s  formula.4  Show  that  in  Prob.  11, 

VF0(x,  i)  = ]se~Vsxlc 

and  the  convolution  theorem  gives  Duhamel’s  formula 


w (x,  t) 


x 

2cV77 


f /(f  - T)T-3'2e-**^\lT. 
•'o 


r dwo 

W(x,  t)  = \f{t  - t)  — — dr. 

•'n 


mraHHaEHEBPEESTIONS  AND  PROBLEMS 


1.  For  what  kinds  of  problems  will  modeling  lead  to  an 
ODE?  To  a PDE? 

2.  Mention  some  of  the  basic  physical  principles  or  laws 
that  will  give  a PDE  in  modeling. 

3.  State  three  or  four  of  the  most  important  PDEs  and  their 
main  applications. 

4.  What  is  “separating  variables”  in  a PDE?  When  did  we 
apply  it  twice  in  succession? 

5.  What  is  d’Alembert’s  solution  method?  To  what  PDE 
does  it  apply? 

6.  What  role  did  Fourier  series  play  in  this  chapter?  Fourier 
integrals? 

7.  When  and  why  did  Legendre’s  equation  occur?  Bessel’s 
equation? 

8.  What  are  the  eigenfunctions  and  their  frequencies  of  the 
vibrating  string?  Of  the  vibrating  membrane? 

9.  What  do  you  remember  about  types  of  PDEs?  Normal 
forms?  Why  is  this  important? 

10.  When  did  we  use  polar  coordinates?  Cylindrical  coor- 
dinates? Spherical  coordinates? 

11.  Explain  mathematically  (not  physically)  why  we  got 
exponential  functions  in  separating  the  heat  equation, 
but  not  for  the  wave  equation. 

12.  Why  and  where  did  the  error  function  occur? 


13.  How  do  problems  for  the  wave  equation  and  the  heat 
equation  differ  regarding  additional  conditions? 

14.  Name  and  explain  the  three  kinds  of  boundary  conditions 
for  Laplace’s  equation. 

15.  Explain  how  the  Laplace  transform  applies  to  PDEs. 


16-18 

16.  uxx  + 

17 . yy  ^ 

18.  u xx  I 

19-21 


Solve  for  u = u{x,  y)\ 

25  u = 0 
uy  — 6u  = 18 

ux  — 0,  w(0,  v)=/(y), 

NORMAL  FORM 


Transform  to  normal  form  and  solve: 

19.  Uxy  Uyy 

20.  uXx  T 6n„j,  + 9 Uyy  0 


xy 

21.  Uxx  4 Uyy  0 


22-24 


VIBRATING  STRING 


«x(0,y)  = gly) 


Find  and  sketch  or  graph  (as  in  Fig.  288  in  Sec.  12.3)  the 
deflection  «(x,  t)  of  a vibrating  string  of  length  77,  extending 
from  x = 0 to  x = 77,  and  c2  = T/ p = 4 starting  with 
velocity  zero  and  deflection: 

22.  sin  4x  23.  sin3  x 

24.  g77  - |x  - ^ 77 1 


‘JEAN-MARIE  CONSTANT  DUHAMEL  (1797-1872),  French  mathematician. 
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CHAP.  12  Partial  Differential  Equations  (PDEs) 


25-27 


HEAT 


Find  the  temperature  distribution  in  a laterally  insulated  thin 
copper  bar  (c2  = K/(ap)  = 1.158  cm2/sec)  of  length  100 
cm  and  constant  cross  section  with  endpoints  at  x = 0 and 
100  kept  at  0°C  and  initial  temperature: 

25.  sin  0.0  l7r.tr  26.  50  - |50  - x\ 

27.  sin3  0.0177JC 


28-30 


ADIABATIC  CONDITIONS 


Find  the  temperature  distribution  in  a laterally  insulated 
bar  of  length  7 r with  c2  = 1 for  the  adiabatic  boundary 
condition  (see  Problem  Set  12.6)  and  initial  temperature: 
28.  3x2  29.  100  cos  2x 


30.  27 r - 4|jc  — \tt\ 


31-32 


TEMPERATURE  IN  A PLATE 


31.  Let  f(x,  y)  = u (x,  y,  0)  be  the  initial  temperature  in  a 
thin  square  plate  of  side  77  with  edges  kept  at  0°C  and 
faces  perfectly  insulated.  Separating  variables,  obtain 
from  ut  = c2V2«  the  solution 


u(x,  y,  t) 
where 


2 2 Bmn  sin  mx  sin  ny  e ° <m  +"  M 

m=  1 n=  1 


/( x,  y)  sin  mx  sin  ny  dx  dy. 


32.  Find  the  temperature  in  Prob.  31  if 
f(x,y)  = x(7T  - x)y{i r - y). 


33-37 


MEMBRANES 


Show  that  the  following  membranes  of  area  1 with  c2  = 1 
have  the  frequencies  of  the  fundamental  mode  as  given 
(4-decimal  values).  Compare. 


33.  Circle:  ai/(2Vn)  = 0.6784 

34.  Square:  1/V2  = 0.7071 

35.  Rectangle  with  sides  l:2:V5/8  = 0.7906 

36.  Semicircle:  3.832/V877  = 0.7643 

37.  Quadrant  of  circle:  a21/(4V7r)  = 0.7244 
(“21  = 5.13562  = first  positive  zero  of  72) 


38-40 


ELECTROSTATIC  POTENTIAL 


Find  the  potential  in  the  following  charge-free  regions. 


38.  Between  two  concentric  spheres  of  radii  r0  and  kept 
at  potentials  uq  and  u i,  respectively. 


39.  Between  two  coaxial  circular  cylinders  of  radii  r0  and 
Y\  kept  at  the  potentials  uq  and  u i,  respectively. 
Compare  with  Prob.  38. 


40.  In  the  interior  of  a sphere  of  radius  1 kept  at  the 
potential  /(</>)  = cos  3(f)  + 3 cos  </>  (referred  to  our 
usual  spherical  coordinates). 


SUMMARY  Of  CHAPTER  12 

Partial  Differential  Equations  (PDEs) 


Whereas  ODEs  (Chaps.  1-6)  serve  as  models  of  problems  involving  only  one 
independent  variable,  problems  involving  two  or  more  independent  variables  (space 
variables  or  time  t and  one  or  several  space  variables)  lead  to  PDEs.  This  accounts  for 
the  enormous  importance  of  PDEs  to  the  engineer  and  physicist.  Most  important  are: 

o 

(1)  Utt  = c uxx  One-dimensional  wave  equation  (Secs.  12.2-12.4) 

o 

(2)  utt  = c ( uxx  + uyy)  Two-dimensional  wave  equation  (Secs.  12.8-12.10) 

(3)  Ut  = c2uxx  One-dimensional  heat  equation  (Secs.  12.5,  12.6,  12.7) 

(4)  V2m  = uxx  + Uyy  = 0 Two-dimensional  Laplace  equation  (Secs.  12.6,  12.10) 

(5)  V2m  = uxx  + Uyy  + uzz  = 0 Three-dimensional  Laplace  equation 

(Sec.  12.11). 

Equations  (1)  and  (2)  are  hyperbolic,  (3)  is  parabolic,  (4)  and  (5)  are  elliptic. 


Summary  of  Chapter  12 
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In  practice,  one  is  interested  in  obtaining  the  solution  of  such  an  equation  in  a 
given  region  satisfying  given  additional  conditions,  such  as  initial  conditions 
(conditions  at  time  t = 0)  or  boundary  conditions  (prescribed  values  of  the  solution 
u or  some  of  its  derivatives  on  the  boundary  surface  S,  or  boundary  curve  C,  of  the 
region)  or  both.  For  (1)  and  (2)  one  prescribes  two  initial  conditions  (initial 
displacement  and  initial  velocity).  For  (3)  one  prescribes  the  initial  temperature 
distribution.  For  (4)  and  (5)  one  prescribes  a boundary  condition  and  calls  the 
resulting  problem  a (see  Sec.  12.6) 

Dirichlet  problem  if  u is  prescribed  on  .S', 

Neumann  problem  if  un  = du/dn  is  prescribed  on  S, 

Mixed  problem  if  u is  prescribed  on  one  part  of  S and  un  on  the  other. 

A general  method  for  solving  such  problems  is  the  method  of  separating 
variables  or  product  method,  in  which  one  assumes  solutions  in  the  form  of 
products  of  functions  each  depending  on  one  variable  only.  Thus  equation  (1)  is 
solved  by  setting  u(x,  t)  = F(x)G(t)\  see  Sec.  12.3;  similarly  for  (3)  (see  Sec.  12.6). 
Substitution  into  the  given  equation  yields  ordinary  differential  equations  for  F and 
G,  and  from  these  one  gets  infinitely  many  solutions  F = Fn  and  G = Gn  such  that 
the  corresponding  functions 

ur/(x,  t)  Fn(x)Gn(t) 

are  solutions  of  the  PDE  satisfying  the  given  boundary  conditions.  These  are  the 
eigenfunctions  of  the  problem,  and  the  corresponding  eigenvalues  determine  the 
frequency  of  the  vibration  (or  the  rapidity  of  the  decrease  of  temperature  in  the  case 
of  the  heat  equation,  etc.).  To  satisfy  also  the  initial  condition  (or  conditions),  one 
must  consider  infinite  series  of  the  un,  whose  coefficients  turn  out  to  be  the  Fourier 
coefficients  of  the  functions  / and  g representing  the  given  initial  conditions  (Secs. 
12.3,  12.6).  Hence  Fourier  series  (and  Fourier  integrals)  are  of  basic  importance 
here  (Secs.  12.3,  12.6,  12.7,  12.9). 

Steady-state  problems  are  problems  in  which  the  solution  does  not  depend  on 
time  t.  For  these,  the  heat  equation  = c2V2;/  becomes  the  Laplace  equation. 

Before  solving  an  initial  or  boundary  value  problem,  one  often  transforms  the 
PDE  into  coordinates  in  which  the  boundary  of  the  region  considered  is  given  by 
simple  formulas.  Thus  in  polar  coordinates  given  by  x = r cos  6,  y = r sin  6,  the 
Laplacian  becomes  (Sec.  12.11) 


(6) 


V2h  — urr  + 


1 

— ur  + 
r 


1 

2 

r 


for  spherical  coordinates  see  Sec.  12.10.  If  one  now  separates  the  variables,  one  gets 
Bessel’s  equation  from  (2)  and  (6)  (vibrating  circular  membrane.  Sec.  12.10)  and 
Legendre’s  equation  from  (5)  transformed  into  spherical  coordinates  (Sec.  12.11). 


Complex 

Analysis 


CHAPTER  13 
CHAPTER  14 
CHAPTER  15 
CHAPTER  16 
CHAPTER  17 
CHAPTER  18 


Complex  Numbers  and  Functions.  Complex  Differentiation 

Complex  Integration 

Power  Series,  Taylor  Series 

Laurent  Series.  Residue  Integration 

Conformal  Mapping 

Complex  Analysis  and  Potential  Theory 


Complex  analysis  has  many  applications  in  heat  conduction,  fluid  flow,  electrostatics,  and 
in  other  areas.  It  extends  the  familiar  “real  calculus”  to  “complex  calculus”  by  introducing 
complex  numbers  and  functions.  While  many  ideas  carry  over  from  calculus  to  complex 
analysis,  there  is  a marked  difference  between  the  two.  For  example,  analytic  functions, 
which  are  the  “good  functions”  (differentiable  in  some  domain)  of  complex  analysis,  have 
derivatives  of  all  orders.  This  is  in  contrast  to  calculus,  where  real-valued  functions  of 
real  variables  may  have  derivatives  only  up  to  a certain  order.  Thus,  in  certain  ways, 
problems  that  are  difficult  to  solve  in  real  calculus  may  be  much  easier  to  solve  in  complex 
analysis.  Complex  analysis  is  important  in  applied  mathematics  for  three  main  reasons: 

1.  Two-dimensional  potential  problems  can  be  modeled  and  solved  by  methods  of 
analytic  functions.  This  reason  is  the  real  and  imaginary  parts  of  analytic  functions  satisfy 
Laplace’s  equation  in  two  real  variables. 

2.  Many  difficult  integrals  (real  or  complex)  that  appear  in  applications  can  be  solved 
quite  elegantly  by  complex  integration. 

3.  Most  functions  in  engineering  mathematics  are  analytic  functions,  and  their  study 
as  functions  of  a complex  variable  leads  to  a deeper  understanding  of  their  properties  and 
to  interrelations  in  complex  that  have  no  analog  in  real  calculus. 
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CHAPTER 


Complex  Numbers 
and  Functions.  Complex 
Differentiation 


The  transition  from  “real  calculus”  to  “complex  calculus”  starts  with  a discussion  of 
complex  numbers  and  their  geometric  representation  in  the  complex  plane.  We  then 
progress  to  analytic  functions  in  Sec.  13.3.  We  desire  functions  to  be  analytic  because 
these  are  the  “useful  functions”  in  the  sense  that  they  are  differentiable  in  some  domain 
and  operations  of  complex  analysis  can  be  applied  to  them.  The  most  important  equations 
are  therefore  the  Cauchy-Riemann  equations  in  Sec.  13.4  because  they  allow  a test  of 
analyticity  of  such  functions.  Moreover,  we  show  how  the  Cauchy-Riemann  equations 
are  related  to  the  important  Laplace  equation. 

The  remaining  sections  of  the  chapter  are  devoted  to  elementary  complex  functions 
(exponential,  trigonometric,  hyperbolic,  and  logarithmic  functions).  These  generalize  the 
familiar  real  functions  of  calculus.  Detailed  knowledge  of  them  is  an  absolute  necessity 
in  practical  work,  just  as  that  of  their  real  counterparts  is  in  calculus. 

Prerequisite:  Elementary  calculus. 

References  and  Answers  to  Problems:  App.  1 Part  D,  App.  2. 

13.1  Complex  Numbers  and 

Their  Geometric  Representation 

The  material  in  this  section  will  most  likely  be  familiar  to  the  student  and  serve  as  a 
review. 

Equations  without  real  solutions,  such  as  r = — 1 or  x — 1 Ox  + 40  = 0,  were 
observed  early  in  history  and  led  to  the  introduction  of  complex  numbers.1  By  definition, 
a complex  number  z is  an  ordered  pair  (x,  y ) of  real  numbers  x and  y,  written 

z = (*,  y)- 


1First  to  use  complex  numbers  for  this  purpose  was  the  Italian  mathematician  GIROLAMO  CARDANO 
(1501-1576),  who  found  the  formula  for  solving  cubic  equations.  The  term  “complex  number”  was  introduced 
by  CARL  FRIEDRICH  GAUSS  (see  the  footnote  in  Sec.  5.4),  who  also  paved  the  way  for  a general  use  of 
complex  numbers. 
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x is  called  the  real  part  and  y the  imaginary  part  of  z,  written 

x = Re  z,  y = Im  z. 

By  definition,  two  complex  numbers  are  equal  if  and  only  if  their  real  parts  are  equal 
and  their  imaginary  parts  are  equal. 

(0,  1)  is  called  the  imaginary  unit  and  is  denoted  by  ;, 

(1)  i = (0,  1). 

Addition,  Multiplication.  Notation  z = x + iy 

Addition  of  two  complex  numbers  zi  = (x1;  jq)  and  z2  = (x2,  V2)  is  defined  by 

(2)  zi  + z2  = (x  1,  yi)  + (x2,  y2)  = (*1  + *2,  yi  + j2)- 

Multiplication  is  defined  by 

(3)  Z1Z2  = (*1,  vi)Cy2,  y2)  = (x i-Y 2 - yiyz,  xxy2  + x2yd- 
These  two  definitions  imply  that 

(xi,  0)  + (x2,  0)  = Oi  + x2,  0) 

and 

(xi,  0)(x2,  0)  = (XiX2,  0) 

as  for  real  numbers  x\,x2-  Hence  the  complex  numbers  “ extend ” the  real  numbers.  We 
can  thus  write 

(x,  0)  = x.  Similarly,  (0,  y)  = iy 

because  by  (1),  and  the  definition  of  multiplication,  we  have 

iy  = (0,  l)y  = (0,  l)(y,  0)  = (0  ■ y - 1 ■ 0,  0 ■ 0 + 1 ■ y)  = (0,  y). 

Together  we  have,  by  addition,  (x,  y)  = (x,  0)  + (0,  y)  = x + iy. 

In  practice,  complex  numbers  z = (x,  y ) are  written 

(4)  z = x + iy 

or  z = x + yi,  e.g.,  17  + 4 i (instead  of  ;4). 

Electrical  engineers  often  write  j instead  of  i because  they  need  i for  the  current. 

If  x = 0,  then  z = iy  and  is  called  pure  imaginary.  Also,  (1)  and  (3)  give 

(5)  iz  = -1 

because,  by  the  definition  of  multiplication,  ;2  = ii  = (0,  1)(0,  1)  = (—1,  0)  = — 1. 
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CHAP.  13  Complex  Numbers  and  Functions.  Complex  Differentiation 


EXAMPLE  1 


EXAMPLE  2 


For  addition  the  standard  notation  (4)  gives  [see  (2)] 

C*i  + iyi)  + (*2  + iyz)  = Oi  + x2)  + i(ji  + y2). 

For  multiplication  the  standard  notation  gives  the  following  very  simple  recipe.  Multiply 
each  term  by  each  other  term  and  use  i2  = —1  when  it  occurs  [see  (3)]: 

(xi  + iyi)(x2  + iy2)  = x \x2  + ixiyz  + iy\x2  + i2yiy2 
= (x±x2  - yi.V2)  + i(x1y2  + x2yi). 


This  agrees  with  (3).  And  it  shows  that  x + iy  is  a more  practical  notation  for  complex 
numbers  than  (x,  y). 

If  you  know  vectors,  you  see  that  (2)  is  vector  addition,  whereas  the  multiplication  (3) 
has  no  counterpart  in  the  usual  vector  algebra. 

Real  Part,  Imaginary  Part,  Sum  and  Product  of  Complex  Numbers 

Let  Zi  = 8 + 3 / and  z2  = 9 — 2 /.  Then  Re  zi  = 8,  Im  Zi  = 3.  Re  z2  = 9,  Im  z2  = and 

Zi  + Z2  = (8  + 3/)  + (9  - 2;)  = 17  + i, 

Z1Z2  = (8  + 30(9  - 20  = 72  + 6 + /(— 16  + 27)  = 78  + 11/. 

Subtraction,  Division 

Subtraction  and  division  are  defined  as  the  inverse  operations  of  addition  and  multipli- 
cation, respectively.  Thus  the  difference  z = Zi  — Z2ts  the  complex  number  z for  which 
Z\  = z + Z2-  Hence  by  (2), 

(6)  zi  - z2  = (xi  - x2)  + i(y  1 - y2). 


The  quotient  z = zi/z2(z2  A 0)  is  the  complex  number  z for  which  z\  = zz2.  If  we 
equate  the  real  and  the  imaginary  parts  on  both  sides  of  this  equation,  setting  z = x + iy, 
we  obtain  xi  = x2x  — y2y,  yi  = y2x  + x2y.  The  solution  is 


(7*) 


x + iy, 


xix2  + yiy2 

2,2  ’ 

*2  + J2 


x2yi  - X1V2 

2,2 
*2  + J2 


The  practical  rule  used  to  get  this  is  by  multiplying  numerator  and  denominator  of  Z1/Z2 
by  x2  — iy2  and  simplifying: 


Xi  + iyi  _ (xi  + iy1)(x2  ~ iy2 ) _ xhx2  + y±y2  x2y±  - xxy2 

x2  + iy2  (x2  + iy2){x2  - iy2)  x\  + y\  x\  + y2 


Difference  and  Quotient  of  Complex  Numbers 

For  Zi  = 8 + 3/  and  z2  = 9 — 2/  we  get  zt  — z2  = (8  + 3 i)  — (9  — 2/)  = — 1 + 5/  and 

zi  _ 8 + 3/  _ (8  + 30(9  + 2 i)  66  + 43/  _ 66  43  . 

Z2  9 - 2/  (9  - 2/)(9  + 2/)  81  + 4 85  85  *' 

Check  the  division  by  multiplication  to  get  8 + 3 i. 
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Complex  numbers  satisfy  the  same  commutative,  associative,  and  distributive  laws  as  real 
numbers  (see  the  problem  set). 

Complex  Plane 

So  far  we  discussed  the  algebraic  manipulation  of  complex  numbers.  Consider  the 
geometric  representation  of  complex  numbers,  which  is  of  great  practical  importance.  We 
choose  two  perpendicular  coordinate  axes,  the  horizontal  x-axis,  called  the  real  axis,  and 
the  vertical  y-axis,  called  the  imaginary  axis.  On  both  axes  we  choose  the  same  unit  of 
length  (Fig.  318).  This  is  called  a Cartesian  coordinate  system. 


(Imaginary 

axis) 


Fig.  319.  The  number  4 — 3/  in 
the  complex  plane 


We  now  plot  a given  complex  number  z = (x,  y)  = x + iy  as  the  point  P with  coordinates 
x,  y.  The  xy-plane  in  which  the  complex  numbers  are  represented  in  this  way  is  called  the 
complex  plane.2  Figure  319  shows  an  example. 

Instead  of  saying  “the  point  represented  by  z in  the  complex  plane”  we  say  briefly  and 
simply  “ the  point  z in  the  complex  plane.”  This  will  cause  no  misunderstanding. 

Addition  and  subtraction  can  now  be  visualized  as  illustrated  in  Figs.  320  and  321. 


Fig.  320.  Addition  of  complex  numbers  Fig.  321.  Subtraction  of  complex  numbers 


2Sometimes  called  the  Argand  diagram,  after  the  French  mathematician  JEAN  ROBERT  ARGAND 
(1768-1822),  bom  in  Geneva  and  later  librarian  in  Paris.  His  paper  on  the  complex  plane  appeared  in  1806, 
nine  years  after  a similar  memoir  by  the  Norwegian  mathematician  CASPAR  WESSEL  (1745-1818),  a surveyor 
of  the  Danish  Academy  of  Science. 
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Complex  Conjugate  Numbers 

The  complex  conjugate  z of  a complex  number  z = x + iy  is  defined  by 

z = x - iy. 

It  is  obtained  geometrically  by  reflecting  the  point  z in  the  real  axis.  Figure  322  shows 
this  for  z = 5 + 2i  and  its  conjugate  z = 5 — 2 i. 


Fig.  322.  Complex  conjugate  numbers 


The  complex  conjugate  is  important  because  it  permits  us  to  switch  from  complex 
to  real.  Indeed,  by  multiplication,  zz  = x + y (verify!).  By  addition  and  subtraction, 
Z + z = 2x,  z ~ z = 2 iy.  We  thus  obtain  for  the  real  part  x and  the  imaginary  part  y 
(not  ;y!  ) of  z = x + iy  the  important  formulas 


(8) 


Re  z = x = | (z  + z), 


Im  z = y = 


' . (z  - z). 


If  z is  real,  z = x,  then  z = Z by  the  definition  of  z,  and  conversely.  Working  with 
conjugates  is  easy,  since  we  have 

(zi  + z2)  = Zi  + z2,  (zi  - Zz)  = Zi  - z2, 

£i 

Z2  ' 


(ZlZ2)  - ZlZz, 


EXAMPLE  3 Illustration  of  (8)  and  (9) 

Let  Zi  = 4 + 3 ('  and  z2  = 2 + 5 i.  Then  by  (8), 

1 3 i + 3 i 

Im  Zi  = —[(4  + 3i)  - (4  - 301  = = 3. 

2;  2 i 

Also,  the  multiplication  formula  in  (9)  is  verified  by 

(ziz2)  = (4  + 30(2  + 50  = (-7  + 26 i)  = -1  - 26 i. 
Z1Z2  = (4  - 30(2  - 50  = -7  - 26 i. 


PRO  BLEM  SET  13 .1 


1.  Powers  of  i.  Show  that  /1  2 = — 1,  i:i  = —i,  i4  = 1, 
i5  = i,  ■ ■ ■ and  l/i  = — i,  l//2  = — 1,  l/i 3 = (,■■■. 

2.  Rotation.  Multiplication  by  i is  geometrically  a 

counterclockwise  rotation  through  7r/2  (90°).  Verify 


this  by  graphing  z and  iz  and  the  angle  of  rotation  for 
z = 1 + i,  z = - 1 + 21,  z = 4 - 3i. 

3.  Division.  Verify  the  calculation  in  (7).  Apply  (7)  to 
(26  - 18i)/(6  - 20- 
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4.  Law  for  conjugates.  Verify  (9)  lor  z , = —11  + 10/, 
z2  = “I  + 4/. 

5.  Pure  imaginary  number.  Show  that  z — x + iy  is 
pure  imaginary  if  and  only  if  z.  = — z. 

6.  Multiplication.  If  the  product  of  two  complex  numbers 
is  zero,  show  that  at  least  one  factor  must  be  zero. 

7.  Laws  of  addition  and  multiplication.  Derive  the 
following  laws  for  complex  numbers  from  the  cor- 
responding laws  for  real  numbers. 

Zi  + z2  ~ z2  + zi,  Z1Z2  = Z2Z i ( Commutative  laws ) 

(zi  + Z2)  + z3  = zi  + (z2  + z3), 

(. Associative  laws) 
(ZlZ2)t3  = Zl(Z2t3) 

z i(z2  + Z3)  = Z1Z2  + Z1Z3  ( Distributive  law) 

0 + z = z + 0 = z, 

z + ( — z)  = ( — z)  + z = 0,  z ■ 1 = z. 


8-15  COMPLEX  ARITHMETIC 
Let  Zi  = —2  + 11/,  z2  = 2 — i.  Showing  the  details  of 
your  work,  find,  in  the  form  x + iy: 


8.  ziz2,  (Z1Z2)  9.  Re(zf),  (Rez,)2 

10.  Re  (1/ zi),  1/Re  (z|) 

11.  (zi  - z2)2/ 16,  (z i/4  - z 2/ 4)2 

12.  z i/z 2.  Z2/Z1 

13.  (zi  + z2)(zi  - Z2),  zi  ~ z| 


14.  zi/z2,  (zi/z2) 

15.  4 (zx  + z 2)/ (z  1 - z2) 


16-20 


Let  z = x + iy.  Showing  details,  find,  in  terms 


of  x and  y: 

16.  Im  (1/z),  Im  (1/z2)  17.  Re  z4  - (Re  z2)2 

18.  Re  [(1  + /)16z2]  19.  Re  (z/z),  Im  (z/z) 

20.  Im  (1/z2) 


13. 1 Polar  Form  of  Complex  Numbers. 
Powers  and  Roots 


We  gain  further  insight  into  the  arithmetic  operations  of  complex  numbers  if,  in  addition 
to  the  xy-coordi  nates  in  the  complex  plane,  we  also  employ  the  usual  polar  coordinates 
r,  0 defined  by 

(1)  x = r cos  6,  y = r sin  0. 

We  see  that  then  z = x + iy  takes  the  so-called  polar  form 

(2)  z = r(cos  0 + i sin  6). 

r is  called  the  absolute  value  or  modulus  of  z and  is  denoted  by  |z|.  Hence 

(3)  \z\  = r = Vx2  + y2  = Vzz. 

Geometrically,  |z|  is  the  distance  of  the  point  z from  the  origin  (Fig.  323).  Similarly, 
| z 1 ~ Z2I  is  the  distance  between  zi  and  Z2  (Fig-  324). 

0 is  called  the  argument  of  z and  is  denoted  by  arg  z.  Thus  9 = arg  z and  (Fig.  323) 

(4)  tan  9 = J (z  * 0). 

Geometrically,  6 is  the  directed  angle  from  the  positive  x-axis  to  OP  in  Fig.  323.  Here,  as 
in  calculus,  all  angles  are  measured  in  radians  and  positive  in  the  counterclockwise  sense. 
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EXAMPLE  1 

y 


Fig.  325.  Example  1 


For  z = 0 this  angle  9 is  undefined.  (Why?)  For  a given  z 0 it  is  determined  only  up 
to  integer  multiples  of  277  since  cosine  and  sine  are  periodic  with  period  277.  But  one 
often  wants  to  specify  a unique  value  of  arg  z of  a given  z A 0.  For  this  reason  one  defines 
the  principal  value  Arg  z (with  capital  A!)  of  arg  z by  the  double  inequality 


(5) 


— 77  < Arg  2 g 77. 


Then  we  have  Arg  2 = 0 for  positive  real  z = x,  which  is  practical,  and  Arg  2 = 77  (not 
— 77!)  for  negative  real  z,  e.g.,  for  2 = —4.  The  principal  value  (5)  will  be  important  in 
connection  with  roots,  the  complex  logarithm  (Sec.  13.7),  and  certain  integrals.  Obviously, 
for  a given  2 A 0,  the  other  values  of  arg  2 are  arg  2 = Arg  2 ± 2nir  {n  = ± 1,  ±2,  • ■ ■ ). 


Imaginary 

axis 


Fig.  323.  Complex  plane,  polar  form 
of  a complex  number 


Fig.  324.  Distance  between  two 
points  in  the  complex  plane 


Polar  Form  of  Complex  Numbers.  Principal  Value  Arg  z 

Z = 1 + i (Fig.  325)  has  the  polar  form  z = V2  (cos  + i sin  j7r).  Hence  we  obtain 

|z|  = Vl,  arg  j = \tt  ± 2ii7T  (rt  = 0.  1.  • • ■ ),  and  Arg  z = 577  (the  principal  value). 

Similarly,  z = 3 + 3\/3 i = 6 (cos  gir  + i sin  jj7r),  |c|  = 6,  and  Arg  z = g7T. 

CAUTION!  In  using  (4),  we  must  pay  attention  to  the  quadrant  in  which  2 lies,  since 
tan  9 has  period  77,  so  that  the  arguments  of  2 and  —2  have  the  same  tangent.  Example: 

for  9 1 = arg  (1  + i)  and  92  = arg  (—1  — i ) we  have  tan  9 1 = tan  02  = 1. 

Triangle  Inequality 

Inequalities  such  as  xi  < x2  make  sense  for  real  numbers,  but  not  in  complex  because  there 
is  no  natural  way  of  ordering  complex  numbers.  However,  inequalities  between  absolute  values 
(which  are  real!),  such  as  1 2x | < |22|  (meaning  that  zi  is  closer  to  the  origin  than  22)  are  of 
great  importance.  The  daily  bread  of  the  complex  analyst  is  the  triangle  inequality 


(6)  I21  + 2 2 1 = Izil  + U2I  (Fig-  326) 

which  we  shall  use  quite  frequently.  This  inequality  follows  by  noting  that  the  three 
points  0,  2 1,  and  zi  + 22  are  the  vertices  of  a triangle  (Fig.  326)  with  sides  1 2 j | , |z2|,  and 
1 2 1 + z 2I,  and  one  side  cannot  exceed  the  sum  of  the  other  two  sides.  A formal  proof  is 
left  to  the  reader  (Prob.  33).  (The  triangle  degenerates  if  21  and  z2  lie  on  the  same  straight 
line  through  the  origin.) 
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EXAMPLE  2 


By  induction  we  obtain  from  (6)  the  generalized  triangle  inequality 

(6*)  \zi  + Z2  + ■■■  + Zn\  = kll  + \z2\  -I + Uni; 

that  is,  the  absolute  value  of  a sum  cannot  exceed  the  sum  of  the  absolute  values  of  the  terms. 

Triangle  Inequality 

If  zi  — 1 + i and  Z2  = ~ 2 + 3 i,  then  (sketch  a figure!) 

\zi  + Zz\  = 1:-1  + 4(1  = vT7  = 4.123  < V2  + Vl3  = 5.020. 

Multiplication  and  Division  in  Polar  Form 

This  will  give  us  a “geometrical”  understanding  of  multiplication  and  division.  Let 
Z i = 7i(cos  0i  + i sin  Of)  and  z2  = r2(cos  92  + i sin  Of). 

Multiplication.  By  (3)  in  Sec.  13.1  the  product  is  at  first 

Z\Z2  = r^lXcos  91  cos  02  — sin  91  sin  9f)  + i(sin  9\  cos  02  + cos  9\  sin  02)]. 

The  addition  rules  for  the  sine  and  cosine  [(6)  in  App.  A3.1]  now  yield 

(7)  ziz2  = ^[cosC#!  + 02)  + i sin(0!  + 02)]. 

Taking  absolute  values  on  both  sides  of  (7),  we  see  that  the  absolute  value  of  a product 
equals  the  product  of  the  absolute  values  of  the  factors, 

(8)  U1Z2I  = Uil  U2I  - 

Taking  arguments  in  (7)  shows  that  the  argument  of  a product  equals  the  sum  of  the 
arguments  of  the  factors. 


(9) 


arg  (ziz2)  = arg  zi  + arg  z2  (up  to  multiples  of  2tt). 


Division.  We  have  z\  = {z\/z2)z2.  Hence  Uil  = l(z i/z2)z2\  = U1/Z2I  U2I  and  by 
division  by  \z2\ 


(10) 


Zl 

Z-2 


(z2  A 0). 
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EXAMPLE  3 


EXAMPLE  4 


Similarly,  arg  z ] = arg  [(21/22)22]  = arg  (21/22)  + arg  22  and  by  subtraction  of  arg  22 


(ID 


2 1 

arg  — = arg  21  - arg  22 


(up  to  multiples  of  277). 


Combining  (10)  and  (11)  we  also  have  the  analog  of  (7), 

(12)  = V tcos  (0i  - 02)  + i sin  (0i  - 02)1- 

22  r2 

To  comprehend  this  formula,  note  that  it  is  the  polar  form  of  a complex  number  of  absolute 
value  ri/r2  and  argument  f)\  — 62 . But  these  are  the  absolute  value  and  argument  of  z 1/7.2, 
as  we  can  see  from  (10),  (11),  and  the  polar  forms  of  21  and  72- 

Illustration  of  Formulas  (8)— (11) 

Let  Zi  = ~2  + 2 i and  Z2  — 3 i.  Then  ZiZ  2 = — 6 — 6i,  Z\lz2  = § + (§)<■  Hence  (make  a sketch) 

kiz2\  = 6V2  = 3V8  = |zi||z2l.  Izi/zzl  = 2V2/3  = |zil/|z2|, 

and  for  the  arguments  we  obtain  Arg  zi  = 377/4,  Arg  z2  = 77/2, 

377  fZl\  17 

Arg  (Z1Z2)  = — = Arg  z 1 + Arg  z2  “ 277,  Arg  I — I = — = Arg  z 1 - Arg  z2- 

4 \z2/  4 


Integer  Powers  of  z.  De  Moivre’s  Formula 

From  (8)  and  (9)  with  z\  — z2  = z we  obtain  by  induction  for  n = 0,  1,  2,  • • • 

(13)  zn  — rn  (cos  nd  + i sin  nO). 

Similarly,  (12)  with  z i — 1 and  z2  — Zn  gives  (13)  for  n = — 1,  —2,  • • • . For  \z\  = r = 1,  formula  (13)  becomes 

De  Moivre’s  formula3 

(13*)  (cos  6 + i sin  6)n  = cos  nd  + i sin  nd. 

We  can  use  this  to  express  cos  nd  and  sin  nd  in  terms  of  powers  of  cos  d and  sin  d.  For  instance,  for  n = 2 we 
have  on  the  left  cos2  d + 2 i cos  d sin  d — sin2  d.  Taking  the  real  and  imaginary  parts  on  both  sides  of  (13*) 
with  n = 2 gives  the  familiar  formulas 

cos  2d  = cos2  d — sin2  d,  sin  2d  = 2 cos  d sin  d. 

This  shows  that  complex  methods  often  simplify  the  derivation  of  real  formulas.  Try  n = 3. 


Roots 

If  2 = wn  (n  = 1,  2,  ■ • • ),  then  to  each  value  of  w there  corresponds  one  value  of  z . We 
shall  immediately  see  that,  conversely,  to  a given  z ¥=  0 there  correspond  precisely  n 
distinct  values  of  w.  Each  of  these  values  is  called  an  nth  root  of  z,  and  we  write 


3ABRAHAM  DE  MOIVRE  (1667-1754),  French  mathematician,  who  pioneered  the  use  of  complex  numbers 
in  trigonometry  and  also  contributed  to  probability  theory  (see  Sec.  24.8). 
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(14)  w = *\fz. 

Hence  this  symbol  is  multivalued , namely,  n-valued.  The  n values  of  ~\fz  can  be  obtained 
as  follows.  We  write  z and  w in  polar  form 

Z = r(c os  6 + i sin  6)  and  w = R( cos  cf)  + i sin  </>). 

Then  the  equation  wn  = z becomes,  by  De  Moivre’s  formula  (with  d>  instead  of  6), 

wn  = Rn{ cos  ncj)  + i sin  n<p ) = z = r( cos  6 + i sin  6). 

The  absolute  values  on  both  sides  must  be  equal;  thus,  Rn  = r,  so  that  R = ~\Tr,  where 
Vr  is  positive  real  (an  absolute  value  must  be  nonnegative!)  and  thus  uniquely  determined. 
Equating  the  arguments  n<f>  and  0 and  recalling  that  6 is  determined  only  up  to  integer 
multiples  of  27 r,  we  obtain 

nd>  = 9 + 2Ictt,  thus  d>  = — + ^ 

where  k is  an  integer.  For  k = 0,  1,  • • ■ , n — 1 we  get  n distinct  values  of  w.  Further  integers 
of  k would  give  values  already  obtained.  For  instance,  k = n gives  2kzr / n = 277,  hence 
the  w corresponding  to  k = 0,  etc.  Consequently,  Vz,  for  z i=  0,  has  the  n distinct  values 


(15) 


, 6 + 2k7T  . 6 + 2kzr\ 

= </^(cos + i sin I 


where  k = 0,  1,  ■ • ■ , n — 1.  These  n values  lie  on  a circle  of  radius  v'T-  with  center  at  the 
origin  and  constitute  the  vertices  of  a regular  polygon  of  n sides.  The  value  of  Vz  obtained 
by  taking  the  principal  value  of  arg  z and  k = 0 in  (15)  is  called  the  principal  value  of 

w = V~z. 

Taking  z = 1 in  (15),  we  have  |z|  = r = 1 and  Arg  z = 0.  Then  (15)  gives 


(16) 


Vi 


2k7T  . 2kTT 

cos 1-  i sin , 

n n 


k = 0,  1,  - 1. 


These  n values  are  called  the  nth  roots  of  unity.  They  lie  on  the  circle  of  radius  1 and 
center  0,  briefly  called  the  unit  circle  (and  used  quite  frequently!).  Figures  327-329  show 
V^T  = 1,  — g ± |\/3 i,  v^T  = ±1,  ±i,  andv^T. 
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If  co  denotes  the  value  corresponding  to  k = 1 in  (16),  then  the  n values  of  's/  I can  be 
written  as 

l,  co,  co2,  --,^-1. 

More  generally,  if  w i is  any  77th  root  of  an  arbitrary  complex  number  z (¥=  0),  then  the  n 
values  of  Vz  in  (15)  are 

(17)  vv’i,  Wico,  w ico  , • • • , vv'io; 

because  multiplying  w 1 by  cok  corresponds  to  increasing  the  argument  of  vvi  by  2kTr/n. 
Formula  (17)  motivates  the  introduction  of  roots  of  unity  and  shows  their  usefulness. 


PROBLEM  SET  1K2 


1-8 


POLAR  FORM 


Represent  in  polar  form  and  graph  in  the  complex  plane  as 
in  Fig.  325.  Do  these  problems  very  carefully  because  polar 
forms  will  be  needed  frequently.  Show  the  details. 


1.  1 + 7 

3.  2t,  -21 

V2  + t/3 
-Vs  - 2i/3 


2.  -4  + 41 
4.  -5 

V3  - 10/ 

6.  — 

-2V3  + 5 / 


7.  1 + 277/ 


8. 


-4  + 19/ 
2 + 5/ 


9-14 


PRINCIPAL  ARGUMENT 


Determine  the  principal  value  of  the  argument  and  graph  it 
as  in  Fig.  325. 


9.  -1  + 7 
11.  3 ± 4 / 
13.  (1  + if0 


10.  -5,  -5  - /,  -5  + / 

12.  —77  — 777 

14.  -1  + 0.1/,  -1  - 0.1/ 


15-18 


CONVERSION  TO  x + iy 

Graph  in  the  complex  plane  and  represent  in  the  form  x + iy: 
15.  3 (cos  \tt  — i sin  7)  16.  6 (cos  577  + i sin  g77) 

17.  a/8  (cos  577  + i sin  577) 

18.  \/50  (cos  §77  + / sin  577) 


ROOTS 

19.  CAS  PROJECT.  Roots  of  Unity  and  Their  Graphs. 

Write  a program  for  calculating  these  roots  and  for 
graphing  them  as  points  on  the  unit  circle.  Apply  the 
program  to  zn  = 1 with  n — 2,  3,  ■ ■ ■ , 10.  Then  extend 
the  program  to  one  for  arbitrary  roots,  using  an  idea 
near  the  end  of  the  text,  and  apply  the  program  to 
examples  of  your  choice. 


20.  TEAM  PROJECT.  Square  Root,  (a)  Show  that 
w = Vz  has  the  values 


(18) 


w 1 = Vr 

vv2  = Vr 
= —w  1. 


e . e 

cos  — F 7 sin  — 
2 2 


0 


+ 77+/  sin 


+ 77 


(b)  Obtain  from  (18)  the  often  more  practical  formula 

(19)  Vz  = ±[Vg(k|  +x)  + (sign y)iVg(|z|  + x)] 

where  sign  y = 1 if  y S 0,  sign  y = — 1 if  y < 0,  and 
all  square  roots  of  positive  numbers  are  taken  with 
positive  sign.  Hint:  Use  (10)  in  App.  A3.1  with*  = 0/2. 

(c)  Find  the  square  roots  of  —14/,  —9  — 40/,  and 
1 + V48Z  by  both  (18)  and  (19)  and  comment  on  the 
work  involved. 


(d)  Do  some  further  examples  of  your  own  and  apply 
a method  of  checking  your  results. 


21-27 


ROOTS 


Find  and  graph  all  roots  in  the  complex  plane. 

21.  \Kl  + / 22.  ^3  + 4/ 

23.  ^ 216  24.  ^4 


25. 


28-31 


26. 

EQUATIONS 


27.  V- 


Solve  and  graph  the  solutions.  Show  details. 


28.  z2  ~ ( 6 - 2 i)z  + 17  - 6/  = 0 

29.  z2  + z + 1 - / = 0 


30.  z4  + 324  = 0.  Using  the  solutions,  factor  z4  + 324 
into  quadratic  factors  with  real  coefficients. 

31.  z4  - 6/z2  + 16  = 0 
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32-35  INEQUALITIES  AND  EQUALITY 

32.  Triangle  inequality.  Verify  (6)  for  zi  = 3 + i, 

Z2  = “2  + 4i 

33.  Triangle  inequality.  Prove  (6). 


34.  Re  and  Im.  Prove  | Re  ~|  S \z\,  | Im  z £ |z|. 

35.  Parallelogram  equality.  Prove  and  explain  the  name 

Ui  + z2|2+  k1-Z2|2  = 2(k1|2+  |z2|2). 


13.3  Derivative.  Analytic  Function 

Just  as  the  study  of  calculus  or  real  analysis  required  concepts  such  as  domain, 
neighborhood,  function,  limit,  continuity,  derivative,  etc.,  so  does  the  study  of  complex 
analysis.  Since  the  functions  live  in  the  complex  plane,  the  concepts  are  slightly  more 
difficult  or  different  from  those  in  real  analysis.  This  section  can  be  seen  as  a reference 
section  where  many  of  the  concepts  needed  for  the  rest  of  Part  D are  introduced. 

Circles  and  Disks.  Half-Planes 

The  unit  circle  |z|  = 1 (Fig.  330)  has  already  occurred  in  Sec.  13.2.  Figure  331  shows  a 
general  circle  of  radius  p and  center  a.  Its  equation  is 

\z  ~ a\  = p 


y 


Fig.  330.  Unit  circle 


y 


y 


X 


Fig.  331.  Circle  in  the 
complex  plane 


Fig.  332.  Annulus  in  the 
complex  plane 


because  it  is  the  set  of  all  z whose  distance  z ~ « from  the  center  a equals  p.  Accordingly, 
its  interior  (“open  circular  disk”)  is  given  by  — «|  < p,  its  interior  plus  the  circle 
itself  (“closed  circular  disk”)  by  |z  — a|  = p,  and  its  exterior  by  z ~ u\  > p.  As  an 
example,  sketch  this  for  a = 1 + i and  p = 2,  to  make  sure  that  you  understand  these 
inequalities. 

An  open  circular  disk  k — a I < p is  also  called  a neighborhood  of  a or,  more  precisely, 
a p-neighborhood  of  a.  And  a has  infinitely  many  of  them,  one  for  each  value  of  p (>  0), 
and  a is  a point  of  each  of  them,  by  definition! 

In  modern  literature  any  set  containing  a p-neighborhood  of  a is  also  called  a neigh- 
borhood of  a. 

Figure  332  shows  an  open  annulus  (circular  ring)  pi  < z — « < p2,  which  we  shall 
need  later.  This  is  the  set  of  all  z whose  distance  \z  — a\  from  a is  greater  than  pi  but 
less  than  p2.  Similarly,  the  closed  annulus  p\  = z — a\  = p2  includes  the  two  circles. 


Half-Planes.  By  the  (open)  upper  half-plane  we  mean  the  set  of  all  points  z = x + iy 
such  that  y > 0.  Similarly,  the  condition  y < 0 defines  the  lower  half-plane,  x > 0 the 
right  half-plane,  and  x < 0 the  left  half-plane. 
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For  Reference:  Concepts  on  Sets 
in  the  Complex  Plane 

To  our  discussion  of  special  sets  let  us  add  some  general  concepts  related  to  sets  that  we 
shall  need  throughout  Chaps.  13-18;  keep  in  mind  that  you  can  find  them  here. 

By  a point  set  in  the  complex  plane  we  mean  any  sort  of  collection  of  finitely  many 
or  infinitely  many  points.  Examples  are  the  solutions  of  a quadratic  equation,  the 
points  of  a line,  the  points  in  the  interior  of  a circle  as  well  as  the  sets  discussed  just 
before. 

A set  S is  called  open  if  every  point  of  S has  a neighborhood  consisting  entirely  of 
points  that  belong  to  S.  For  example,  the  points  in  the  interior  of  a circle  or  a square  form 
an  open  set,  and  so  do  the  points  of  the  right  half-plane  Re  z = x > 0. 

A set  S is  called  connected  if  any  two  of  its  points  can  be  joined  by  a chain  of  finitely 
many  straight-line  segments  all  of  whose  points  belong  to  S.  An  open  and  connected  set 
is  called  a domain.  Thus  an  open  disk  and  an  open  annulus  are  domains.  An  open  square 
with  a diagonal  removed  is  not  a domain  since  this  set  is  not  connected.  (Why?) 

The  complement  of  a set  S in  the  complex  plane  is  the  set  of  all  points  of  the  complex 
plane  that  do  not  belong  to  S.  A set  S is  called  closed  if  its  complement  is  open.  For  example, 
the  points  on  and  inside  the  unit  circle  form  a closed  set  (“closed  unit  disk”)  since  its 
complement  lz|  > 1 is  open. 

A boundary  point  of  a set  S is  a point  every  neighborhood  of  which  contains  both  points 
that  belong  to  S and  points  that  do  not  belong  to  S.  For  example,  the  boundary  points  of 
an  annulus  are  the  points  on  the  two  bounding  circles.  Clearly,  if  a set  S is  open,  then  no 
boundary  point  belongs  to  S',  if  S is  closed,  then  every  boundary  point  belongs  to  S.  The 
set  of  all  boundary  points  of  a set  S is  called  the  boundary  of  S. 

A region  is  a set  consisting  of  a domain  plus,  perhaps,  some  or  all  of  its  boundary  points. 
WARNING!  “Domain”  is  the  modern  term  for  an  open  connected  set.  Nevertheless,  some 
authors  still  call  a domain  a “region”  and  others  make  no  distinction  between  the  two  terms. 


Complex  Function 

Complex  analysis  is  concerned  with  complex  functions  that  are  differentiable  in  some 
domain.  Hence  we  should  first  say  what  we  mean  by  a complex  function  and  then  define 
the  concepts  of  limit  and  derivative  in  complex.  This  discussion  will  be  similar  to  that  in 
calculus.  Nevertheless  it  needs  great  attention  because  it  will  show  interesting  basic 
differences  between  real  and  complex  calculus. 

Recall  from  calculus  that  a real  function /defined  on  a set  S of  real  numbers  (usually  an 
interval)  is  a rule  that  assigns  to  every  x in  S a real  number /(x),  called  the  value  of/ at  x. 
Now  in  complex,  S is  a set  of  complex  numbers.  And  a function  / defined  on  S is  a rule 
that  assigns  to  every  z in  S a complex  number  w,  called  the  value  of  / at  z.  We  write 

w = f(z). 

Here  z varies  in  S and  is  called  a complex  variable.  The  set  S is  called  the  domain  of 
definition  of/ or,  briefly,  the  domain  off.  (In  most  cases  S will  be  open  and  connected, 
thus  a domain  as  defined  just  before.) 

Example:  w = f(z)  = z2  + 3z  is  a complex  function  defined  for  all  z;  that  is,  its  domain 
S is  the  whole  complex  plane. 

The  set  of  all  values  of  a function  / is  called  the  range  off 
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EXAMPLE  1 


EXAMPLE  2 


w is  complex,  and  we  write  w = u + iv,  where  u and  v are  the  real  and  imaginary 
parts,  respectively.  Now  w depends  on  z = x + iy.  Hence  u becomes  a real  function  of  x 
and  y,  and  so  does  u.  We  may  thus  write 

w = f(z)  = u(x,  y ) + iv(x,  y ). 

This  shows  that  a complex  function  f(z)  is  equivalent  to  a pair  of  real  functions  u{x , y) 
and  v(x,  y),  each  depending  on  the  two  real  variables  x and  y. 

Function  of  a Complex  Variable 

Let  w = f(z)  = z2  + 3 z-  Find  u and  v and  calculate  the  value  of/ at  z — 1 + 3 i. 

Solution,  u = R tf(z)  = x2  — y2  + 3x  and  v = 2 xy  + 3 y.  Also, 

/(I  + 3/)  = (1  + 3/)2  + 3(1  + 3/)  = 1 - 9 + 6/  + 3 + 9 / = -5  + 15/. 

This  shows  that  1,3)  = —5  and  u(l,  3)  = 15.  Check  this  by  using  the  expressions  for  u and  v. 

Function  of  a Complex  Variable 

Let  w — f(z ) — 2 iz  + 6z.  Find  u and  v and  the  value  of/ at  z — | + 4 /. 

Solution.  f{z)  — 2/(jc  + iy)  + 6{x  — iy)  gives  u{x,  y)  = 6x  — 2 y and  v(x,  y)  = 2x  — 6 y.  Also, 

/(i  + 4 /)  = 2/(|  + 40  + 6(|  - 40  = i - 8 + 3 - 24/  = -5  - 23/. 

Check  this  as  in  Example  1 . 

Remarks  on  Notation  and  Terminology 

1.  Strictly  speaking,  f(z)  denotes  the  value  of  / at  z,  but  it  is  a convenient  abuse  of 
language  to  talk  about  the  function  f(z ) (instead  of  the  function  /),  thereby  exhibiting  the 
notation  for  the  independent  variable. 

2.  We  assume  all  functions  to  be  single-valued  relations,  as  usual:  to  each  z in  S there 
corresponds  but  one  value  w = f(z)  (but,  of  course,  several  z may  give  the  same  value 
w = f(z),  just  as  in  calculus).  Accordingly,  we  shall  not  use  the  term  “multivalued 
function”  (used  in  some  books  on  complex  analysis)  for  a multivalued  relation,  in  which 
to  a z there  corresponds  more  than  one  w. 

Limit,  Continuity 

A function  f(z)  is  said  to  have  the  limit  l as  z approaches  a point  z(|,  written 

(1)  lim  f(z)  = l, 

z->z0 

if  / is  defined  in  a neighborhood  of  zo  (except  perhaps  at  Zo  itself)  and  if  the  values  of 
/ are  “close”  to  / for  all  z “close”  to  z.q',  in  precise  terms,  if  for  every  positive  real  e we  can 
find  a positive  real  5 such  that  for  all  z A z o in  the  disk  |z  — zol  <8  (Fig.  333)  we  have 

(2)  l/(z)  - Z|  < e; 

geometrically,  if  for  every  z A zo  in  that  5-disk  the  value  of/ lies  in  the  disk  (2). 

Formally,  this  definition  is  similar  to  that  in  calculus,  but  there  is  a big  difference. 
Whereas  in  the  real  case,  x can  approach  an  x0  only  along  the  real  line,  here,  by  definition, 


622 


CHAP.  13  Complex  Numbers  and  Functions.  Complex  Differentiation 


EXAMPLE  3 


z may  approach  Zofrom  any  direction  in  the  complex  plane.  This  will  be  quite  essential 
in  what  follows. 

If  a limit  exists,  it  is  unique.  (See  Team  Project  24.) 

A function  /(z)  is  said  to  be  continuous  at  z = z0  if /(zo)  is  defined  and 
(3)  Hm  /(z)  = f(zo)- 

Z~>Z0 

Note  that  by  definition  of  a limit  this  implies  that  f(z)  is  defined  in  some  neighborhood 
of  z0. 

f(z)  is  said  to  be  continuous  in  a domain  if  it  is  continuous  at  each  point  of  this  domain. 


Fig.  333.  Limit 


Derivative 

The  derivative  of  a complex  function /at  a point  zo  is  written  / (zo)  and  is  defined  by 


(4) 


Azo)  = Hm 


/(zo  + Az)  - /(zo) 
Az 


provided  this  limit  exists.  Then/is  said  to  be  differentiable  at  z0.  If  we  write  A z = z — Zo, 
we  have  z = z0  + Az  and  (4)  takes  the  form 


(4') 


/'(z0)  = Hm 

Z^Zo 


m - /(zo) 

z - ZO 


Now  comes  an  important  point.  Remember  that,  by  the  definition  of  limit, /(z)  is  defined 
in  a neighborhood  of  zo  and  z in  (4  ) may  approach  zo  from  any  direction  in  the  complex 
plane.  Hence  differentiability  at  z0  means  that,  along  whatever  path  z approaches  z0,  the 
quotient  in  {4')  always  approaches  a certain  value  and  all  these  values  are  equal.  This  is 
important  and  should  be  kept  in  mind. 

Differentiability.  Derivative 

The  function  f(z)  = ~2  is  differentiable  for  all  z and  has  the  derivative  f\z)  = 2z  because 


Az)  = lint 


(z  + Az)2  — z 
Az 


= lim 
Az— >0 


z2  + 2z  Az  + (Az)2  - z2 


Az 


= lim  (2z  + Az)  = 2z. 
Az— »0 
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EXAMPLE  4 


DEFINITION 


The  differentiation  rules  are  the  same  as  in  real  calculus,  since  their  proofs  are  literally 
the  same.  Thus  for  any  differentiable  functions  / and  g and  constant  c we  have 

(cfY  = cf\  (/+  g)'  =f'  + g,  ( fg )'  =f'g  + fg  , (0  =fg__Js_ 

as  well  as  the  chain  rule  and  the  power  rule  ( zn )'  = nzn~ 1 (n  integer). 

Also,  if/(z)  is  differentiable  at  Zo,  it  is  continuous  at  zo-  (See  Team  Project  24.) 

z not  Differentiable 

It  may  come  as  a surprise  that  there  are  many  complex  functions  that  do  not  have  a derivative  at  any  point.  For 

instance,  f{z)  = z = x — iy  is  such  a function.  To  see  this,  we  write  Az  = Ax  + i A y and  obtain 

f(z  + Az)  - /(z)  _ (z  + A z)  ~ z A z Ax  - i Ay 

A z A z A z Ax  + iAy' 

If  Ay  = 0,  this  is  +1.  If  Ax  = 0,  this  is  —1.  Thus  (5)  approaches  +1  along  path  I in  Fig.  334  but  —1  along 
path  II.  Hence,  by  definition,  the  limit  of  (5)  as  Az  — * 0 does  not  exist  at  any  z. 


i 


Fig.  334.  Paths  in  (5) 

Surprising  as  Example  4 may  be,  it  merely  illustrates  that  differentiability  of  a complex 
function  is  a rather  severe  requirement. 

The  idea  of  proof  (approach  of  z from  different  directions)  is  basic  and  will  be  used 
again  as  the  crucial  argument  in  the  next  section. 

Analytic  Functions 

Complex  analysis  is  concerned  with  the  theory  and  application  of  “analytic  functions,” 
that  is,  functions  that  are  differentiable  in  some  domain,  so  that  we  can  do  “calculus  in 
complex.”  The  definition  is  as  follows. 


Analyticity 

A function /(z)  is  said  to  be  analytic  in  a domain  D if /(z)  is  defined  and  differentiable 
at  all  points  of  D.  The  function /(z)  is  said  to  be  analytic  at  a point  z = Zo  in  D if 
f(z)  is  analytic  in  a neighborhood  of  z0- 

Also,  by  an  analytic  function  we  mean  a function  that  is  analytic  in  some  domain. 


Hence  analyticity  of/(z)  at  zo  means  that/(z)  has  a derivative  at  every  point  in  some 
neighborhood  of  zo  (including  zo  itself  since,  by  definition,  zo  is  a point  of  all  its 
neighborhoods).  This  concept  is  motivated  by  the  fact  that  it  is  of  no  practical  interest 
if  a function  is  differentiable  merely  at  a single  point  zo  but  not  throughout  some 
neighborhood  of  zo-  Team  Project  24  gives  an  example. 

A more  modern  term  for  analytic  in  D is  holomorphic  in  D. 
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EXAMPL  Polynomials,  Rational  Functions 

The  nonnegative  integer  powers  1,  z,  ZZ,  • • • are  analytic  in  the  entire  complex  plane,  and  so  are  polynomials, 
that  is,  functions  of  the  form 


/(z)  = c0  + cjz  + c2z2  + ■ ■ • + cnzn 


where  Co,  • ■ ■ , cn  are  complex  constants. 

The  quotient  of  two  polynomials  g(z)  and  h(z). 


f(z ) = 


g(z) 
h(z )' 


is  called  a rational  function.  This/is  analytic  except  at  the  points  where  h(z)  = 0;  here  we  assume  that  common 
factors  of  g and  h have  been  canceled. 

Many  further  analytic  functions  will  be  considered  in  the  next  sections  and  chapters. 


The  concepts  discussed  in  this  section  extend  familiar  concepts  of  calculus.  Most 
important  is  the  concept  of  an  analytic  function,  the  exclusive  concern  of  complex 
analysis.  Although  many  simple  functions  are  not  analytic,  the  large  variety  of  remaining 
functions  will  yield  a most  beautiful  branch  of  mathematics  that  is  very  useful  in 
engineering  and  physics. 


PROBLEMS ET  TV3 


1-8 


REGIONS  OF  PRACTICAL  INTEREST 


Determine  and  sketch  or  graph  the  sets  in  the  complex  plane 
given  by 


1.  U + 1 - 5*|  S | 

2.  0 < kl  < 1 

3.  7r  < k — 4 + 2i\  < 37 r 

4.  —77  < Im  z < 77 

5.  |arg  z\  < 377 

6.  Re  (1  /z)  < 1 


7.  Re  z £ -1 


8.  k + i|  a k — /| 

9.  WRITING  PROJECT.  Sets  in  the  Complex  Plane. 

Write  a report  by  formulating  the  corresponding 
portions  of  the  text  in  your  own  words  and  illustrating 
them  with  examples  of  your  own. 


COMPLEX  FUNCTIONS  AND  THEIR  DERIVATIVES 


10-12 


Function  Values.  Find  Re/,  and  Im/and  their 


values  at  the  given  point  7. 


10.  /(z)  = 5z2  - 12z  + 3 + 2 i at  4 - 3/ 

11.  /(z)  = 1/(1  - z)  at  1 - « 

12.  /(z)  = (z  - 2 )/(z  + 2)  at  81 

13.  CAS  PROJECT.  Graphing  Functions.  Find  and  graph 
Re/,  Im/,  and  |/|  as  surfaces  over  the  z-plane.  Also 
graph  the  two  families  of  curves  Re/(z)  = const  and 


Im/(z)  = const  in  the  same  figure,  and  the  curves 
|/(z)|  = const  in  another  figure,  where  (a)/(z)  = z2, 
(b)/(z)  = l/z,  (c)/(z)  = z4 


14-17  Continuity.  Find  out,  and  give  reason,  whether 
/(z)  is  continuous  at  z = 0 if  /( 0)  = 0 and  for  z f 0 the 
function /is  equal  to: 


14.  (Re  z2)/ kl  15.  |z|2Im(l/z) 

16.  (Imz2)/|z|2  17.  (Re  z)/(l  - kl) 


18-23 

of 


Differentiation.  Find  the  value  of  the  derivative 


18.  (z  — i')/(z  + i)  at*  19.  (z  — 4i)8  at  = 3 + 4/ 

20.  (1.5z  + 2/)/(3/z  — 4)  at  any  7.  Explain  the  result. 

21.  /(I  - zf  at  0 

22.  (iz3  + 3z2)3  at  2/  23.  z3/(z  + if  at  i 

24.  TEAM  PROJECT.  Limit,  Continuity,  Derivative 
(a)  Limit.  Prove  that  (1)  is  equivalent  to  the  pair  of 
relations 


lim  Re/(z)  = Re  I,  lim  Im/(z)  = Im  I. 

z *2-0  Z^Zo 

(b)  Limit.  If  lim  f(x)  exists,  show  that  this  limit  is 

z^z0 

unique. 

(c)  Continuity.  If  zi,  72,  ■ ■ ■ are  complex  numbers  for 

which  lim  Zn  = a,  and  if  f(z)  is  continuous  at  z — a, 
n—>  oc 

show  that  lim  /(zn)  = f(a ). 

n— >00 ' 
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(d)  Continuity.  If/(z)  is  differentiable  at  z0.  show  that 
f(z)  is  continuous  at  z0. 

(e)  Differentiability.  Show  that  /Tz)  = Re  z — x is  not 
differentiable  at  any  z.  Can  you  find  other  such  functions? 

(f)  Differentiability.  Show  that  f{z)  = z|2  is  dif- 
ferentiable only  at  z — 0;  hence  it  is  nowhere  analytic. 


25.  WRITING  PROJECT.  Comparison  with  Calculus. 

Summarize  the  second  part  of  this  section  beginning  with 
Complex  Function , and  indicate  what  is  conceptually 
analogous  to  calculus  and  what  is  not. 


13 A Cauchy-Riemann  Equations. 

Laplaces  Equation 

As  we  saw  in  the  last  section,  to  do  complex  analysis  (i.e.,  “calculus  in  the  complex”)  on 
any  complex  function,  we  require  that  function  to  be  analytic  on  some  domain  that  is 
differentiable  in  that  domain. 

The  Cauchy-Riemann  equations  are  the  most  important  equations  in  this  chapter 
and  one  of  the  pillars  on  which  complex  analysis  rests.  They  provide  a criterion  (a  test) 
for  the  analyticity  of  a complex  function 

W = f(z)  = u(x,  y)  + iv(x,  y). 

Roughly,  / is  analytic  in  a domain  I)  if  and  only  if  the  first  partial  derivatives  of  u and  v 
satisfy  the  two  Cauchy-Riemann  equations4 

(1)  UX  — tty,  Uy  — ~VX 

everywhere  in  D\  here  ux  = du/dx  and  uy  = du/dy  (and  similarly  for  v)  are  the  usual 
notations  for  partial  derivatives.  The  precise  formulation  of  this  statement  is  given  in 
Theorems  1 and  2. 

Example:  fifi)  = zZ  = x2  — y2  + 2 fry  is  analytic  for  all  z (see  Example  3 in  Sec.  13.3), 
and  u = x2  — y2  and  v = 2xy  satisfy  (1),  namely,  ux  = 2x  = vy  as  well  as  uy  = 
— 2 y = — vx.  More  examples  will  follow. 


THEOREM  I 


Cauchy-Riemann  Equations 

Let  f(z)  = u{x,y)  + iv(x,y)  be  defined  and  continuous  in  some  neighborhood  of  a 
point  z — x + iy  and  differentiable  at  z itself  Then,  at  that  point,  the  first-order 
partial  derivatives  of  u and  v exist  and  satisfy  the  Cauchy-Riemann  equations  (1). 

Hence,  iff(z)  is  analytic  in  a domain  D,  those  partial  derivatives  exist  and  satisfy 
(1)  at  all  points  of  D. 


4The  French  mathematician  AUGUSTIN-LOUIS  CAUCHY  (see  Sec.  2.5)  and  the  German  mathematicians 
BERNHARD  RIEMANN  (1826-1866)  and  KARL  WEIERSTRASS  (1815-1897;  see  also  Sec.  15.5)  are  the 
founders  of  complex  analysis.  Riemann  received  his  Ph.D.  (in  1851)  under  Gauss  (Sec.  5.4)  at  Gottingen,  where 
he  also  taught  until  he  died,  when  he  was  only  39  years  old.  He  introduced  the  concept  of  the  integral  as  it  is 
used  in  basic  calculus  courses,  and  made  important  contributions  to  differential  equations,  number  theory,  and 
mathematical  physics.  He  also  developed  the  so-called  Riemannian  geometry,  which  is  the  mathematical 
foundation  of  Einstein’s  theory  of  relativity;  see  Ref.  [GenRef9]  in  App.  1. 
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PROOF  By  assumption,  the  derivative  f'iz)  at  z exists.  It  is  given  by 


(2) 


Az) 


lim 

Az— >0 


f(z  + Az)  - /(z) 

A z 


The  idea  of  the  proof  is  very  simple.  By  the  definition  of  a limit  in  complex  (Sec.  13.3), 
we  can  let  Az  approach  zero  along  any  path  in  a neighborhood  of  z.  Thus  we  may  choose 
the  two  paths  I and  II  in  Fig.  335  and  equate  the  results.  By  comparing  the  real  parts  we 
shall  obtain  the  first  Cauchy-Riemann  equation  and  by  comparing  the  imaginary  parts  the 
second.  The  technical  details  are  as  follows. 

We  write  Az  = Ax  + i Ay.  Then  z + Az  = x + Ax  + i(y  + Ay),  and  in  terms  of  u 
and  v the  derivative  in  (2)  becomes 


(3)  f'iz) 


[u(x  + Ax,  y + Ay)  + iv(x  + Ax,  y + Ay)]  — [m(x,  y)  + iv(x,  y)] 

lim 1 1 — 

Az^o  Ax  + i Ay 


We  first  choose  path  1 in  Fig.  335.  Thus  we  let  Ay  — >0  first  and  then  Ax— >0.  After  Ay 
is  zero,  Az  = Ax.  Then  (3)  becomes,  if  we  first  write  the  two  M-terms  and  then  the  two 
u-terms, 


f'iz) 


ufx  + Ax,  y)  — w(x,  v)  u(x  + Ax,  y)  — vfx,  y) 

lim  1-  i lim  

Aa;— >0  Ax  Ax^°  Ax 


I 

Fig.  335  Paths  in  (2) 

Since  / (z)  exists,  the  two  real  limits  on  the  right  exist.  By  definition,  they  are  the  partial 
derivatives  of  u and  v with  respect  to  x.  Hence  the  derivative  / (z)  of  /(z)  can  be  written 

(4)  f'iz)  = «i  + ivx. 


Z 


Similarly,  if  we  choose  path  II  in  Fig.  335,  we  let  Ax  — > 0 first  and  then  Ay  — * 0.  After 
Ax  is  zero,  Az  = i Ay,  so  that  from  (3)  we  now  obtain 


f'iz) 


uix,  y + Ay)  — ufx,  y)  u(x,  y + Ay)  — u(x,  y) 

lim  F i lim  : — 

Ay— *0  i Ay  /A  y 


Since  f'iz)  exists,  the  limits  on  the  right  exist  and  give  the  partial  derivatives  of  u and  v 
with  respect  to  y;  noting  that  l/i  = —i,  we  thus  obtain 


(5)  f'iz)  = ~iUy  + Vy. 

The  existence  of  the  derivative/Vz)  thus  implies  the  existence  of  the  four  partial  derivatives 
in  (4)  and  (5).  By  equating  the  real  parts  ux  and  vy  in  (4)  and  (5)  we  obtain  the  first 
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EXAMPLE  1 


THEOREM  2 


EXAMPLE  2 


EXAMPLE  3 


Cauchy-Riemann  equation  (1).  Equating  the  imaginary  parts  gives  the  other.  This  proves 
the  first  statement  of  the  theorem  and  implies  the  second  because  of  the  definition  of 
analyticity. 

Formulas  (4)  and  (5)  are  also  quite  practical  for  calculating  derivatives  f'(z),  as  we  shall  see. 

Cauchy-Riemann  Equations 

f(z)  = z2  is  analytic  for  all  z.  It  follows  that  the  Cauchy-Riemann  equations  must  be  satisfied  (as  we  have 
verified  above). 

For/(z)  = z = x — iy  we  have  u — x,  V = — y and  see  that  the  second  Cauchy-Riemann  equation  is  satisfied, 
uy  = — vx  = 0,  but  the  first  is  not:  ux  = \ vy  = — \.  We  conclude  that/(z)  = z is  not  analytic,  confirming 
Example  4 of  Sec.  13.3.  Note  the  savings  in  calculation! 


The  Cauchy-Riemann  equations  are  fundamental  because  they  are  not  only  necessary  but 
also  sufficient  for  a function  to  be  analytic.  More  precisely,  the  following  theorem  holds. 


Cauchy-Riemann  Equations 

If  two  real-valued  continuous  functions  u(x,  y)  and  v(x,  y)  of  two  real  variables  x 
and  y have  continuous  first  partial  derivatives  that  satisfy  the  Cauchy-Riemann 
equations  in  some  domain  D,  then  the  complex  function  f(z)  = u(x,y ) + iv(x,y ) is 
analytic  in  D. 


The  proof  is  more  involved  than  that  of  Theorem  1 and  we  leave  it  optional  (see  App.  4). 

Theorems  1 and  2 are  of  great  practical  importance,  since,  by  using  the  Cauchy-Riemann 
equations,  we  can  now  easily  find  out  whether  or  not  a given  complex  function  is  analytic. 

Cauchy-Riemann  Equations.  Exponential  Function 

I s/(z)  = u{x,y)  + iv(x,y ) = CVcos  y + i siny)  analytic? 

Solution.  We  have  u = excos  y,v  = ex  sin  y and  by  differentiation 

ux  = ex  cos  y,  vy  = ex  cos  y 

uy  = — e^siny,  vx  = ex  sin  y. 

We  see  that  the  Cauchy-Riemann  equations  are  satisfied  and  conclude  that/(z)  is  analytic  for  all  z.  (/(z)  will 
be  the  complex  analog  of  ex  known  from  calculus.) 

An  Analytic  Function  of  Constant  Absolute  Value  Is  Constant 

The  Cauchy-Riemann  equations  also  help  in  deriving  general  properties  of  analytic  functions. 

For  instance,  show  that  if/(z)  is  analytic  in  a domain  D and  |/(z)|  = k = const  in  D.  then/(z)  = const  in 
D.  (We  shall  make  crucial  use  of  this  in  Sec.  18.6  in  the  proof  of  Theorem  3.) 

Solution.  By  assumption,  |/|2  = \u  + iv\2  = u2  + v2  = k2.  By  differentiation, 

uux  + vvx  = 0, 

UUy  + VVy  = 0. 

Now  use  vx=  — Uy  in  the  first  equation  and  vy  = ux  in  the  second,  to  get 


(6) 


(a)  uux  — vuy  = 0, 

(b)  UUy  — vux  = 0. 
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THEOREM  3 


PROOF 


To  get  rid  of  uy,  multiply  (6a)  by  u and  (6b)  by  v and  add.  Similarly,  to  eliminate  ux.  multiply  (6a)  by  — v and 
(6b)  by  u and  add.  This  yields 


(«2  + v2)ux  = 0, 

(if2  + V2)Uy  = 0. 

If  k2  = u2  + v2  = 0,  then  u = v = 0;  hence  / = 0.  If  k2  = u2  + v2  A 0,  then  ux  = uy  = 0.  Hence,  by  the 
Cauchy-Riemann  equations,  also  ux  = uy  = 0.  Together  this  implies  u = const  and  v = const;  hence 
/ = const. 


We  mention  that,  if  we  use  the  polar  form  z = r( cos  0 + i sin  9)  and  set  f(z)  = u(r,  6)  + 
iv(r,  6 ),  then  the  Cauchy-Riemann  equations  are  (Prob.  1) 

7 Vg,  (r  > 0). 

1 

--Ug 

Laplace’s  Equation.  Harmonic  Functions 

The  great  importance  of  complex  analysis  in  engineering  mathematics  results  mainly  from 
the  fact  that  both  the  real  part  and  the  imaginary  part  of  an  analytic  function  satisfy  Laplace’s 
equation,  the  most  important  PDE  of  physics.  It  occurs  in  gravitation,  electrostatics,  fluid 
flow,  heat  conduction,  and  other  applications  (see  Chaps.  12  and  18). 


(7) 


Vr  = 


Laplace’s  Equation 

If  f(z)  = u(x,  y ) + iv(x,  y)  is  analytic  in  a domain  D,  then  both  u and  v satisfy 

Laplace’s  equation 

(8)  V M tlXX  “F  tlyy  0 
(V2  read  “nabla  squared”)  and 

(9)  V2u  = vxx  + vyy  = 0, 

in  D and  have  continuous  second  partial  derivatives  in  D. 


Differentiating  ux  = vy  with  respect  to  x and  uy  = —vx  with  respect  to  y,  we  have 


(10) 


uxx  Uyx> 


Myy  VXy. 


Now  the  derivative  of  an  analytic  function  is  itself  analytic,  as  we  shall  prove  later  (in 
Sec.  14.4).  This  implies  that  u and  v have  continuous  partial  derivatives  of  all  orders;  in 
particular,  the  mixed  second  derivatives  are  equal:  vyx  = vxy.  By  adding  (10)  we  thus 
obtain  (8).  Similarly,  (9)  is  obtained  by  differentiating  ux  = vy  with  respect  to  y and 
uy  = — vx  with  respect  to  x and  subtracting,  using  uxy  = uyx. 


Solutions  of  Laplace’s  equation  having  continuous  second-order  partial  derivatives  are  called 
harmonic  functions  and  their  theory  is  called  potential  theory  (see  also  Sec.  12.1 1).  Hence 
the  real  and  imaginary  parts  of  an  analytic  function  are  harmonic  functions. 
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If  two  harmonic  functions  u and  v satisfy  the  Cauchy-Riemann  equations  in  a domain 
D,  they  are  the  real  and  imaginary  parts  of  an  analytic  function  / in  D.  Then  v is  said  to 
be  a harmonic  conjugate  function  of  u in  D.  (Of  course,  this  has  absolutely  nothing  to 
do  with  the  use  of  “conjugate”  for  z.) 

How  to  Find  a Harmonic  Conjugate  Function  by  the  Cauchy-Riemann  Equations 

Verify  that  u = xz  — y2  — y is  harmonic  in  the  whole  complex  plane  and  find  a harmonic  conjugate  function 
V of  u. 

Solution.  V2u  = 0 by  direct  calculation.  Now  ux  = lx  and  uy  = — 2y  — 1.  Hence  because  of  the  Cauchy- 
Riemann  equations  a conjugate  v of  u must  satisfy 

vy  = ux  = 2x,  vx  = — uy  = 2y  + 1. 

Integrating  the  first  equation  with  respect  to  y and  differentiating  the  result  with  respect  to  x,  we  obtain 

dh 

V = 2xy  + h(x),  vx  = 2y  H . 

dx 

A comparison  with  the  second  equation  shows  that  dh/dx  = 1.  This  gives  h(x)  = x + c.  Hence  v = 2xy  + x + c 
(r  any  real  constant)  is  the  most  general  harmonic  conjugate  of  the  given  u.  The  corresponding  analytic  function  is 

f(z ) = u + iv  = x2  — y2  — y + i{  2xy  + x + c)  = z2  + iz  + ic. 


Example  4 illustrates  that  a conjugate  of  a given  harmonic  function  is  uniquely  determined 
up  to  an  arbitrary  real  additive  constant. 

The  Cauchy-Riemann  equations  are  the  most  important  equations  in  this  chapter.  Their 
relation  to  Laplace’s  equation  opens  a wide  range  of  engineering  and  physical  applications, 
as  shown  in  Chap.  18. 


^RQBL-EM=S^FT— 


1.  Cauchy-Riemann  equations  in  polar  form.  Derive  (7) 
from  (1). 


2-11  CAUCHY-RIEMANN  EQUATIONS 

Are  the  following  functions  analytic?  Use  (1)  or  (7). 

2-  f(z)  = izz 

3.  f(z)  = e~2'1  (cos  2v  — i sin  2y) 

4.  f{z)  — ex  (cos  v — i sin  y) 

5.  f(z)  = Re  (z2)  - i Im  (z2) 

6-  f(z)  = l/(z  - z5)  7.  /(z)  = i/z8 

8.  /(z)  = Arg  27 rz 

9.  /(z)  = 3t r2/(z3  + 4t r2z) 

10.  /(z)  = In  |z|  + i Arg  z 

11.  /(z)  = cos  x cosh  y — i sin  x sinh  y 


12-19 


HARMONIC  FUNCTIONS 


Are  the  following  functions  harmonic?  If  your  answer 
is  yes,  find  a corresponding  analytic  function  /(z)  = 
u(x,  y)  + iv(x,  y). 

12.  u = x2  + y2  13.  u = xy 


14.  v = xy  15.  u = x/ (x2  + y2) 

16.  u = sin x coshy  17.  v = (2x  + l)v 

18.  u = x3  — 3xy2 

19.  v = ex  sin  2 y 

20.  Laplace’s  equation.  Give  the  details  of  the  derivative 
of  (9). 


21-24 


Determine  a and  b so  that  the  given  function  is 


harmonic  and  find  a harmonic  conjugate. 
' cos  av 


22.  u = cos  ax  cosh  2y 

23.  u = ax3  + bxy 

24.  u = cosh  ax  cos  y 

25.  CAS  PROJECT.  Equipotential  Lines.  Write  a 
program  for  graphing  equipotential  lines  u = const  of 
a harmonic  function  u and  of  its  conjugate  v on  the 
same  axes.  Apply  the  program  to  (a)  u = x2  — y2, 
v — 2xy,  (b)  u — x3  3xy2,  v = 3x2y  — y3. 

26.  Apply  the  program  in  Prob.  25  to  u = ex  cos  y, 
v = ex  sin  y and  to  an  example  of  your  own. 
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27.  Harmonic  conjugate.  Show  that  if  u is  harmonic  and 
v is  a harmonic  conjugate  of  u,  then  u is  a harmonic 
conjugate  of  — v. 

28.  Illustrate  Prob.  27  by  an  example. 

29.  Two  further  formulas  for  the  derivative.  Formulas  (4), 
(5),  and  (11)  (below)  are  needed  from  time  to  time.  Derive 

(11)  f (z)  Ux  illy,  f (z)  Vy  ~F  ivx. 


30.  TEAM  PROJECT.  Conditions  for/(z)  = const.  Let 

/(z)  be  analytic.  Prove  that  each  of  the  following 
conditions  is  sufficient  for/(z)  = const. 

(a)  Re/(z)  = const 

(b)  I m f(z)  = const 

(c)  /'(z)  = 0 

(d)  |/(z)|  = const  (see  Example  3) 


13. 1 Exponential  Function 

In  the  remaining  sections  of  this  chapter  we  discuss  the  basic  elementary  complex 
functions,  the  exponential  function,  trigonometric  functions,  logarithm,  and  so  on.  They 
will  be  counterparts  to  the  familiar  functions  of  calculus,  to  which  they  reduce  when  z = x 
is  real.  They  are  indispensable  throughout  applications,  and  some  of  them  have  interesting 
properties  not  shared  by  their  real  counterparts. 

We  begin  with  one  of  the  most  important  analytic  functions,  the  complex  exponential 
function 


ez,  also  written  exp  z. 

The  definition  of  ez  in  terms  of  the  real  functions  ex,  cos  y,  and  sin  y is 

(1)  ez  = ex(cos  y + i sin  v). 

This  definition  is  motivated  by  the  fact  the  ez  extends  the  real  exponential  function  ex  of 
calculus  in  a natural  fashion.  Namely: 

(A)  ez  = ex  for  real  z = x because  cos  y = 1 and  sin  y = 0 when  y = 0. 

(B)  ez  is  analytic  for  all  z.  (Proved  in  Example  2 of  Sec.  13.4.) 

(C)  The  derivative  of  ez  is  ez,  that  is, 

(2)  (ez)'  = ez. 

This  follows  from  (4)  in  Sec.  13.4, 

(ez)'  = (excos  y)x  + i(ex  sin  y)x  = excosy  + iex  siny  = ez. 

REMARK.  This  definition  provides  for  a relatively  simple  discussion.  We  could  define  ez 
by  the  familiar  series  1 + x + x2/2!  + x3/3!  + • • • with  x replaced  by  z,  but  we  would 
then  have  to  discuss  complex  series  at  this  very  early  stage.  (We  will  show  the  connection 
in  Sec.  15.4.) 

Further  Properties.  A function /(z)  that  is  analytic  for  all  z is  called  an  entire  function. 
Thus,  ez  is  entire.  lust  as  in  calculus  the  functional  relation 

Zi  +Zo  Z 1 

e z = ee 


(3) 
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holds  for  any  zi  = X\  + iy1  and  z2  = x 2 + (>’2-  Indeed,  by  (1), 

eZleZ2  = eXl(cos  y\  + i sin  yjJ  eX2(cos  y2  + i sin  y2). 

Since  eXleX2  = ex'  +X2  for  these  real  functions,  by  an  application  of  the  addition  formulas 
for  the  cosine  and  sine  functions  (similar  to  that  in  Sec.  13.2)  we  see  that 

eZleZz  = eXl  + X2[cos  (y!  + y2)  + i sin  (yx  + y2)]  = eZl+Zz 

as  asserted.  An  interesting  special  case  of  (3)  is  = x,  z2  = ;y;  then 

(4)  ez  = exeiy. 

Furthermore,  for  z = ty  we  have  from  (1)  the  so-called  Euler  formula 


(5) 


ely  = cos  y + i sin  y. 


Hence  the  polar  form  of  a complex  number,  z = r(cos  6 + i sin  6),  may  now  be  written 

(6) 


Z = re18. 


From  (5)  we  obtain 


(7) 


e27Ti  = 1 


(8)  e17*/2  = i. 


as  well  as  the  important  formulas  (verify!) 

e™  = -1, 
Another  consequence  of  (5)  is 
(9) 


e-7ri/2  = -i,  e~™  = -\. 


ely\  = | cosy  + i sin y | = \/cos2 y + sin2 y = 1. 


That  is,  for  pure  imaginary  exponents,  the  exponential  function  has  absolute  value  1,  a 
result  you  should  remember.  From  (9)  and  (1), 


(10)  = ex. 


Hence 


arg  ez  = y ± 2mr  (n  = 0,  1,  2,  • ■ ■ ), 


since  \ez\  = ex  shows  that  (1)  is  actually  ez  in  polar  form. 
From  \ez\  = ex  A 0 in  (10)  we  see  that 


(ID 


e A 0 


for  all  z. 


So  here  we  have  an  entire  function  that  never  vanishes,  in  contrast  to  (nonconstant) 
polynomials,  which  are  also  entire  (Example  5 in  Sec.  13.3)  but  always  have  a zero,  as 
is  proved  in  algebra. 
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Periodicity  of  ex  with  period  2m, 

(12)  ez+Zrri  = ez  for  all  z 

is  a basic  property  that  follows  from  (1)  and  the  periodicity  of  cos  y and  sin  y.  Hence  all 
the  values  that  w = ez  can  assume  are  already  assumed  in  the  horizontal  strip  of  width  277 


(13) 


— 77  < y = 77 


(Fig.  336). 


This  infinite  strip  is  called  a fundamental  region  of  ez. 


EXAMPLE  1 Function  Values.  Solution  of  Equations 

Computation  of  values  from  (1)  provides  no  problem.  For  instance, 

gi.4-o.6i  = gi-4(cos  0 6 _ sin  o 6)  = 4.055(0.8253  - 0.5646;)  = 3.347  - 2.289 i 

\e1A~16i\  = e1A  = 4.055,  Arg  ® = -0.6. 

To  illustrate  (3),  take  the  product  of 

e2+i  = e1 2 *(cos  1 + i sin  1)  and  e^~l  = £4(cos  1 — i sin  1) 

and  verify  that  it  equals  A4(cos2  1 + sin2  1)  = e6  = e(2+*)+(4-*) 

To  solve  the  equation  ez  = 3 + 4 i,  note  first  that  \ez\  = ex  = 5,  x = In  5 = 1.609  is  the  real  part  of  all 
solutions.  Now,  since  ex  — 5, 

£xcosy  = 3,  ex  sin  y — 4,  cosy  = 0.6,  siny  = 0.8,  y = 0.927. 


Ans.  z — 1.609  + 0.927 i ± 2mri  ( n = 0,  1,  2,  • • • ).  These  are  infinitely  many  solutions  (due  to  the  periodicity 
of  ez).  They  lie  on  the  vertical  line  x — 1.609  at  a distance  2tt  from  their  neighbors. 


To  summarize:  many  properties  of  ez  = exp  z parallel  those  of  ex;  an  exception  is  the 
periodicity  of  ez  with  277 i,  which  suggested  the  concept  of  a fundamental  region.  Keep 
in  mind  that  ez  is  an  entire  function.  (Do  you  still  remember  what  that  means?) 


y 


K 


X 


Fig.  336.  Fundamental  region  of  the 
exponential  function  ez  in  the  z-plane 


FR-QB-L-£M-S€-T— 11^ 


1.  ez  is  entire.  Prove  this. 


2-7  Function  Values.  Find  ez  in  the  form  u + iv 
and  \ez\  if  z equals 

2.  3 + 4 i 3.  2-771(1  + 0 

4.  0.6  — 1.8;  5.  2 + 377; 

6.  ll77i/2  7.  \fl  + g77; 


8-13  Polar  Form.  Write  in  exponential  form  (6): 

8.  9.  4 + 3; 

10.  Vi,  W 11.  -6.3 

12.  1/(1  - z)  13.  1 + i 

Real  and  Imaginary  Parts.  Find  Re  and  Im  of 


14-17 


14.  e~ 


15.  exp  (zz) 
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16.  e1/z  17.  exp  (z3) 

18.  TEAM  PROJECT.  Further  Properties  of  the  Ex- 
ponential Function,  (a)  Analyticity.  Show  that  ez  is 

entire.  What  about  e1^?  ez?  e^cos  ky  + i sin  ky)7  (Use 
the  Cauchy-Riemann  equations.) 

(b)  Special  values.  Find  all  z such  that  (i)  ez  is  real, 
(ii)  \e~z\  < 1,  (iii)  ez  = ?. 

(c)  Harmonic  function.  Show  that  u = exy  cos 

(xz/2  — y2/2)  is  harmonic  and  find  a conjugate. 


(d)  Uniqueness.  It  is  interesting  that  f(z ) = ez  is 
uniquely  determined  by  the  two  properties  fix  + iO)  = 
ex  and  f'(z')  = /(z),  where  / is  assumed  to  be  entire. 
Prove  this  using  the  Cauchy-Riemann  equations. 


19-22 


Equations.  Find  all  solutions  and  graph  some 


of  them  in  the  complex  plane. 


19.  ez  = 1 20.  ez  = 4 + 3 i 

21.  ez  = 0 22.  ez  = -2 


13.6  Trigonometric  and  Hyperbolic  Functions. 
Eulers  Formula 

Just  as  we  extended  the  real  ex  to  the  complex  ez  in  Sec.  13.5,  we  now  want  to  extend 
the  familiar  real  trigonometric  functions  to  complex  trigonometric  functions.  We  can  do 
this  by  the  use  of  the  Euler  formulas  (Sec.  13.5) 

e%x  = cos  x + i sin  x,  e~lx  = cos  x — i sin  x. 

By  addition  and  subtraction  we  obtain  for  the  real  cosine  and  sine 

cos  x = l(eix  + e~ix),  sin  * = -(eix  - <T“). 

2 i 

This  suggests  the  following  definitions  for  complex  values  z = x + iy: 

(1)  cosz  = i (efe  + e~iz),  sin z = (eiz  - e~iz). 

2 1 


It  is  quite  remarkable  that  here  in  complex,  functions  come  together  that  are  unrelated  in 
real.  This  is  not  an  isolated  incident  but  is  typical  of  the  general  situation  and  shows  the 
advantage  of  working  in  complex. 

Furthermore,  as  in  calculus  we  define 


(2) 


tan  z = 


sin  z 
cos  z ’ 


cot  z = 


cos  z 
sin  z 


and 

(3) 


sec  z = 


1 

COS  z ’ 


CSC  z = 


1 

sinz' 


Since  ez  is  entire,  cos  z and  sin  z are  entire  functions,  tan  z and  sec  z are  not  entire;  they 
are  analytic  except  at  the  points  where  cos  z is  zero;  and  cot  z and  esc  z are  analytic  except 
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EXAMPLE  1 


EXAMPLE  2 


where  sin  z is  zero.  Formulas  for  the  derivatives  follow  readily  from  ( ez )'  = ez  and  (1)— (3); 
as  in  calculus, 

(4)  (cosz/  = — sinz,  (sinz/  = cos  z,  (tanz/  = sec2  z, 

etc.  Equation  (1)  also  shows  that  Euler’s  formula  is  valid  in  complex'. 

(5)  eiz  = cos  z + i sin  z for  all  z. 


The  real  and  imaginary  parts  of  cos  z and  sin  z are  needed  in  computing  values,  and  they 
also  help  in  displaying  properties  of  our  functions.  We  illustrate  this  with  a typical  example. 

Real  and  Imaginary  Parts.  Absolute  Value.  Periodicity 

Show  that 


(6) 

and 

(7) 


(a)  cos  z = cos  * cosh  y — i sin  x sinh  y 

(b)  sin  z — sin  x cosh  y + i cos  x sinh  y 


(a)  |cosz|2  = cos2  x + sinh2y 

(b)  | sin  z|2  — sin2  x + sinh2y 


and  give  some  applications  of  these  formulas. 
Solution.  From  (1), 


cos  Z = ^(e*x+iy)  + e-*”+iy>) 

= \e~y{cosx  + /sin  x)  + \ey(cosx  — i sin*) 

= h(eV  + e~V ) cos*  — \i{ey  — e~y ) sin*. 

This  yields  (6a)  since,  as  is  known  from  calculus, 

(8)  coshy  = \{ey  + e~y),  sinhy  = \{ey  — e~y)\ 

(6b)  is  obtained  similarly.  From  (6a)  and  cosh2  y = 1 + sinh2  y we  obtain 

|cosz|2  = (cos2*)(l  + sinh2y)  + sin2*  sinh2 y. 

Since  sin2*  + cos2*  = 1,  this  gives  (7a),  and  (7b)  is  obtained  similarly. 

For  instance,  cos  (2  + 3i)  = cos  2 cosh  3 — i sin  2 sinh  3 = —4.190  — 9.109i. 

From  (6)  we  see  that  sin  z and  cos  z are  periodic  with  period  2tt,  just  as  in  real.  Periodicity  of  tan  z and  cot  z 
with  period  77  now  follows. 

Formula  (7)  points  to  an  essential  difference  between  the  real  and  the  complex  cosine  and  sine;  whereas 
| cos  *|^1  and  | sin  * | = 1,  the  complex  cosine  and  sine  functions  are  no  longer  bounded  but  approach  infinity 
in  absolute  value  as  y — > oo,  since  then  sinh  y — > in  (7). 

Solutions  of  Equations.  Zeros  of  cos  z and  sin  z 

Solve  (a)  cos  z — 5 (which  has  no  real  solution!),  (b)  cos  z — 0,  (c)  sin  z = 0. 

Solution,  (a)  e2iz  — 10eiz  +1=0  from  (1)  by  multiplication  by  eiz.  This  is  a quadratic  equation  in  eiz, 
with  solutions  (rounded  off  to  3 decimals) 

= e~y+ix  = 5 ± V25  - 1 = 9.899  and  0.101. 

Thus  e~y  = 9.899  or  0.101,  eix  = 1,  y = ±2.292,  * = 2/277.  Ans.  z — ±2/777  ± 2.292 i (n  = 0,  1,  2,  • • • ). 

Can  you  obtain  this  from  (6a)? 


SEC.  13.6  Trigonometric  and  Hyperbolic  Functions.  Euler’s  Formula 


635 


(b)  cos  x = 0,  sinh  y = 0 by  (7a),  y = 0.  Ans.  z = ±|(2«  + l)ir  (n  = 0,  1.  2,  ■ ■ • ). 

(c)  sin  x = 0,  sinh  y = 0 by  (7b),  Ans.  z = ±mr  ( n = 0,  1,  2,  • • • )■ 

Hence  the  only  zeros  of  cos  z and  sin  z are  those  of  the  real  cosine  and  sine  functions. 

General  formulas  for  the  real  trigonometric  functions  continue  to  hold  for  complex 
values.  This  follows  immediately  from  the  definitions.  We  mention  in  particular  the 
addition  rules 

cos  (z.\  ± z2)  = cos  z r cos  Z2  + sin  7!  sin  z2 
sin  {z i ± z2)  = sinz1cosz2  ± sinz2coszi 

and  the  formula 

(10)  cos2z  + sin2z  = 1. 

Some  further  useful  formulas  are  included  in  the  problem  set. 

Hyperbolic  Functions 

The  complex  hyperbolic  cosine  and  sine  are  defined  by  the  formulas 


(11)  cosh  z = \{ez  + e z),  sinh  z = \{ez  — e z). 


This  is  suggested  by  the  familiar  definitions  for  a real  variable  [see  (8)].  These  functions 
are  entire,  with  derivatives 

(12)  (coshz),  = sinh  z,  (sinhz/  = coshz, 

as  in  calculus.  The  other  hyperbolic  functions  are  defined  by 


(13) 


tanh  z = 

sinh  z 

coth  z = 

cosh  z 

cosh  z ’ 

sinh  z 

sech  z = 

1 

csch  z = 

1 

cosh  z ’ 

sinh  z 

Complex  Trigonometric  and  Hyperbolic  Functions  Are  Related.  If  in  (11),  we  replace  z 
by  iz  and  then  use  (1),  we  obtain 

(14)  cosh  iz  = cos  z,  sinh  iz  = i sin  z. 

Similarly,  if  in  (1)  we  replace  z by  iz  and  then  use  (11),  we  obtain  conversely 

(15)  cos  iz  = cosh  z,  sin  iz  = i sinh  z. 

Here  we  have  another  case  of  unrelated  real  functions  that  have  related  complex  analogs, 
pointing  again  to  the  advantage  of  working  in  complex  in  order  to  get  both  a more  unified 
formalism  and  a deeper  understanding  of  special  functions.  This  is  one  of  the  main  reasons 
for  the  importance  of  complex  analysis  to  the  engineer  and  physicist. 
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PR  O B L EMS  ET  13. 6 


1-4 


FORMULAS  FOR  HYPERBOLIC  FUNCTIONS 


Show  that 


1.  cosh  z = cosh  x cos  y + i sinh  x sin  y 
sinh  z — sinh  x cos  y + i cosh  x sin  y. 


2.  cosh  (zi  + z2)  = cosh  Zi  cosh  Z2  + sinh  zi  sinh  z2 
sinh  (zi  + z2)  = sinh  Zi  cosh  Z2  + cosh  z\  sinh  z2. 


3.  cosh2z  — sinh2  z = 1,  cosh2  z + sinh2  z = cosh  2 z 

4.  Entire  Functions.  Prove  that  cos  z,  sin  z,  cosh  z,  and 
sinh  z are  entire. 

5.  Harmonic  Functions.  Verify  by  differentiation  that 
Im  cos  z and  Re  sin  z are  harmonic. 


6-12 


Function  Values.  Find,  in  the  form  u + iv. 


6.  sin  2777  7.  cos  i,  sin  i 

8.  COS  777,  cosh  777 

9.  cosh  (—  1 + 2 i),  cos  ( — 2 — i) 

10.  sinh  (3  + 4 i),  cosh  (3  + 4 i) 


11.  sin  777,  COS  (§77  — 777) 

12.  COS  §77  7,  COS  [§77(1  + i)  | 


13-15 


Equations  and  Inequalities.  Using  the  defini- 


tions, prove: 


13.  cos  z is  even,  cos  (— z)  = cos  z,  and  sin  z is  odd, 
sin  (— z)  = —sin  z. 

14.  | sinh  v I S |cosz|  S coshy,  | sinh y | S |sinz|  S coshy. 
Conclude  that  the  complex  cosine  and  sine  are  not 
bounded  in  the  whole  complex  plane. 

15.  sinz1cosz2  = §[sin(zi  + z2)  + sin(z!  - z2)] 


16-19 


Equations.  Find  all  solutions. 


16.  sin  z =100  17.  coshz  = 0 


18.  cosh  z = — 1 19.  sinh  z = 0 

20.  Re  tan  z and  Im  tan  z.  Show  that 


Re  tan  z 
Im  tan  z 


sin  x cos  x 
cos2.*  + sinh2y 
sinh  y cosh  y 
cos2*  + sinh2y 


13. / Logarithm.  General  Power.  Principal  Value 

We  finally  introduce  the  complex  logarithm,  which  is  more  complicated  than  the  real 
logarithm  (which  it  includes  as  a special  case)  and  historically  puzzled  mathematicians 
for  some  time  (so  if  you  first  get  puzzled — which  need  not  happen! — be  patient  and  work 
through  this  section  with  extra  care). 

The  natural  logarithm  of  z = x + iy  is  denoted  by  In  z (sometimes  also  by  log  z)  and 
is  defined  as  the  inverse  of  the  exponential  function;  that  is,  w = In  z is  defined  for  z V 0 
by  the  relation 


(Note  that  z = 0 is  impossible,  since  ew  ¥=  0 for  all  vv;  see  Sec.  13.5.)  If  we  set  w = it  + iv 
and  z = re™,  this  becomes 


ew  = eu+iv  = r£ie 

Now,  from  Sec.  13.5,  we  know  that  eu+w  has  the  absolute  value  ew  and  the  argument  v. 
These  must  be  equal  to  the  absolute  value  and  argument  on  the  right: 


v = 0. 
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EXAMPLE  1 


eu  = r gives  u = In  r.  where  In  r is  the  familiar  real  natural  logarithm  of  the  positive 
number  r = |z| . Hence  w = u + iv  = In  z is  given  by 

(1)  In  z = In  r + iQ  (r  = |z|  > 0,  9 = arg  z). 

Now  comes  an  important  point  (without  analog  in  real  calculus).  Since  the  argument  of 
Z is  determined  only  up  to  integer  multiples  of  27 T,  the  complex  natural  logarithm 
In  z (z  A 0)  is  infinitely  many-valued. 

The  value  of  In  z corresponding  to  the  principal  value  Arg  z (see  Sec.  13.2)  is  denoted 
by  Ln  z (Ln  with  capital  L)  and  is  called  the  principal  value  of  In  z.  Thus 

(2)  Ln  z = ln  |z|  + i Arg  z (z  A 0). 

The  uniqueness  of  Arg  z for  given  z (A  0)  implies  that  Ln  z is  single- valued,  that  is,  a 
function  in  the  usual  sense.  Since  the  other  values  of  arg  z differ  by  integer  multiples  of  277, 
the  other  values  of  ln  z are  given  by 

(3)  In z = Lnz  ± 2mri  (n  = 1,2, •••). 

They  all  have  the  same  real  part,  and  their  imaginary  parts  differ  by  integer  multiples 
of  277. 

If  z is  positive  real,  then  Arg  z = 0,  and  Ln  z becomes  identical  with  the  real  natural 
logarithm  known  from  calculus.  If  z is  negative  real  (so  that  the  natural  logarithm  of 
calculus  is  not  defined!),  then  Arg  z = 77  and 

Ln  z = ln  |z|  + 77/  (z  negative  real). 

From  (1)  and  <?lnr  = r for  positive  real  r we  obtain 

(4a)  elnz  = z 

as  expected,  but  since  arg  ( ez ) = y ± 2/; 77  is  multivalued,  so  is 

(4b)  ln  ( ez ) = z ± 2mri,  n = 0, 


Natural  Logarithm.  Principal  Value 

In  1 = 0,  ±2771,  ±4t7 /,  • • • 
ln4  = 1.386294  ± 2mri 
ln  (—1)  = ±771,  ±3771,  ±5771,  • • ■ 
ln  (-4)  = 1.386294  ± (2 n + 1)77 1 
ln  i = 771/2,  —377/2,  577i/2,  • • • 
ln  4i  = 1.386294  + 77i/2  ± 2mri 
ln(— 4i)  = 1.386294  — 771/2  ± 2mri 
ln  (3  — 4i)  = ln  5 + / arg  (3  — 4 1) 

= 1.609438  - 0.927295 1 ± 2mri 


Ln  1 = 0 
Ln  4 = 1.386294 
Ln  (— 1)  = 77/ 

Ln  (—4)  = 1.386294  + 77/ 

Ln  i = 77i'/2 

Ln  4i  = 1.386294  + 771/2 
Ln  (-40  = 1.386294  - 771/2 
Ln  (3  - 40  = 1.609438  - 0.927295i 


(Fig.  337) 
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EXAMPLE  2 


THEOREM  1 


PROOF 


-0.9  + 6 k - # 

l 

-0.9  + 4k  - f 

I 

-0.9  + 2k  - I 

0 1 1 — 1 

-0.9  - 1 \ 2 “ 

I 

-0.9-2 jz  - 

Fig.  337.  Some  values  of  In  (3  — 4/)  in  Example  1 

The  familiar  relations  for  the  natural  logarithm  continue  to  hold  for  complex  values,  that  is, 

(5)  (a)  ln(ziz2)  = Inz!  + ln+2,  (b)  In  (zi/z2)  = hi  zx  - In  z2 

but  these  relations  are  to  be  understood  in  the  sense  that  each  value  of  one  side  is  also 
contained  among  the  values  of  the  other  side;  see  the  next  example. 

Illustration  of  the  Functional  Relation  (5)  in  Complex 

Let 


If  we  take  the  principal  values 

Ln  zi  = Ln  22  = ni, 

then  (5a)  holds  provided  we  write  ln(z!z2)  = ln  1 = 2tt/;  however,  it  is  not  true  for  the  principal  value, 
Ln  (ziz2)  = Ln  1 = 0. 


Analyticity  of  the  Logarithm 

For  every  n = 0,  ±1,  ±2,  •••  formula  (3)  defines  a function,  which  is  analytic, 
except  at  0 and  on  the  negative  real  axis,  and  has  the  derivative 

(6)  (ln  z)'  = — (z  not  0 or  negative  real). 


We  show  that  the  Cauchy-Riemann  equations  are  satisfied.  From  (l)-(3)  we  have 


1 


ln  z = In  r + i(8  + c)  = -f  ln  (x2  + y2)  + i I arctan  — + c 


y 


where  the  constant  c is  a multiple  of  2tt.  By  differentiation, 


‘i  2,2  vy  i , , , ,2 
x + y 1 + ( y/x ) 


1 


y 2,2 


= -Vx  = 


1 + (y/xr  V x 


y 


X 
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EXAMPLE  3 


Hence  the  Cauchy-Riemann  equations  hold.  [Confirm  this  by  using  these  equations  in  polar 
form,  which  we  did  not  use  since  we  proved  them  only  in  the  problems  (to  Sec.  13.4).] 
Formula  (4)  in  Sec.  13.4  now  gives  (6), 


(In  z)' 


I IV JQ 


x . 1 

x2  + y2  1 1 + (y/xf 


x ~ iy  = 1 

x2  + y2  z 


Each  of  the  infinitely  many  functions  in  (3)  is  called  a branch  of  the  logarithm.  The 
negative  real  axis  is  known  as  a branch  cut  and  is  usually  graphed  as  shown  in  Fig.  338. 
The  branch  for  n = 0 is  called  the  principal  branch  of  In  z. 

y 


Fig.  338.  Branch  cut  for  In  z 


General  Powers 

General  powers  of  a complex  number  z = x + iy  are  defined  by  the  formula 

(7)  zc  = eclnz  ( c complex,  z ¥=  0). 

Since  In  z is  infinitely  many-valued,  zc  will,  in  general,  be  multivalued.  The  particular  value 

zc  = ecLn  z 


is  called  the  principal  value  of  zc. 

If  c = n = 1,  2,  ■ ■ • , then  zn  is  single-valued  and  identical  with  the  usual  nth  power  of  z. 
If  c = — 1,  — 2,  ■ ■ • , the  situation  is  similar. 

If  c = 1/n,  where  n = 2,  3,  • • • , then 

-c=^  = ea/n)lnz  (z#0), 

the  exponent  is  determined  up  to  multiples  of  iTri/n  and  we  obtain  the  n distinct  values 
of  the  nth  root,  in  agreement  with  the  result  in  Sec.  13.2.  If  c = p/q,  the  quotient  of  two 
positive  integers,  the  situation  is  similar,  and  zc  has  only  finitely  many  distinct  values. 
However,  if  c is  real  irrational  or  genuinely  complex,  then  zc  is  infinitely  many-valued. 

General  Power 


i‘  = ellni  = exp  (r  In  i)  = exp 


i ± lliTTi 


_ — (7T/2)- -t-  27177 


All  these  values  are  real,  and  the  principal  value  (n  = 0)  is  e 7r'2. 
Similarly,  by  direct  calculation  and  multiplying  out  in  the  exponent. 


(1  + i)2  1 = exp  [(2  — i)  In  (1  + /)]  = exp  [(2  — i)  {In  V2  + \iri  ± Imri}} 
= 2e7r/4±2mr[sin  (|  In  2)  + i cos  (|  In  2)]. 
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It  is  a convention  that  for  real  positive  z = x the  expression  zc  means  ec  ln  x where  In  x 
is  the  elementary  real  natural  logarithm  (that  is,  the  principal  value  Ln  z (z  = x > 0)  in 
the  sense  of  our  definition).  Also,  if  z = e,  the  base  of  the  natural  logarithm,  ?c  = ec  is 
conventionally  regarded  as  the  unique  value  obtained  from  (1)  in  Sec.  13.5. 

From  (7)  we  see  that  for  any  complex  number  a, 

(8)  az  = ez  ln  “. 


We  have  now  introduced  the  complex  functions  needed  in  practical  work,  some  of  them 
(■ ez , cos  z,  sin  z,  cosh  z,  sinh  z)  entire  (Sec.  13.5),  some  of  them  (tan  z,  cot  z,  tanh  z,  coth  z) 
analytic  except  at  certain  points,  and  one  of  them  (In  z)  splitting  up  into  infinitely  many 
functions,  each  analytic  except  at  0 and  on  the  negative  real  axis. 

For  the  inverse  trigonometric  and  hyperbolic  functions  see  the  problem  set. 


PROBLEM  SET  13.7 


1-4 


VERIFICATIONS  IN  THE  TEXT 


1.  Verify  the  computations  in  Example  1. 


2.  Verify  (5)  for  zi  = — i and  z2  = — L 


3.  Prove  analyticity  of  Ln  z by  means  of  the  Cauchy- 
Riemann  equations  in  polar  form  (Sec.  13.4). 

4.  Prove  (4a)  and  (4b). 

COMPLEX  NATURAL  LOGARITHM  In  z 


5-11 


Principal  Value  Ln  z.  Find  Ln  z when  z equals 


5.-11 


6.  4 + 4/ 


7.  4 - 4i  8.  1 ± i 

9.  0.6  + 0.8/  10.  -15  ± 0.1/ 

11.  ei 


All  Values  of  ln  z.  Find  all  values  and  graph 
them  in  the  complex  plane. 

12.  ln  e 13.  ln  1 

14.  ln  (—7)  15.  ln  (e*) 

16.  ln  (4  + 3/) 

17.  Show  that  the  set  of  values  of  ln  (iz)  differs  from  the 
set  of  values  of  2 ln  /. 


12-16 

some  of 


18-21  Equations.  Solve  for  z. 

18.  ln  z = — 7T//2  19.  ln  z = 4 — 3/ 

20.  ln  z = e — zri  21.  ln  z = 0.6  + 0.4/ 


22-28 


General  Powers.  Find  the  principal  value. 


Show  details. 


22.  (2 /)2i  23.  (1  + i)1-i 

24.  (1  - i)1+i  25.  (— 3)3-* 


26.  (if2  27.  (-l)2-i 

28.  (3  + 4/)1/3 

29.  How  can  you  find  the  answer  to  Prob.  24  from  the 
answer  to  Prob.  23? 

30.  TEAM  PROJECT.  Inverse  Trigonometric  and 
Hyperbolic  Functions.  By  definition,  the  inverse  sine 
w = arcsin  z is  the  relation  such  that  sin  w = z.  The 
inverse  cosine  vv  = arccos  z is  the  relation  such  that 
cos  w = z.  The  inverse  tangent,  inverse  cotangent, 
inverse  hyperbolic  sine,  etc.,  are  defined  and  denoted 
in  a similar  fashion.  (Note  that  all  these  relations  are 
multivalued.)  Using  sin  w = (eiw  — e~lw)/(2 i)  and 
similar  representations  of  cos  w,  etc.,  show  that 

(a)  arccos  z = — / ln  (z  + Vz2  — 1) 

(b)  arcsin z = — /In  (iz  + V" 1 — z2) 

(c)  arccosh  z = In  (z  + Vz2  — 1) 

(d)  arcsinh  z = ln  (z  + \/z2  + 1) 

1 i + z 

(e)  arctan  z = — In 

2 i - z 

1 1 + z 

(f ) arctanh  z = — In 

2 1 - z 

(g)  Show  that  w = arcsin  z is  infinitely  many-valued, 
and  if  w i is  one  of  these  values,  the  others  are  of  the 
form  w i ± 2mr  and  tt  — wq  ± 2nir,  n — 0,  1,  • ■ • . 
(The principal  value  of  w = u + iv  = arcsin  z is  defined 
to  be  the  value  for  which  — 7t/2  SkS  7t/2  if  v = 0 
and  — 7t/2  < u < tt/2  if  v < 0.) 


Summary  of  Chapter  13 
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T I O N S AND  PROBLEMS 


1.  Divide  15  + 23 1 by  —3  + 71.  Check  the  result  by 
multiplication. 

2.  What  happens  to  a quotient  if  you  take  the  complex 
conjugates  of  the  two  numbers?  If  you  take  the  absolute 
values  of  the  numbers? 


3.  Write  the  two  numbers  in  Prob.  1 in  polar  form.  Find 
the  principal  values  of  their  arguments. 

4.  State  the  definition  of  the  derivative  from  memory. 
Explain  the  big  difference  from  that  in  calculus. 

5.  What  is  an  analytic  function  of  a complex  variable? 

6.  Can  a function  be  differentiable  at  a point  without  being 
analytic  there?  If  yes,  give  an  example. 

7.  State  the  Cauchy-Riemann  equations.  Why  are  they  of 
basic  importance? 

8.  Discuss  how  ez , cos  z,  sin  z,  cosh  z,  sinh  z are  related. 


9.  In  z is  more  complicated  than  In  x.  Explain.  Give 
examples. 

10.  How  are  general  powers  defined?  Give  an  example. 
Convert  it  to  the  form  x + iy. 


11-16 


Complex  Numbers.  Find,  in  the  form  x + iy , 


showing  details, 

11.  (2  + 3 if 
13.  1/(4  + 3i) 


12.  (1  - i)1 

14.  Vi 


15.  (1  + 0/(1  - 0 16-  e7™/2,  e_7ri/2 


17-20 


Polar  Form.  Represent  in  polar  form,  with  the 


principal  argument. 


17.  -4  - 
19.  —151 
21-24 

21.  V8T 
23.  W 


4 i 18.  12  + 1,  12  - i 

20.  0.6  + 0.81 

Roots.  Find  and  graph  all  values  of: 
22.  A/— 321 

24.  V 1 


25-30  Analytic  Functions.  Find/(z)  = u(x,  y)  + iv(x,  y) 
with  u or  v as  given.  Check  by  the  Cauchy-Riemann  equations 
for  analyticity. 


25  u — xy  26.  v = y/(x2  + y2) 

21.  v = — e_2xsin2y  28.  u = cos  3x  cosh  3y 

29.  u = exp(— (x2  — y2)/2)  cosxy 

30.  v = cos  2x  sinh  2y 


31-35 


Special  Function  Values.  Find  the  value  of: 


31.  cos  (3  - 1)  32.  Ln  (0.6  + 0.81) 

33.  tan  1 

34.  sinh  (1  + 7 rl),  sin  (1  + 7 rl) 

35.  cosh  (7 r + 7 rl) 


SUMMARY  ~QFrCH  ft  PTER-  TV  ~~ 

Complex  Numbers  and  Functions.  Complex  Differentiation 


For  arithmetic  operations  with  complex  numbers 

(1)  z = x + iy  = reie  = r(cos  6 + i sin  6), 

r = |z|  = Vx2  + y2,  0 = arctan  (y/x),  and  for  their  representation  in  the  complex 
plane,  see  Secs.  13.1  and  13.2. 

A complex  function /(z)  = u(x,  y)  + ivix,  v)  is  analytic  in  a domain  D if  it  has 
a derivative  (Sec.  13.3) 


(2) 


= Bo 


f(z  + Az)  - /(z) 

Az 


everywhere  in  D.  Also,  /(z)  is  analytic  at  a point  z = Zo  if  it  has  a derivative  in  a 
neighborhood  of  zo  (not  merely  at  z0  itself). 
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If  /(z)  is  analytic  in  D,  then  u(x,y ) and  v(x,  y ) satisfy  the  (very  important!) 

Cauchy-Riemann  equations  (Sec.  13.4) 

dll  dv  dll  dv 

dx  dy  ’ dy  dx 

everywhere  in  D.  Then  u and  v also  satisfy  Laplace’s  equation 

(4)  MXX  VXX  + Vyy  0 

everywhere  in  D.  If  n(x,  y)  and  v (x,  y)  are  continuous  and  have  continuous  partial 
derivatives  in  D that  satisfy  (3)  in  D,  then  f(z)  = u(x,  y)  + iv(x,  y)  is  analytic  in 
D.  See  Sec.  13.4.  (More  on  Laplace’s  equation  and  complex  analysis  follows  in 
Chap.  18.) 

The  complex  exponential  function  (Sec.  13.5) 

(5)  ez  = exp  z = ex  (cos  y + i sin  y) 

reduces  to  ex  if  z = x (y  = 0).  It  is  periodic  with  277/  and  has  the  derivative  ez. 

The  trigonometric  functions  are  (Sec.  13.6) 

cos  z = \{elz  + e~lz)  = cos  x cosh  y — i sin  x sinh  y 

(6)  i . _. 

sin  z = — ( elz  — e lz)  = sin  x cosh  v + i cos  x sinh  v 
2 i 


and,  furthermore, 


tan  z = (sinz)/cosz,  cotz  = 1/tan  z,  etc. 

The  hyperbolic  functions  are  (Sec.  13.6) 

(7)  cosh  z = \(ez  + e~z)  = cos  iz,  sinh  z = |(ez  — e~z)  = —i  sin  iz 

etc.  The  functions  (5) — (7)  are  entire,  that  is,  analytic  everywhere  in  the  complex 
plane. 

The  natural  logarithm  is  (Sec.  13.7) 

(8)  lnz  = ln|z|  + /argz  = ln|z|  + / Argz  ± 2mri 

where  z ¥=  0 and  n = 0,  1,  • • • . Arg  z is  the  principal  value  of  arg  z,  that  is, 
— 77  < Arg  zS  77.  We  see  that  In  z is  infinitely  many-valued.  Taking  n = 0 gives 
the  principal  value  Ln  z of  In  z;  thus  Ln  z = ln|z|  + / Arg  7. 

General  powers  are  defined  by  (Sec.  13.7) 


(9) 


(c  complex,  z A 0). 


■ chapter!  4 


Chapter  1 3 laid  the  groundwork  for  the  study  of  complex  analysis,  covered  complex  num- 
bers in  the  complex  plane,  limits,  and  differentiation,  and  introduced  the  most  important 
concept  of  analyticity.  A complex  function  is  analytic  in  some  domain  if  it  is  differentiable 
in  that  domain.  Complex  analysis  deals  with  such  functions  and  their  applications.  The 
Cauchy-Riemann  equations,  in  Sec.  13.4,  were  the  heart  of  Chapter  13  and  allowed  a means 
of  checking  whether  a function  is  indeed  analytic.  In  that  section,  we  also  saw  that  analytic 
functions  satisfy  Laplace’s  equation,  the  most  important  PDE  in  physics. 

We  now  consider  the  next  part  of  complex  calculus,  that  is,  we  shall  discuss  the  first 
approach  to  complex  integration.  It  centers  around  the  very  important  Cauchy  integral 
theorem  (also  called  the  Cauchy-Goursat  theorem)  in  Sec.  14.2.  This  theorem  is  important 
because  it  allows,  through  its  implied  Cauchy  integral  formula  of  Sec.  14.3,  the  evaluation 
of  integrals  having  an  analytic  integrand.  Furthermore,  the  Cauchy  integral  formula  shows 
the  surprising  result  that  analytic  functions  have  derivatives  of  all  orders.  Hence,  in  this 
respect,  complex  analytic  functions  behave  much  more  simply  than  real-valued  functions 
of  real  variables,  which  may  have  derivatives  only  up  to  a certain  order. 

Complex  integration  is  attractive  for  several  reasons.  Some  basic  properties  of  analytic 
functions  are  difficult  to  prove  by  other  methods.  This  includes  the  existence  of  derivatives 
of  all  orders  just  discussed.  A main  practical  reason  for  the  importance  of  integration  in 
the  complex  plane  is  that  such  integration  can  evaluate  certain  real  integrals  that  appear 
in  applications  and  that  are  not  accessible  by  real  integral  calculus. 

Finally,  complex  integration  is  used  in  connection  with  special  functions,  such  as 
gamma  functions  (consult  [GenRefl]),  the  error  function,  and  various  polynomials  (see 
[GenReflO]).  These  functions  are  applied  to  problems  in  physics. 

The  second  approach  to  complex  integration  is  integration  by  residues,  which  we  shall 
cover  in  Chapter  16. 

Prerequisite:  Chap.  13. 

Section  that  may  be  omitted  in  a shorter  course:  14.1,  14.5. 

References  and  Answers  to  Problems:  App.  1 Part  D,  App.  2. 

14.1  Line  Integral  in  the  Complex  Plane 

As  in  calculus,  in  complex  analysis  we  distinguish  between  definite  integrals  and  indefinite 
integrals  or  antiderivatives.  Here  an  indefinite  integral  is  a function  whose  derivative 
equals  a given  analytic  function  in  a region.  By  inverting  known  differentiation  formulas 
we  may  find  many  types  of  indefinite  integrals. 

Complex  definite  integrals  are  called  (complex)  line  integrals.  They  are  written 

f(z)  dz. 
c 
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Here  the  integrand  f(z)  is  integrated  over  a given  curve  C or  a portion  of  it  (an  arc,  but 
we  shall  say  “curve”  in  either  case,  for  simplicity).  This  curve  C in  the  complex  plane  is 
called  the  path  of  integration.  We  may  represent  C by  a parametric  representation 

(1)  z(t)  = x(t)  = iy(t ) («  = f = b ). 

The  sense  of  increasing  t is  called  the  positive  sense  on  C,  and  we  say  that  C is  oriented 
by  (1). 

For  instance,  z(t)  = t + 3it  (0  ^ t g 2)  gives  a portion  (a  segment)  of  the  line  y = 3x. 
The  function  z(t ) = 4 cos  t + 4i  sin  t (—77  s=  t Si  77)  represents  the  circle  7 =4,  and  so 
on.  More  examples  follow  below. 

We  assume  C to  be  a smooth  curve,  that  is,  C has  a continuous  and  nonzero  derivative 

dz 

z(t)  = — = x(t)  + iy(t) 
dt 

at  each  point.  Geometrically  this  means  that  C has  everywhere  a continuously  turning 
tangent,  as  follows  directly  from  the  definition 

z(t  + At)  - z(t ) 

z(t)  = lim  (Fig.  339). 

At^O  At 

Here  we  use  a dot  since  a prime  ' denotes  the  derivative  with  respect  to  7. 


Definition  of  the  Complex  Line  Integral 

This  is  similar  to  the  method  in  calculus.  Let  C be  a smooth  curve  in  the  complex  plane 
given  by  (1),  and  let  f(z)  be  a continuous  function  given  (at  least)  at  each  point  of  C.  We 
now  subdivide  (we  “ partition ”)  the  interval  flgfg£>in(l)by  points 

to  (=  a),  H,  Li- 1,  f«(=  b) 

where  fo  < fi  < ■ ■ • < tn.  To  this  subdivision  there  corresponds  a subdivision  of  C by 
points 


ZOi  Zl,  Zn—lt  Zn(  Z) 


(Fig.  340), 


Fig.  339.  Tangent  vector  z(f)  of  a curve  C in  the 
complex  plane  given  by  z(f).  The  arrowhead  on  the 
curve  indicates  the  positive  sense  (sense  of  increasing  t) 
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where  Zj  = z(tj).  On  each  portion  of  subdivision  of  C we  choose  an  arbitrary  point,  say, 
a point  between  zo  and  (that  is,  t,\  — z(t ) where  t satisfies  f0  = r = G),  a point  ij2 
between  zi  and  7.2,  etc.  Then  we  form  the  sum 

n 

(2)  Sn  f(.Cm)  ^Zm  where  Zm  Zm—1- 

m=  1 

We  do  this  for  each  n = 2,  3,  • • • in  a completely  independent  manner,  but  so  that  the 
greatest  |Atm|  = | tm  — tm_il  approaches  zero  as  This  implies  that  the  greatest 

|A^m|  also  approaches  zero.  Indeed,  it  cannot  exceed  the  length  of  the  arc  of  C from 
Zm-i  to  zm  and  the  latter  goes  to  zero  since  the  arc  length  of  the  smooth  curve  C is  a 
continuous  function  of  t.  The  limit  of  the  sequence  of  complex  numbers  S2,  S3,  ■ ■ • thus 
obtained  is  called  the  line  integral  (or  simply  the  integral)  of  f(z.)  over  the  path  of 
integration  C with  the  orientation  given  by  (1).  This  line  integral  is  denoted  by 


(3) 


f(z)  dz,  or  by 
c 


°f(z)  dz 
c 


if  C is  a closed  path  (one  whose  terminal  point  Z coincides  with  its  initial  point  ’0,  as 
for  a circle  or  for  a curve  shaped  like  an  8). 

General  Assumption.  All  paths  of  integration  for  complex  line  integrals  are  assumed  to 
be  piecewise  smooth,  that  is,  they  consist  of  finitely  many  smooth  curves  joined  end  to  end. 


Basic  Properties  Directly  Implied  by  the  Definition 

1.  Linearity.  Integration  is  a linear  operation,  that  is,  we  can  integrate  sums  term  by 
term  and  can  take  out  constant  factors  from  under  the  integral  sign.  This  means  that 
if  the  integrals  off  and  /2  over  a path  C exist,  so  does  the  integral  of  k1f  + k2/2 
over  the  same  path  and 


(4) 


[kifiiz)  + k2f2(z)]  dz  = A-] 


Jc 


fi(z.)  dz  + k2 


Jc 


h (z)  dz. 


Jc 


2.  Sense  reversal  in  integrating  over  the  same  path,  from  z0  to  Z (left)  and  from  Z to 
z 0 (right),  introduces  a minus  sign  as  shown. 


(5) 


f(z)  dz  = 


f(z)  dz. 


Zo 


3.  Partitioning  of  path  (see  Fig.  341) 


(6) 


f(z)  dz  = 


f(z)  dz  + 


f(z)  dz. 


Jc1 


c 


1 


2 


0 


Fig.  341.  Partitioning  of  path  [formula  (6)] 
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Existence  of  the  Complex  Line  Integral 

Our  assumptions  that  f(z)  is  continuous  and  C is  piecewise  smooth  imply  the  existence 
of  the  line  integral  (3).  This  can  be  seen  as  follows. 

As  in  the  preceding  chapter  let  us  write  f(z)  = u(x,  y ) + iv(x,  y).  We  also  set 

Cm  Cm  ir/m  and  A Z rn  Axm  "F  iAym. 


Then  (2)  may  be  written 

(7)  Sn  "F  d])iAxm  T iAym) 

where  u = u(£m,  r]m),  v = v(Cm,  r)m ) and  we  sum  over  m from  1 to  n.  Performing  the 
multiplication,  we  may  now  split  up  Sn  into  four  sums: 

Sn  u Axm  — v Aym  + i [ u A ym  + v Axm]  . 

These  sums  are  real.  Since  / is  continuous,  u and  v are  continuous.  Hence,  if  we  let  n 
approach  infinity  in  the  aforementioned  way,  then  the  greatest  Axm  and  Aym  will  approach 
zero  and  each  sum  on  the  right  becomes  a real  line  integral: 


(8) 


lim  Sn 


f(z)  dz 
c 


u dx  — 

v dy  + i 

u dy  + 

v dx 

c J 

c 

. 

c J 

c 

This  shows  that  under  our  assumptions  on/and  C the  line  integral  (3)  exists  and  its  value 
is  independent  of  the  choice  of  subdivisions  and  intermediate  points  £m. 

First  Evaluation  Method: 

Indefinite  Integration  and  Substitution  of  Limits 

This  method  is  the  analog  of  the  evaluation  of  definite  integrals  in  calculus  by  the  well- 
known  formula 


b 

f(x)  dx  = F(b)  — F(a ) 

a 

where  [ F'(x ) = fix)]. 

It  is  simpler  than  the  next  method,  but  it  is  suitable  for  analytic  functions  only.  To 
formulate  it,  we  need  the  following  concept  of  general  interest. 

A domain  D is  called  simply  connected  if  every  simple  closed  curve  (closed  curve 
without  self-intersections)  encloses  only  points  of  D. 

For  instance,  a circular  disk  is  simply  connected,  whereas  an  annulus  (Sec.  13.3)  is  not 
simply  connected.  (Explain!) 


SEC.  14.1  Line  Integral  in  the  Complex  Plane 


647 


THEOREM  1 


EXAMPLE  1 
EXAMPLE  2 
EXAMPLE  3 

EXAMPLE  4 


THEOREM  2 


Indefinite  Integration  of  Analytic  Functions 

Let  f(z)  be  analytic  in  a simply  connected  domain  D.  Then  there  exists  an  indefinite 
integral  of  f{z)  in  the  domain  D,  that  is,  an  analytic  function  F(z)  such  that 
F ( z ) = f(z)  in  D,  and  for  all  paths  in  D joining  two  points  zo  and  Zi  in  D we  have 


(9) 


(-Z 1 

f(z)  dz  = F(zi)  ~ F(z0)  [F'(z)  = f{z)l 

Zo 


( Note  that  we  can  write  z o and  z \ instead  of  C,  since  we  get  the  same  value  for  all 
those  C from  zo  to  Z\.) 


This  theorem  will  be  proved  in  the  next  section. 

Simple  connectedness  is  quite  essential  in  Theorem  1,  as  we  shall  see  in  Example  5. 
Since  analytic  functions  are  our  main  concern,  and  since  differentiation  formulas  will  often 
help  in  finding  F(z)  for  a given /(z)  = F (z),  the  present  method  is  of  great  practical  interest. 

If/(z)  is  entire  (Sec.  13.5),  we  can  take  for  D the  complex  plane  (which  is  certainly 
simply  connected). 


l+i 

0 


1 

3 


(1  + if  = 


2 

3 


2 

— i 
3 


cos  zdz  = sin  z 


= 2 sin  77/  = 2 i sinh  77  = 23.097/ 


ez/2dz  = 2ez’2 


= 2(e4-3™/2  - et+'"i/2)  = 0 


since  ez  is  periodic  with  period  277/. 

' di 


= Ln  / — Ln  (— Z)  = I I = /77.  Here  D is  the  complex  plane  without  0 and  the  negative  real 


axis  (where  Ln  z is  not  analytic).  Obviously,  D is  a simply  connected  domain. 


Second  Evaluation  Method: 

Use  of  a Representation  of  a Path 

This  method  is  not  restricted  to  analytic  functions  but  applies  to  any  continuous  complex 
function. 


Integration  by  the  Use  of  the  Path 

Let  C be  a piecewise  smooth  path,  represented  by  z = z(t),  where  a zz  t = b.  Let 
f(z)  be  a continuous  function  on  C.  Then 


(10) 


f(z)  dz 
c 


rb 

mom  dt 
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PROOF 


EXAMPLE  5 


The  left  side  of  (10)  is  given  by  (8)  in  terms  of  real  line  integrals,  and  we  show  that 
the  right  side  of  (10)  also  equals  (8).  We  have  z = x + iy,  hence  z = x + iy.  We  simply 
write  « for  u\x(t),  y(t)\  and  v for  v\x(t),  v(r)] . We  also  have  dx  = xdt  and  dy  = y dt. 
Consequently,  in  (10) 


,b 

timid)  dt 


b 

(n  + iv)(x  + iy)  dt 


[u  dx  — v dy  + i (n  dy  + v dx)\ 


(u  dx  — v dy)  + i 


(u  dy  + v dx). 


COMMENT.  In  (7)  and  (8)  of  the  existence  proof  of  the  complex  line  integral  we  referred 
to  real  line  integrals.  If  one  wants  to  avoid  this,  one  can  take  (10)  as  a definition  of  the 
complex  line  integral. 


Steps  in  Applying  Theorem  2 

(A)  Represent  the  path  C in  the  form  z(t)  (a  t S=  b). 

(B)  Calculate  the  derivative  z(i)  = dz/dt. 

(C)  Substitute  z{t)  for  every  z in  f(z)  (hence  x(t)  for  x and  y(t)  for  y). 

(D)  Integrate  f[z(t)]z(t)  over  t from  a to  b. 


A Basic  Result:  Integral  of  1/z  Around  the  Unit  Circle 

We  show  that  by  integrating  1/z  counterclockwise  around  the  unit  circle  (the  circle  of  radius  1 and  center  0; 
see  Sec.  13.3)  we  obtain 


(ID 


= 27 Ti 


(C  the  unit  circle,  counterclockwise). 


This  is  a very  important  result  that  we  shall  need  quite  often. 

Solution.  (A)  We  may  represent  the  unit  circle  C in  Fig.  330  of  Sec.  13.3  by 

z(t)  = cos  t + i sin  t = elt 

so  that  counterclockwise  integration  corresponds  to  an  increase  of  t from  0 to  277. 

(B)  Differentiation  gives  z(t)  = ielt  (chain  rule!). 

(C)  By  substitution, /(z(0)  = l/z(f)  — e~xt. 

(D)  From  (10)  we  thus  obtain  the  result 


(0  ^ t ^ 277), 


dz 

z 


2tt  r 2tt 

e~ltielt  dt  = / dt  = 277/. 
o 


Check  this  result  by  using  z(t)  = cos  t + i sin  t. 

Simple  connectedness  is  essential  in  Theorem  1.  Equation  (9)  in  Theorem  1 gives  0 for  any  closed  path 
because  then  zi  = Zo>  so  that  F(zi)  — F(zo)  — 0.  Now  1/z  is  not  analytic  at  z = 0.  But  any  simply  connected 
domain  containing  the  unit  circle  must  contain  z — 0,  so  that  Theorem  1 does  not  apply — it  is  not  enough  that 
1/z  is  analytic  in  an  annulus,  say,  2 < \z\  <2^  because  an  annulus  is  not  simply  connected! 


SEC.  14.1  Line  Integral  in  the  Complex  Plane 


649 


EXAMPLE  6 


EXAMPLE 


Integral  of  1/zm  with  Integer  Power  m 

Let  f(z)  (z  ' Zo)m  where  m is  the  integer  and  Zo  a constant.  Integrate  counterclockwise  around  the  circle  C 

of  radius  p with  center  at  z0  (Fig-  342). 


Fig.  342.  Path  in  Example  6 


Solution.  We  may  represent  C in  the  form 


z(0  = Zo  + p(cos  t + i sin  t)  = Zo  + pea 

Then  we  have 

, mimt  j • it  ». 

(z  — Zo)  = p e , dz  = Ipe  at 

and  obtain 


(0  ^ t^  277). 


f(Z-Zo)”**  = 

f2w  r 

pmemt  ipe*  dt  = ipm+1\ 

'c  J 

o + 

^i(m+ l)f 


dt. 


By  the  Euler  formula  (5)  in  Sec.  13.6  the  right  side  equals 


■ r 277- 

r 2V 

ipm+1 

COS  (m  + 1 )t  dt  + 

sin  (m  + 1 )t  dt 

-'o 

•'o 

If  m — — 1,  we  have  pm+1  = 1,  cos  0 = 1,  sin  0 = 0.  We  thus  obtain  277/'.  For  integer  m =£  — 1 each  of  the  two 
integrals  is  zero  because  we  integrate  over  an  interval  of  length  277,  equal  to  a period  of  sine  and  cosine.  Hence 
the  result  is 


(12) 


P (z  ~ zo )mdz  = 
'c 


(m  = —1), 

(m  =£  — 1 and  integer). 


Dependence  on  path.  Now  comes  a very  important  fact.  If  we  integrate  a given  function 
f(z ) from  a point  zo  to  a point  z\  along  different  paths,  the  integrals  will  in  general  have 
different  values.  In  other  words,  a complex  line  integral  depends  not  only  on  the  endpoints 
of  the  path  but  in  general  also  on  the  path  itself.  The  next  example  gives  a first  impression 
of  this,  and  a systematic  discussion  follows  in  the  next  section. 


Integral  of  a Nonanalytic  Function.  Dependence  on  Path 

Integrate /(z)  = Re  z = x from  0 to  1 + 2/  (a)  along  C*  in  Fig.  343,  (b)  along  C consisting  of  Ci  and  C 2- 

Solution,  (a)  C*  can  be  represented  by  z(f)  = t + 2/7  (0  § t £ 1).  Hence  j(f)  =1+21  and  f[z(t) ] = 
x(t)  = t on  C*.  We  now  calculate 


Re  z dz 
'c* 


f(l  + 2 i)dt  = -(1  + 21)  = 
2 


1 

- + i. 
2 


o 
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2 - o z = 1 + 2i 


C V 

/ 

/ I 

fr  I 

1 * 

Fig.  343.  Paths  in  Example  7 


(b)  We  now  have 


Ci-  z(t ) = r, 

C2:  z(t)  = 1 + it. 


z(t)  = i,  mm=x(t)  = t (o^(si) 

z(0  = «,  /(zW)  = X(f)  = 1 (0  S t S 2). 


Using  (6)  we  calculate 


Re  z dz  = | Re  z dz  + Re  z dz  = t clt  + 1 ■ i dt  = — I-  2 i. 

C2  ^0  ■a)  ^ 


Note  that  this  result  differs  from  the  result  in  (a). 


Bounds  for  Integrals.  ML-lnequality 

There  will  be  a frequent  need  for  estimating  the  absolute  value  of  complex  line  integrals. 
The  basic  formula  is 


PROOF 


(13) 


f(z)  dz 


■_  ML 


(ML-inequality); 


Jc 

L is  the  length  of  C and  M a constant  such  that  \f(z)\  = M everywhere  on  C. 

Taking  the  absolute  value  in  (2)  and  applying  the  generalized  inequality  (6*)  in  Sec.  13.2, 
we  obtain 


\Sn\ 


n 

2 /(£m)  Atm 

m= 1 


m=l  m=l 


Now  | Azto|  is  the  length  of  the  chord  whose  endpoints  are  zm-i  and  zm  (see  Fig.  340). 
Hence  the  sum  on  the  right  represents  the  length  L*  of  the  broken  line  of  chords  whose 
endpoints  are  z0,  Zi,  ■ • ■ , zn  (=  2).  If  n approaches  infinity  in  such  a way  that  the  greatest 
| Atm\  and  thus  | Azm|  approach  zero,  then  L*  approaches  the  length  L of  the  curve  C,  by 
the  definition  of  the  length  of  a curve.  From  this  the  inequality  (13)  follows. 


We  cannot  see  from  (13)  how  close  to  the  bound  ML  the  actual  absolute  value  of  the 
integral  is,  but  this  will  be  no  handicap  in  applying  (13).  For  the  time  being  we  explain 
the  practical  use  of  (13)  by  a simple  example. 
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Estimation  of  an  Integral 


Fig.  344.  Path  in 
Example  8 


Find  an  upper  bound  for  the  absolute  value  of  the  integral 

dz , C the  straight-line  segment  from  0 to  1 + Fig.  344. 

Solution.  L = V2  and  |/(z)|  = \z2\  £ 2 on  C gives  by  (13) 


!dz 


s 2V2  = 2.8284. 


The  absolute  value  of  the  integral  is  | — § + §*[=§  V2  = 0.9428  (see  Example  1). 


Summary  on  Integration.  Line  integrals  of  f(z)  can  always  be  evaluated  by  (10),  using 
a representation  (1)  of  the  path  of  integration.  If/(z)  is  analytic,  indefinite  integration  by 
(9)  as  in  calculus  will  be  simpler  (proof  in  the  next  section). 


PR-QB^f^SFT=1~4^T 


1-10 


FIND  THE  PATH  and  sketch  it. 


1.  z(t)  = (1  + \i)t  (2  S t S 5) 

2.  z(t)  = 3 + i + (1  - i)t  (0S1S  3) 

3.  z(t)  = t + lit2  (1S(S2) 

4.  z(t)  = t + (1  - tfi  (-IS(SI) 

5.  z(t)  = 3 - i + Vl0e-it  (OSiS  2t r) 

6.  z(f)  = 1 + i + e (0  S t S 2) 

7.  z(t)  = 2 + 4e’rit/2  (0S1S2) 


8.  z{t)  = 5e_it  (0S1S  tt/2) 

9.  z(t)  = f + it 3 (-2  S f S 2) 

10.  z(t)  = 2 cos  t + i sin  t (0S(S  27r) 


11-20 


FIND  A PARAMETRIC  REPRESENTATION 


and  sketch  the  path. 

11.  Segment  from  (—1,  1)  to  (1,  3) 

12.  From  (0,  0)  to  (2,  1)  along  the  axes 

13.  Upper  half  of  |z  — 2 + z'|  =2  from  (4,  — 1)  to  (0,  — 1) 

14.  Unit  circle,  clockwise 

15.  x2  — 4y2  = 4,  the  branch  through  (2,  0) 

16.  Ellipse  4x2  + 9y2  = 36,  counterclockwise 

17.  \z  + a + ib | = r,  clockwise 

18.  y = l/x  from  (1,  1)  to  (5,  5) 

19.  Parabola  y = 1 - \x2  (-2  £ x £ 2) 

20.  4(x  - if  + 5{y  + l)2  = 20 


21-30 


INTEGRATION 


Integrate  by  the  first  method  or  state  why  it  does  not  apply 
and  use  the  second  method.  Show  the  details. 


21. 


Re  z dz,  C the  shortest  path  from  1 + i to  3 + 3 i 


22.  Rezcfe,  C the  parabola  y = 1 + g(x  — l)2  from 
■'c 

1 + i to  3 + 3 i 

23.  ez  dz,  C the  shortest  path  from  iri  to  liri 

■'c 

24.  cos  Izdz,  C the  semicircle  |z|  = 7T,  xSO  from 

■'C 

— 77  i to  77 1 

f 2 

25.  z exp  (z  ) dz,  C from  1 along  the  axes  to  i 
■'c 

26.  (z  + z_1)  dz,  C the  unit  circle,  counterclockwise 

■'c 

27.  sec2  z dz,  any  path  from  77/4  to  77(/4 

■'c 

28.  ( — o ) dz,  C the  circle  \z  — 2i|  = 4, 

Jc  Vz  - H (z  - 20  ) 

clockwise 

f 2 

29.  Im  z dz  counterclockwise  around  the  triangle  with 

■'c 

vertices  0,  I , / 

30.  Re  z2  (/z  clockwise  around  the  boundary  of  the  square 

■'C 

with  vertices  0,  i,  1 + ;,  1 

31.  CAS  PROJECT.  Integration.  Write  programs  for  the 
two  integration  methods.  Apply  them  to  problems  of 
your  choice.  Could  you  make  them  into  a joint  program 
that  also  decides  which  of  the  two  methods  to  use  in  a 
given  case? 


c 
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32.  Sense  reversal.  Verify  (5)  for  f(z)  = z2 *,  where  C is 
the  segment  from  — 1 — i to  1 + i. 

33.  Path  partitioning.  Verify  (6)  for  f{z)  = 1 /z  and  Ci 
and  C2  the  upper  and  lower  halves  of  the  unit  circle. 

34.  TEAM  EXPERIMENT.  Integration,  (a)  Comparison. 

First  write  a short  report  comparing  the  essential  points 
of  the  two  integration  methods. 

(b)  Comparison.  Evaluate  f(z)  dz  by  Theorem  1 

■'c 

and  check  the  result  by  Theorem  2,  where: 

(i)  f(z)  — z4  and  C is  the  semicircle  |z|  =2  from 
—2 i to  2 i in  the  right  half-plane, 


(ii)  f(z)  = e2z  and  C is  the  shortest  path  from  0 to 
1 + 2 i. 

(c)  Continuous  deformation  of  path.  Experiment 
with  a family  of  paths  with  common  endpoints,  say, 
z(t)  = t + ia  sin  t,  0 S t = 7 r,  with  real  parameter  a. 
Integrate  nonanalytic  functions  (Re  z.  Re  (z2),  etc.)  and 
explore  how  the  result  depends  on  a.  Then  take  analytic 
functions  of  your  choice.  (Show  the  details  of  your 
work.)  Compare  and  comment. 

(d)  Continuous  deformation  of  path.  Choose  another 
family,  for  example,  semi-ellipses  z(t)  = a cos  t + 
i sin  t , — 7t/2  StS  7t/2,  and  experiment  as  in  (c). 

35.  ML-inequality.  Find  an  upper  bound  of  the  absolute 
value  of  the  integral  in  Prob.  21. 


14.1  Cauchys  Integral  Theorem 

This  section  is  the  focal  point  of  the  chapter.  We  have  just  seen  in  Sec.  14.1  that  a line 
integral  of  a function /(z)  generally  depends  not  merely  on  the  endpoints  of  the  path,  but 
also  on  the  choice  of  the  path  itself.  This  dependence  often  complicates  situations.  Hence 
conditions  under  which  this  does  not  occur  are  of  considerable  importance.  Namely,  if 
f(z)  is  analytic  in  a domain  I)  and  D is  simply  connected  (see  Sec.  14.1  and  also  below), 
then  the  integral  will  not  depend  on  the  choice  of  a path  between  given  points.  This  result 
(Theorem  2)  follows  from  Cauchy’s  integral  theorem,  along  with  other  basic  consequences 
that  make  Cauchy’s  integral  theorem  the  most  important  theorem  in  this  chapter  and 
fundamental  throughout  complex  analysis. 

Let  us  continue  our  discussion  of  simple  connectedness  which  we  started  in  Sec.  14.1. 

1.  A simple  closed  path  is  a closed  path  (defined  in  Sec.  14.1)  that  does  not  intersect 
or  touch  itself  as  shown  in  Fig.  345.  For  example,  a circle  is  simple,  but  a curve 
shaped  like  an  8 is  not  simple. 


Simple 


Simple  Not  simple 

Fig.  345.  Closed  paths 


Not  simple 


2.  A simply  connected  domain  D in  the  complex  plane  is  a domain  (Sec.  13.3)  such 

that  every  simple  closed  path  in  D encloses  only  points  of  D.  Examples:  The  interior 
of  a circle  (“open  disk”),  ellipse,  or  any  simple  closed  curve.  A domain  that  is  not 
simply  connected  is  called  multiply  connected.  Examples:  An  annulus  (Sec.  13.3), 
a disk  without  the  center,  for  example,  0 < |z|  < 1.  See  also  Fig.  346. 

More  precisely,  a bounded  domain  D (that  is,  a domain  that  lies  entirely  in  some 
circle  about  the  origin)  is  called  p-fold  connected  if  its  boundary  consists  of  p closed 
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THEOREM  1 


EXAMPLE  1 


EXAMPLE  2 


Simply 

Simply 

Doubly 

Triply 

connected 

connected 

connected 

connected 

Fig.  346.  Simply  and  multiply  connected  domains 


connected  sets  without  common  points.  These  sets  can  be  curves,  segments,  or  single 
points  (such  as  z = OforO  < |z|  < 1,  for  which/?  = 2).  Thus,  D has p — 1 “holes,” 
where  “hole”  may  also  mean  a segment  or  even  a single  point.  Hence  an  annulus 
is  doubly  connected  ( p = 2). 


Cauchy’s  Integral  Theorem 

Iffiz)  is  analytic  in  a simply  connected  domain  D,  then  for  every  simple  closed  path 
C in  D, 

(1)  < 

. 

> f(z)  dz  = 0.  See  Fig.  347. 

c 

Fig.  347.  Cauchy’s  integral  theorem 


Before  we  prove  the  theorem,  let  us  consider  some  examples  in  order  to  really  understand 
what  is  going  on.  A simple  closed  path  is  sometimes  called  a contour  and  an  integral  over 
such  a path  a contour  integral.  Thus,  (1)  and  our  examples  involve  contour  integrals. 


Entire  Functions 


ez  dz  = 0,  cos  zdz  = 0,  f zn  dz  = 0 (n  = 0,  1,  • • • ) 


for  any  closed  path,  since  these  functions  are  entire  (analytic  for  all  z )■ 


Points  Outside  the  Contour  Where  f(x)  is  Not  Analytic 

f f dz 

<p  sec  zdz  = 0,  <f  = 0 

Jc  Jc  z2  + 4 

where  C is  the  unit  circle,  sec  z = 1/cos  z is  not  analytic  at  ~ ±7r/2,  ±3tt/2,  • •• , but  all  these  points  lie 

outside  C;  none  lies  on  C or  inside  C.  Similarly  for  the  second  integral,  whose  integrand  is  not  analytic  at 
Z = ±2 i outside  C.  I 
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EXAMPLE  3 


EXAMPLE  4 


EXAMPLE  5 


PROOF 


Nonanalytic  Function 


zdz  = I e dt  = 2i ri 


where  C:  z(t)  = ea  is  the  unit  circle.  This  does  not  contradict  Cauchy’s  theorem  because /(z)  = z is  not 
analytic. 


Analyticity  Sufficient,  Not  Necessary 


dz 

— = 0 

z2 

where  C is  the  unit  circle.  This  result  does  not  follow  from  Cauchy's  theorem,  because/(z)  = 1/z2  is  not  analytic 
atz  = 0.  Hence  the  condition  that  f be  analytic  in  D is  sufficient  rather  than  necessary  for  (1)  to  be  true. 

Simple  Connectedness  Essential 


dz 

= 2’77‘t 

z 

for  counterclockwise  integration  around  the  unit  circle  (see  Sec.  14.1).  C lies  in  the  annulus  | < |z|  < § where 
1/z  is  analytic,  but  this  domain  is  not  simply  connected,  so  that  Cauchy’s  theorem  cannot  be  applied.  Hence  the 
condition  that  the  domain  D be  simply  connected  is  essential. 

In  other  words,  by  Cauchy’s  theorem,  if/(z)  is  analytic  on  a simple  closed  path  C and  everywhere  inside  C, 
with  no  exception,  not  even  a single  point,  then  (1)  holds.  The  point  that  causes  trouble  here  is  z = 0 where  1/z 
is  not  analytic. 


Cauchy  proved  his  integral  theorem  under  the  additional  assumption  that  the  derivative 
/ ( z ) is  continuous  (which  is  true,  but  would  need  an  extra  proof).  His  proof  proceeds  as 
follows.  From  (8)  in  Sec.  14.1  we  have 


°/(z)  dz 
Jc 


(>  (u  dx  — v dy)  + i <>  (u  dy  + v dx). 

■'c  ■'c 


Since  f(z)  is  analytic  in  D,  its  derivative  f’iz)  exists  in  D.  Since  f\z.)  is  assumed  to  be 
continuous,  (4)  and  (5)  in  Sec.  13.4  imply  that  u and  v have  continuous  partial  derivatives 
in  D.  Hence  Green's  theorem  (Sec.  10.4)  (with  u and  — v instead  of  h\  and  /*2)  is  applicable 
and  gives 


<>  (u  dx  — v dy)  = 

Jc  R 


dv 

dx 


dll 

dy 


dx  dy 


where  R is  the  region  bounded  by  C.  The  second  Cauchy-Riemann  equation  (Sec.  13.4) 
shows  that  the  integrand  on  the  right  is  identically  zero.  Hence  the  integral  on  the  left  is 
zero.  In  the  same  fashion  it  follows  by  the  use  of  the  first  Cauchy-Riemann  equation  that 
the  last  integral  in  the  above  formula  is  zero.  This  completes  Cauchy’s  proof. 


Goursat’s  proof  without  the  condition  that  f\z)  is  continuous 1 is  much  more  complicated. 
We  leave  it  optional  and  include  it  in  App.  4. 


^DOUARD  GOURSAT  (1858-1936),  French  mathematician  who  made  important  contributions  to  complex 
analysis  and  PDEs.  Cauchy  published  the  theorem  in  1 825.  The  removal  of  that  condition  by  Goursat  (see  Transactions 
Amer.  Math  Soc.,  vol.  1,  1900)  is  quite  important  because,  for  instance,  derivatives  of  analytic  functions  are  also 
analytic.  Because  of  this,  Cauchy’s  integral  theorem  is  also  called  Cauchy-Goursat  theorem. 
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THEOREM  2 


PROOF 


Independence  of  Path 

We  know  from  the  preceding  section  that  the  value  of  a line  integral  of  a given  function 
f(z)  from  a point  z \ to  a point  zi  will  in  general  depend  on  the  path  C over  which  we 
integrate,  not  merely  on  zi  and  zi-  It  is  important  to  characterize  situations  in  which  this 
difficulty  of  path  dependence  does  not  occur.  This  task  suggests  the  following  concept. 
We  call  an  integral  of  f(z)  independent  of  path  in  a domain  D if  for  every  z i,  z-2  in  D 
its  value  depends  (besides  on  f(z),  of  course)  only  on  the  initial  point  z i and  the  terminal 
point  Z2 , but  not  on  the  choice  of  the  path  C in  I)  [so  that  every  path  in  D from  - ] to  ~2 
gives  the  same  value  of  the  integral  of/(z)]. 


Independence  of  Path 

If  f(z)  is  analytic  in  a simply  connected  domain  D,  then  the  integral  of  f(z)  is 
independent  of  path  in  D. 


Let  z\  and  z2  be  any  points  in  I).  Consider  two  paths  C i and  C2  in  D from  z±  to  z2  without 
further  common  points,  as  in  Fig.  348.  Denote  by  C2  the  path  C2  with  the  orientation 
reversed  (Fig.  349).  Integrate  from  z.\  over  C \ to  z2  and  over  C2  back  to  z,\ . This  is  a 
simple  closed  path,  and  Cauchy’s  theorem  applies  under  our  assumptions  of  the  present 
theorem  and  gives  zero: 


(2') 


fdz  + 


fdz  = 0,  thus 


fdz 


fdz. 


J •'/-i*  *!/"1  •'pj! 

W W2  Cj  C2 

But  the  minus  sign  on  the  right  disappears  if  we  integrate  in  the  reverse  direction,  from 
z i to  "2,  which  shows  that  the  integrals  of  /(z)  over  C\  and  C2  are  equal, 


(2) 


f(z)  dz  = f(z)  dz 


Jc1 


(Fig.  348). 


This  proves  the  theorem  for  paths  that  have  only  the  endpoints  in  common.  For  paths  that 
have  finitely  many  further  common  points,  apply  the  present  argument  to  each  “loop” 
(portions  of  C\  and  C2  between  consecutive  common  points;  four  loops  in  Fig.  350).  For 
paths  with  infinitely  many  common  points  we  would  need  additional  argumentation  not 
to  be  presented  here. 


F'g-  348,  Formula  (2) 


Fig.  349.  Formula  (2,J 


Fig.  350.  Paths  with  more 
common  points 
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EXAMPLE  6 


THEOREM  3 


PROOF 


Principle  of  Deformation  of  Path 

This  idea  is  related  to  path  independence.  We  may  imagine  that  the  path  C2  in  (2)  was 
obtained  from  C\  by  continuously  moving  Cf  (with  ends  fixed!)  until  it  coincides  with 
C2.  Figure  351  shows  two  of  the  infinitely  many  intermediate  paths  for  which  the  integral 
always  retains  its  value  (because  of  Theorem  2).  Hence  we  may  impose  a continuous 
deformation  of  the  path  of  an  integral,  keeping  the  ends  fixed.  As  long  as  our  deforming 
path  always  contains  only  points  at  which  /(z)  is  analytic,  the  integral  retains  the  same 
value.  This  is  called  the  principle  of  deformation  of  path. 


Fig.  351.  Continuous  deformation  of  path 

A Basic  Result:  Integral  of  Integer  Powers 

From  Example  6 in  Sec.  14.1  and  the  principle  of  deformation  of  path  it  follows  that 

c (2iri  (m  = — 1) 

(3)  nz-z0)mdz  = \ 

J t 0 (m  =£  — 1 and  integer) 

for  counterclockwise  integration  around  any  simple  closed  path  containing  Zq  in  its  interior. 

Indeed,  the  circle  \z  — Zol  = P in  Example  6 of  Sec.  14.1  can  be  continuously  deformed  in  two  steps  into  a path 
as  just  indicated,  namely,  by  first  deforming,  say,  one  semicircle  and  then  the  other  one.  (Make  a sketch). 


Existence  of  Indefinite  Integral 

We  shall  now  justify  our  indefinite  integration  method  in  the  preceding  section  [formula 
(9)  in  Sec.  14.1],  The  proof  will  need  Cauchy’s  integral  theorem. 


Existence  of  Indefinite  Integral 

If  f(z)  is  analytic  in  a simply  connected  domain  D,  then  there  exists  an  indefinite 
integral  F(z)  of  f(z)  in  D — thus,  F ( z ) = f(z) — which  is  analytic  in  D,  and  for  all 
paths  in  D joining  any  two  points  Zo  ond  Si  in  A the  integral  off(z)from  Zo  to  Zi 
can  be  evaluated  by  formula  (9)  in  Sec.  14.1. 


The  conditions  of  Cauchy’s  integral  theorem  are  satisfied.  Hence  the  line  integral  of  f(z) 
from  any  z0  in  D to  any  z in  I)  is  independent  of  path  in  D.  We  keep  z0  fixed.  Then  this 
integral  becomes  a function  of  z,  call  if  F(z), 


(4) 


F(z) 


f(z*)  dz* 
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which  is  uniquely  determined.  We  show  that  this  F(z)  is  analytic  in  D and  F\z)  = f(z). 
The  idea  of  doing  this  is  as  follows.  Using  (4)  we  form  the  difference  quotient 


(5) 


F(z  + Az)  - F(z) 

A z 


A z 


rZ+Az 

Az*)  dz* 


f(z*)  dz* 


A z 


rZ+Az 

Az*)  dz*. 


We  now  subtract /(z)  from  (5)  and  show  that  the  resulting  expression  approaches  zero  as 
Az  — » 0.  The  details  are  as  follows. 

We  keep  z fixed.  Then  we  choose  z + Az  in  D so  that  the  whole  segment  with  endpoints 
Z and  z + Az  is  in  D (Fig.  352).  This  can  be  done  because  /lisa  domain,  hence  it  contains 
a neighborhood  of  z.  We  use  this  segment  as  the  path  of  integration  in  (5).  Now  we  subtract 
/(z).  This  is  a constant  because  z is  kept  fixed.  Hence  we  can  write 


rz+Az 


/(z)  dz*  = f(z) 


z+Az 


dz*  = Az)  Az. 


Thus 


Az)  = — 
Az  J 


z+Az 


Az)  dz*. 


By  this  trick  and  from  (5)  we  get  a single  integral: 


F(z  + Az)  - F(z) 

Az 


- Az) 


Az 


rZ+AZ 

[Az*)  ~Az)]dz*. 


Since  /(z)  is  analytic,  it  is  continuous  (see  Team  Project  (24d)  in  Sec.  13.3).  An  e > 0 
being  given,  we  can  thus  find  a S > 0 such  that  |/(z*)  — /(z)|  < e when  |z*  — z|  < 8. 
Hence,  letting  | Az|  < 5,  we  see  that  the  ;V/ /.-inequality  (Sec.  14.1)  yields 


F(z  + Az)  - F(z) 

Az 


- Az) 


l 

I Az| 


rZ+Az 

U(Z*)  - Az)]  dz* 


e|  Az| 


6. 


By  the  definition  of  limit  and  derivative,  this  proves  that 


F\z) 


F(z  + Az)  - F(z) 

lim  

Az 


Az)- 


Since  z is  any  point  in  D.  this  implies  that  F(z)  is  analytic  in  D and  is  an  indefinite  integral 
or  antiderivative  of  /(z)  in  /),  written 


F(z) 


Az)  dz. 


Fig.  352.  Path  of  integration 
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Also,  if  G'(z)  = /(z),  then  F'(z)  — G'(z)  = 0 in  D\  hence  F{z)  — G(z)  is  constant  in  D 
(see  Team  Project  30  in  Problem  Set  13.4).  That  is,  two  indefinite  integrals  of  f(z)  can 
differ  only  by  a constant.  The  latter  drops  out  in  (9)  of  Sec.  14.1,  so  that  we  can  use  any 
indefinite  integral  of/(z).  This  proves  Theorem  3. 


Cauchy’s  Integral  Theorem 
for  Multiply  Connected  Domains 

Cauchy’s  theorem  applies  to  multiply  connected  domains.  We  first  explain  this  for  a 
doubly  connected  domain  D with  outer  boundary  curve  C\  and  inner  Ci  (Fig.  353).  If 
a function  /(z)  is  analytic  in  any  domain  D*  that  contains  D and  its  boundary  curves,  we 
claim  that 


(6) 


° f(z)  dz,  = <>  f(z)dz 
Jc,  Jc2 


(Fig.  353) 


both  integrals  being  taken  counterclockwise  (or  both  clockwise,  and  regardless  of  whether 
or  not  the  full  interior  of  C2  belongs  to  D*). 


PROOF  By  two  cuts  Ci  and  C2  (Fig.  354)  we  cut  D into  two  simply  connected  domains  D\  and 
£>2  in  which  and  on  whose  boundaries /(z)  is  analytic.  By  Cauchy’s  integral  theorem  the 
integral  over  the  entire  boundary  of  Z>i  (taken  in  the  sense  of  the  arrows  in  Fig.  354)  is 
zero,  and  so  is  the  integral  over  the  boundary  of  D2,  and  thus  their  sum.  In  this  sum  the 
integrals  over  the  cuts  Ci  and  C2  cancel  because  we  integrate  over  them  in  both 
directions — this  is  the  key — and  we  are  left  with  the  integrals  over  Ci  (counterclockwise) 
and  C2  (clockwise;  see  Fig.  354);  hence  by  reversing  the  integration  over  C2  (to 
counterclockwise)  we  have 


<>  fdz  ~ 0 

-'Ci 


fdz  = 0 


and  (6)  follows. 


For  domains  of  higher  connectivity  the  idea  remains  the  same.  Thus,  for  a triply  connected 
domain  we  use  three  cuts  C\,  C2,  C3  (Fig.  355).  Adding  integrals  as  before,  the  integrals 
over  the  cuts  cancel  and  the  sum  of  the  integrals  over  Ci  (counterclockwise)  and  C2,  C3 
(clockwise)  is  zero.  Hence  the  integral  over  C\  equals  the  sum  of  the  integrals  over  C2 
and  C3,  all  three  now  taken  counterclockwise.  Similarly  for  quadruply  connected  domains, 
and  so  on. 
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Fig.  354.  Doubly  connected  domain 


Fig.  355.  Triply  connected  domain 


1-8 


COMMENTS  ON  TEXT  AND  EXAMPLES 


1.  Cauchy’s  Integral  Theorem.  Verify  Theorem  1 for 
the  integral  of  z2  over  the  boundary  of  the  square  with 
vertices  ±1  ± i.  Hint.  Use  deformation. 


2.  For  what  contours  C will  it  follow  from  Theorem  1 that 


(a) 


(b) 


f exp(l/z2) 


0? 


3.  Deformation  principle.  Can  we  conclude  from 
Example  4 that  the  integral  is  also  zero  over  the  contour 
in  Prob.  1? 

4.  If  the  integral  of  a function  over  the  unit  circle  equals 
2 and  over  the  circle  of  radius  3 equals  6,  can  the 
function  be  analytic  everywhere  in  the  annulus 

l < Id  < 3? 

5.  Connectedness.  What  is  the  connectedness  of  the 
domain  in  which  (cos  z2)/(z4  + 1)  is  analytic? 

6.  Path  independence.  Verify  Theorem  2 for  the  integral 
of  ez  from  0 to  1 + i (a)  over  the  shortest  path  and 
(b)  over  the  x-axis  to  1 and  then  straight  up  to  1 + i. 

7.  Deformation.  Can  we  conclude  in  Example  2 that 
the  integral  of  l/(z2  + 4)  over  (a)  |z  — 2|  =2  and 
(b)  |z  — 2|  = 3 is  zero? 

8.  TEAM  EXPERIMENT.  Cauchy’s  Integral  Theorem. 

(a)  Main  Aspects.  Each  of  the  problems  in  Examples 
1-5  explains  a basic  fact  in  connection  with  Cauchy’s 
theorem.  Find  five  examples  of  your  own,  more 
complicated  ones  if  possible,  each  illustrating  one  of 
those  facts. 

(b)  Partial  fractions.  Write  /(z)  in  terms  of  partial 
fractions  and  integrate  it  counterclockwise  over  the  unit 
circle,  where 


(i) 


2z  + 3/ 

/(*)  = 2 , 1 
Z + 4 


(ii)  f(z) 


z + 1 
z2  + 2z  ' 


(c)  Deformation  of  path.  Review  (c)  and  (d)  of  Team 
Project  34,  Sec.  14. 1 , in  the  light  of  the  principle  of  defor- 
mation of  path.  Then  consider  another  family  of  paths 


with  common  endpoints,  say,  z(f)  — t + ia{t  — f2), 
OStS  1,  a a real  constant,  and  experiment  with  the 
integration  of  analytic  and  nonanalytic  functions  of 
your  choice  over  these  paths  (e.g.,  z,  Im  z,  z2.  Re  z2, 
Im  z2,  etc.). 


9-19 


CAUCHY’S  THEOREM  APPLICABLE? 


Integrate  /(z)  counterclockwise  around  the  unit  circle. 
Indicate  whether  Cauchy’s  integral  theorem  applies.  Show 


the  details. 

9.  /(z)  = exp  (-z2) 
11.  /(z)  = l/(2z  - 1) 
13.  /(z)  = l/(z4  - 1.1) 
15.  /(z)  = Im  z 
17.  /(z)  = l/|z|2 
19.  /(z)  = z3  cot  z 


10.  /(z)  = tan  \z 
12.  f(z)  = S3 
14.  /(z)  = 1/z 
16.  /(Z)  = 1/(7TZ  - 1) 
18.  /(z)  = l/(4z  - 3) 


20-30 


FURTHER  CONTOUR  INTEGRALS 


Evaluate  the  integral.  Does  Cauchy's  theorem  apply?  Show 
details. 


20.  <j>  Ln  (1  — z)  dz,  C the  boundary  of  the  parallelogram 

J c 

with  vertices  ±i.  ±(1  + i). 


Use  partial  fractions. 
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Use  partial  fractions. 


25. 


C consists  of  |z|  =2  counterclockwise  and 


|z|  = 1 clockwise. 


26. 


o cothgzcfe,  C the  circle  \z 
■c 


I'fn'l 


1 clockwise. 


27.  — - — dz,  C consists  of  |z|  = 1 counterclockwise 
Jc  2 

and  | z | =3  clockwise. 

I tan 

28.  — dz,  C the  boundary  of  the  square  with 

vertices  ±1,  ±i  clockwise. 

f sin  z . . 

29.  9 dz,  C:  |z  — 4 — 2i\  = 5.5  clockwise. 

]c  z + 2 iz 

f 2z3  + z2  + 4 

30.  9 — dz,  C:  | z — 2 1 =4  clockwise.  Use 

k z4  + 4z2 

partial  fractions. 


Cauchys  Integral  Formula 

Cauchy’s  integral  theorem  leads  to  Cauchy’s  integral  formula.  This  formula  is  useful  for 
evaluating  integrals  as  shown  in  this  section.  It  has  other  important  roles,  such  as  in  proving 
the  surprising  fact  that  analytic  functions  have  derivatives  of  all  orders,  as  shown  in  the 
next  section,  and  in  showing  that  all  analytic  functions  have  a Taylor  series  representation 
(to  be  seen  in  Sec.  15.4). 


THEOREM  1 


Cauchy’s  Integral  Formula 

Let  f{z)  be  analytic  in  a simply  connected  domain  D.  Then  for  any  point  Zo  in  D 
and  any  simple  closed  path  C in  D that  encloses  zo  (Fig-  356), 


(1) 


m 

i ~ z0 


dz  = 2tt  if  {zo) 


(Cauchy’s  integral  formula) 


the  integration  being  taken  counterclockwise.  Alternatively  (for  representing /(zo) 
by  a contour  integral,  divide  (1)  by  277;), 


(1*) 


/feo)  = ^ + 


m 

Z - Zo 


dz 


(Cauchy’s  integral  formula). 


PROOF  By  addition  and  subtraction,  f(z)  — fizo)  + [ /(z)  ~ f(zo)]-  Inserting  this  into  (1)  on  the 
left  and  taking  the  constant  factor /(z0)  out  from  under  the  integral  sign,  we  have 


(2) 


m 


— dz=f(.zo)$  — 


dz 


Zo 


+ O 


f(z)  -fizo) 


Z - Zo 


dz. 


The  first  term  on  the  right  equals /(z0)  • 277 which  follows  from  Example  6 in  Sec.  14.2 
with  m = — 1 . If  we  can  show  that  the  second  integral  on  the  right  is  zero,  then  it  would 
prove  the  theorem.  Indeed,  we  can.  The  integrand  of  the  second  integral  is  analytic,  except 
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EXAMPLE  1 


EXAMPLE  2 


EXAMPLE  3 


at  zo-  Hence,  by  (6)  in  Sec.  14.2,  we  can  replace  C by  a small  circle  K of  radius  p and 
center  z0  (Fig.  357),  without  altering  the  value  of  the  integral.  Since /(z)  is  analytic,  it  is 
continuous  (Team  Project  24,  Sec.  13.3).  Hence,  an  e > 0 being  given,  we  can  find  a 
8 > 0 such  that  |/(z)  — /(zo)l  < e for  all  z in  the  disk  |z  — Zol  <8.  Choosing  the  radius 
p of  K smaller  than  8,  we  thus  have  the  inequality 


Fig.  356.  Cauchy’s  integral  formula  Fig.  357.  Proof  of  Cauchy’s  integral  formula 


m -f(z  o) 


e 

< - 

P 


Z - Zo 

at  each  point  of  K.  The  length  of  K is  277p.  Hence,  by  the  ML- inequality  in  Sec.  14.1, 

' f(z)  - /(zo) 


JK 


z - Zo 


dz 


< — 27 Tp  = 2776. 
P P 


Since  e (>  0)  can  be  chosen  arbitrarily  small,  it  follows  that  the  last  integral  in  (2)  must 
have  the  value  zero,  and  the  theorem  is  proved.  ■ 


Cauchy’s  Integral  Formula 


z ~ 2 


- dz  = 2iriez 


2t Tie2  = 46.4268r 


for  any  contour  enclosing  z0  = 2 (since  ez  is  entire),  and  zero  for  any  contour  for  which  z0  = 2 lies  outside 
(by  Cauchy’s  integral  theorem). 


Cauchy’s  Integral  Formula 


= 2T7i[|z3-3]|z=i/2 

77 

= 677/ 

8 


(Zo  = h inside  C). 


Integration  Around  Different  Contours 

Integrate 

Z2  + 1 z2  + 1 

g(z)  = = 

Z - 1 (z  + 1)(Z  - 1) 


counterclockwise  around  each  of  the  four  circles  in  Fig.  358. 
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Solution.  g{z)  is  not  analytic  at  —1  and  1.  These  are  the  points  we  have  to  watch  for.  We  consider  each 
circle  separately. 

(a)  The  circle  \z  — l|  = 1 encloses  the  point  zo  = 1 where  g(z)  is  not  analytic.  Hence  in  (1)  we  have  to 
write 


g(z)  = 


z2  + 1 1 

z + 1 Z - 1 


thus 


m = 


2 , i 

Z + 1 

z + 1 


and  (1)  gives 


+ 1 
- 1 


dz  = 2t7;/(1)  = 2t n 


z + 1 
z + 1 


2tz/. 


(b)  gives  the  same  as  (a)  by  the  principle  of  deformation  of  path. 

(c)  The  function  g(z)  is  as  before,  but/(z)  changes  because  we  must  take  Zo  = — 1 (instead  of  1).  This  gives 
a factor  z — Zo  = Z + 1 in  (1).  Hence  we  must  write 


g(z)  = 


Z2  + 1 1 

Z - 1 z + 1 ’ 


thus 


/(z)  = 


Z - 1 


Compare  this  for  a minute  with  the  previous  expression  and  then  go  on: 


2 i 

cZ  - 1 


dz  = 27n/(— 1)  = liri 


+ 1 


Z - 1 


= — 2l7(. 


(d)  gives  0.  Why? 


Fig.  358.  Example  3 


Multiply  connected  domains  can  be  handled  as  in  Sec.  14.2.  For  instance,  if  f(z)  is 
analytic  on  C\  and  C2  and  in  the  ring-shaped  domain  bounded  by  C\  and  C2  (Fig.  359) 
and  Zo  is  anY  point  in  that  domain,  then 


/feo)  = ^ t 


1 


^dZ  + -L. 

Z - zo  2.TTI  J 


m 


dz, 


(3) 


c, 


C, 


Z - Zo 
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where  the  outer  integral  (over  C i)  is  taken  counterclockwise  and  the  inner  clockwise,  as 
indicated  in  Fig.  359. 


Fig.  359.  Formula  (3) 


1-4 


CONTOUR  INTEGRATION 


Integrate  z2/(z2  ~ 
around  the  circle. 
1.  | + 1|  = 1 
3.  |z  + i\  = 1.4 


1)  by  Cauchy’s  formula  counterclockwise 

2.  |z  - 1 - i\  = tt/2 
4.  |z  + 5 — 5/|  = 7 


5-8 


Integrate  the  given  function  around  the  unit  circle. 


5.  (cos  3 z)/ (6 z)  6.  e2z/ (ttz  ~ i) 

7.  z3/(2z  ~ i ) 8.  (z2  sin  z)/(4z  - 1) 


9.  CAS  EXPERIMENT.  Experiment  to  find  out  to  what 
extent  your  CAS  can  do  contour  integration.  For  this, 
use  (a)  the  second  method  in  Sec.  14. 1 and  (b)  Cauchy’s 
integral  formula. 


10.  TEAM  PROJECT.  Cauchy’s  Integral  Theorem. 

Gain  additional  insight  into  the  proof  of  Cauchy’s 
integral  theorem  by  producing  (2)  with  a contour 
enclosing  zo  (as  in  Fig-  356)  and  taking  the  limit  as  in 
the  text.  Choose 


(a) 


(b) 


sin  z 


and  (c)  another  example  of  your  choice. 


11-19 


FURTHER  CONTOUR  INTEGRALS 


Integrate  counterclockwise  or  as  indicated.  Show  the 
details. 


11.  I f"  , C:  4*  2 + (y  - 2)2  = 4 

JC  z + 4 

f z 

12.  9 — dz , C the  circle  with  center  — 1 and 

Jc  z2  + 4z  + 3 

radius  2 


f z + 2 

13.  -dz,  C:\z~  \\  =2 

]cz~2 

14.  | ^ dz,  C : |z|  = 0.6 

Jc  ze2  - 2 iz 


i cosh  (z2  — 7 ri) 

15.  ® dz,  C the  boundary  of  the  square 

Jc  z - 7 ri 

with  vertices  ±2,  ±2,  ±4j. 

I tan  z 

16.  — — dz,  C the  boundary  of  the  triangle  with 

Jc  z - i 

vertices  0 and  ± 1 + 2 i. 


17. 


f Ln  (z  + 1)  , 

t 2,*’ 

JC  Z + 1 


C:  |z  - f|  = 1.4 


18.  <p  dz,  C consists  of  the  boundaries  of  the 

Jc  4z2  - 8iz 

squares  with  vertices  ±3,  ±3;  counterclockwise  and 
±1,  ±i  clockwise  (see  figure). 


19.  9 — dz,  C consists  of  |z|  =2  counter- 

Jc  z2(z  -1-0 

clockwise  and  |z|  = 1 clockwise. 

20.  Show  that  <P  (z  — zi)_1(z  — Z2 )_1  dz  = 0 for  a simple 

Jc 

closed  path  C enclosing  zi  and  z2,  which  are 
arbitrary. 
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14.4  Derivatives  of  Analytic  Functions 

As  mentioned,  a surprising  fact  is  that  complex  analytic  functions  have  derivatives  of  all 
orders.  This  differs  completely  from  real  calculus.  Even  if  a real  function  is  once 
differentiable  we  cannot  conclude  that  it  is  twice  differentiable  nor  that  any  of  its  higher 
derivatives  exist.  This  makes  the  behavior  of  complex  analytic  functions  simpler  than  real 
functions  in  this  aspect.  To  prove  the  surprising  fact  we  use  Cauchy’s  integral  formula. 


THEOREM  1 


Derivatives  of  an  Analytic  Function 

Iff(z ) is  analytic  in  a domain  D,  then  it  has  derivatives  of  all  orders  in  D,  which 
are  then  also  analytic  functions  in  D.  The  values  of  these  derivatives  at  a point  zo 
in  D are  given  by  the  formulas 


(1  ) 


f'(zo)  = C) 


277 i Jr 


f(z) 

(z  - Zof 


dz 


a") 


and  in  general 

(1) 


„ 2!  f /(z) 

/ (Z o)  = — <>  7 73* 

277 l J (z  - Zo ) 


fn\z0)  = yr~.  f ~ : ' w+i  * 

Jr  (Z  Z0) 


/(z) 


(n  = 1,2,--); 


here  C is  any  simple  closed  path  in  D that  encloses  z o and  whose  full  interior  belongs 
to  D;  and  we  integrate  counterclockwise  around  C (Fig.  360). 


Fig.  360.  Theorem  1 and  its  proof 


COMMENT.  For  memorizing  (1),  it  is  useful  to  observe  that  these  formulas  are  obtained 
formally  by  differentiating  the  Cauchy  formula  (1*),  Sec.  14.3,  under  the  integral  sign 
with  respect  to  Zo- 

PROOF  We  prove  (1,)>  starting  from  the  definition  of  the  derivative 


/'(zo) 


lim 

Az— >0 


f(z0  + a z)  - f(z  o) 

Az 
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On  the  right  we  represent /(zo  + Az)  and  f(zo)  by  Cauchy’s  integral  formula: 


f(z o + Az)  - f(z o)  _ 1 

Az  27t/Az 


m 

z - (z0  + Az) 


dz 


m 


dz 


z ~ Zo 


We  now  write  the  two  integrals  as  a single  integral.  Taking  the  common  denominator 
gives  the  numerator /(z){z  — Zo  — [z  — (zo  + Az)] } = /(z)  Az,  so  that  a factor  Az  drops 
out  and  we  get 


/(zo  + Az) -/(zo)  1 f /(z) 

= (>  dz. 

Az  277/  Jc  (z  - z0  - Az)(z  - Zo) 

Clearly,  we  can  now  establish  (X)  by  showing  that,  as  Az- *0,  the  integral  on  the  right 
approaches  the  integral  in  (1  ).  To  do  this,  we  consider  the  difference  between  these  two 
integrals.  We  can  write  this  difference  as  a single  integral  by  taking  the  common 
denominator  and  simplifying  the  numerator  (as  just  before).  This  gives 


o 

Jc 


/(z) 

(z  - Zo  - Az)(z  - Zo) 


dz  - 


o 

■*c 


m 

(z  — Zo)2 


dz 


o 

Jc 


/(z)  Az 

(z  - z0  - Az)(z  - z0)2 


dz. 


We  show  by  the  ML-inequality  (Sec.  14.1)  that  the  integral  on  the  right  approaches  zero 
as  Az— >0. 

Being  analytic,  the  function  /(z)  is  continuous  on  C,  hence  bounded  in  absolute  value, 
say,  \f(z)\  = K.  Let  d be  the  smallest  distance  from  zo  to  the  points  of  C (see  Fig.  360). 
Then  for  all  z on  C, 


I z - zo  1 2 = d2,  hence 


lz  - zol2  d2 


Furthermore,  by  the  triangle  inequality  for  all  z on  C we  then  also  have 

d g |z  - z0l  = lz  - Zo  “ Az  + Az|  S |z  - z0  - Az|  + |Az|. 

We  now  subtract  |Az|  on  both  sides  and  let  |Az|  = d/2,  so  that  — |Az|  g —d/2.  Then 


\d^  d - | Az|  = lz 


zo 


Az|.  Flence 


1 


lz  - Zo  - Azl  d ' 
Let  L be  the  length  of  C.  If  | Azl  = d/2 , then  by  the  ML-inequality 


o 

Jc 


/(z)  Az 

(z  - zo  - Az)(z  - zo)2 


dz 


KL  | Az|  - • 

d d2 


This  approaches  zero  as  Az— >0.  Formula  (]')  is  proved. 

Note  that  we  used  Cauchy’s  integral  formula  (1*),  Sec.  14.3,  but  if  all  we  had  known 
about  /(zo)  is  the  fact  that  it  can  be  represented  by  (1*),  Sec.  14.3,  our  argument  would 
have  established  the  existence  of  the  derivative  / (z0)  of /(z).  This  is  essential  to  the 
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EXAMPLE  1 


EXAMPLE  2 


EXAMPLE  3 


THEOREM  2 


continuation  and  completion  of  this  proof,  because  it  implies  that  (l ")  can  be  proved  by 
a similar  argument,  with  / replaced  by  / , and  that  the  general  formula  (1)  follows  by 
induction.  ■ 

Applications  of  Theorem  1 

Evaluation  of  Line  Integrals 

From  (1  ),  for  any  contour  enclosing  the  point  77/  (counterclockwise) 


1 — cos  „ dz  = 277/(cos  z)' 

Jr  (z  - tri) 


= — 277/  sin  77/  = 277  sinh  77. 


Jc  (Z  - iri) 

From  (l"),  for  any  contour  enclosing  the  point  — / we  obtain  by  counterclockwise  integration 


,3  'dz  = 77/(;4  - 3z2  + 6)" 


= 77/[12z  - 6]z=_j  = -1877/. 


Jc  (Z  + 0° 

By  (F),  for  any  contour  for  which  1 lies  inside  and  ±2/  lie  outside  (counterclockwise), 


(z  - i rcr  + 4) 


dz  = 277/ 


= 277/ 


+ 4 


ez(z2  + 4)  - ez2z 
(z2  + 4)2 


6^77 
-i  25 


- i ~ 2.050/. 


Cauchy’s  Inequality.  Liouville’s  and  Morera’s  Theorems 

We  develop  other  general  results  about  analytic  functions,  further  showing  the  versatility 
of  Cauchy’s  integral  theorem. 


Cauchy’s  Inequality.  Theorem  1 yields  a basic  inequality  that  has  many  applications. 
To  get  it,  all  we  have  to  do  is  to  choose  for  C in  (1)  a circle  of  radius  r and  center  z0  and 
apply  the  ML-inequality  (Sec.  14.1);  with  |/(z)|  g M on  C we  obtain  from  (1) 


\fn\zo)\  = ~ 

277 


; m 

c(z-  z of 


~[dz 


n\  1 

' : M r 277 r. 


2tt 


n+ 1 


This  gives  Cauchy’s  inequality 


(2)  I/-WI^. 

rn 

To  gain  a first  impression  of  the  importance  of  this  inequality,  let  us  prove  a famous 
theorem  on  entire  functions  (definition  in  Sec.  13.5).  (For  Liouville,  see  Sec.  11.5.) 


Liouville’s  Theorem 

If  an  en  tire  function  is  bounded  in  absolute  value  in  the  whole  complex  plane,  then 
this  function  must  be  a constant. 


SEC.  14.4 


Derivatives  of  Analytic  Functions 


667 


PROOF  By  assumption,  |/(z)|  is  bounded,  say,  \f(z)\  <K  for  all  z.  Using  (2),  we  see  that 
l/feo)!  < K/r.  Since /(z)  is  entire,  this  holds  for  every  r,  so  that  we  can  take  r as  large 
as  we  please  and  conclude  that  /'  (zo)  = 0.  Since  zo  is  arbitrary,  / (z)  = ux  + ivx  = 0 for 
all  z (see  (4)  in  Sec.  13.4),  hence  ux  = vx  = 0,  and  uy  = vy  = 0 by  the  Cauchy-Riemann 
equations.  Thus  u = const,  v = const,  and  f = u + iv  = const  for  all  z.  This  completes 
the  proof. 

Another  very  interesting  consequence  of  Theorem  1 is 


THEOREM  3 


Morera’s2  Theorem  (Converse  of  Cauchy’s  Integral  Theorem) 

Iff(z)  is  continuous  in  a simply  connected  domain  D and  if 


(3) 


< > /(z)  dz  = 0 
2c 


for  every  closed  path  in  D,  then  f(z)  is  analytic  in  D. 


PROOF 


In  Sec.  14.2  we  showed  that  if/(z)  is  analytic  in  a simply  connected  domain  D.  then 


F(z) 


f(z*)  dz* 


is  analytic  in  D and  h '(z.)  = f(z.).  In  the  proof  we  used  only  the  continuity  of  /(z)  and  the 
property  that  its  integral  around  every  closed  path  in  D is  zero;  from  these  assumptions 
we  concluded  that  F(z ) is  analytic.  By  Theorem  1,  the  derivative  of  F(z)  is  analytic,  that 
is, /(z)  is  analytic  in  D,  and  Morera’s  theorem  is  proved. 


This  completes  Chapter  14. 


1-7 


CONTOUR  INTEGRATION.  UNIT  CIRCLE 


Integrate  counterclockwise  around  the  unit  circle. 

„6 


1. 


3. 


5. 


7. 


sin  z 


c z 
c <■ 


-dz 


dz,  n = 1,  2, 


cosh  2z 
l 1\4 

C (z  - 2) 


dz 


cos  z 


c z 


2. 


4. 


6. 


- dz,  n = 0,  1, 


c (2z  - l)6 
ez  cos  z 
c (z  - 7 r/4)3 
dz 


dz 


dz 


'c  (z  - 2 i)Hz  ~ i/2Y 


8-19 


INTEGRATION.  DIFFERENT  CONTOURS 


Integrate.  Show  the  details.  Hint.  Begin  by  sketching  the 
contour.  Why? 


8. 


z + sinz 
c (z  - if 


- dz,  C the  boundary  of  the  square  with 


vertices  ±2,  ±2i  counterclockwise. 


9. 


tan  7rz 


dz,  C the  ellipse  16v2  + y2  = 1 clockwise. 


10. 


4z3  - 6 

c z(z  - 1 - if 


dz,  C consists  of  |z|  =3  counter- 


clockwise and  \z\  = 1 clockwise. 


2GIACINTO  MORERA  (1856-1909),  Italian  mathematician  who  worked  in  Genoa  and  Turin. 
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11.  o 


Jc 


(1  + z)  sin  z 
(2 z ~ l)1 2 


dz. 


C:\z~  i\  = 2 counterclockwise. 


18. 


sinh  z 


dz,  C:  |z| 


1 counterclockwise,  n integer. 


12. 


13. 


exp  (z2) 


c z(z  ~ 2 O' 

Ln  z 


dz,  C:  z — 3i  = 2 clockwise. 


14.  o 


c (z  2) 

Ln  (z  + 3) 


dz,  C:  \z  ~ 3 1 = 2 counterclockwise. 


- dz,  C the  boundary  of  the  square 


c (z  - 2)(z  + 1 r 
with  vertices  ±1.5,  ±1.5/,  counterclockwise. 

I cosh  4z 

15.  tp  — dz,  C consists  of  |z|  =6  counterclock- 

Jc  (z  ~ 4)3 4 5 6 7 8 9 10 11 

wise  and  \z  — 3|  =2  clockwise. 

4 z 


16. 


e 


c z(z  - 2 if 


dz,  C consists  of  z — / =3  counter- 


clockwise and  |z|  = 1 clockwise. 

f e~z  sin  z . . 

17.  O dz,  C consists  of  |z|  = 5 counterclock- 

Jc  (z  - 4)3 

wise  and  | z — 3 1 = § clockwise. 


19.  0 dz,  C : |z|  = 1,  counterclockwise. 

'c  (4z  - 7 Tif 

20.  TEAM  PROJECT.  Theory  on  Growth 

(a)  Growth  of  entire  functions.  If  /(z)  is  not  a 

constant  and  is  analytic  for  all  (finite)  z,  and  R and 
M are  any  positive  real  numbers  (no  matter  how 
large),  show  that  there  exist  values  of  z for  which 
|z|  > R and  |/(z)|  > M.  Hint.  Use  Liouville’s 
theorem. 

(b)  Growth  of  polynomials.  If  /(z)  is  a polynomial 
of  degree  n > 0 and  M is  an  arbitrary  positive 
real  number  (no  matter  how  large),  show  that 
there  exists  a positive  real  number  R such  that 
|/(z)|  > M for  all  |z|  > R. 

(c)  Exponential  function.  Show  that  /(z)  = ex  has 
the  property  characterized  in  (a)  but  does  not  have 
that  characterized  in  (b). 

(d)  Fundamental  theorem  of  algebra.  If  f(z)  is  a 
polynomial  in  z,  not  a constant,  then  f(z)  = 0 for 
at  least  one  value  of  z.  Prove  this.  Hint.  Use  (a). 


SE£PEE^Q33EOSE3iSH^ESTIONS  AND  PROBLEMS 


1.  What  is  a parametric  representation  of  a curve?  What 
is  its  advantage? 

2.  What  did  we  assume  about  paths  of  integration  z = z(0? 
What  is  z = dz/dt  geometrically? 

3.  State  the  definition  of  a complex  line  integral  from 
memory. 

4.  Can  you  remember  the  relationship  between  complex 
and  real  line  integrals  discussed  in  this  chapter? 

5.  How  can  you  evaluate  a line  integral  of  an  analytic 
function?  Of  an  arbitrary  continous  complex  function? 

6.  What  value  do  you  get  by  counterclockwise  integration 
of  1/z  around  the  unit  circle?  You  should  remember 
this.  It  is  basic. 

7.  Which  theorem  in  this  chapter  do  you  regard  as  most 
important?  State  it  precisely  from  memory. 

8.  What  is  independence  of  path?  Its  importance?  State  a 
basic  theorem  on  independence  of  path  in  complex. 

9.  What  is  deformation  of  path?  Give  a typical  example. 

10.  Don’t  confuse  Cauchy’s  integral  theorem  (also  known 
as  Cauchy-Goursat  theorem)  and  Cauchy’s  integral 
formula.  State  both.  How  are  they  related? 

11.  What  is  a doubly  connected  domain?  How  can  you 
extend  Cauchy’s  integral  theorem  to  it? 


12.  What  do  you  know  about  derivatives  of  analytic 
functions? 

13.  How  did  we  use  integral  formulas  for  derivatives  in 
evaluating  integrals? 

14.  How  does  the  situation  for  analytic  functions  differ 
with  respect  to  derivatives  from  that  in  calculus? 

15.  What  is  Liouville’s  theorem?  To  what  complex  func- 
tions does  it  apply? 

16.  What  is  Morera’s  theorem? 

17.  If  the  integrals  of  a function  /(z)  over  each  of  the  two 
boundary  circles  of  an  annulus  D taken  in  the  same 
sense  have  different  values,  can/(z)  be  analytic  every- 
where in  D1  Give  reason. 


18.  Is  Im  0 f(z)  dz  = 0 Im/(z)  dz ? Give  reason. 


19.  Is 


| /(z)  dz  = | l/(z)l  fife? 


20.  How  would  you  find  a bound  for  the  left  side  in  Prob.  19? 


21-30 


INTEGRATION 


Integrate  by  a suitable  method. 


21. 


z sinh  (z2)  dz  from  0 to  Tti/2. 

'c 


Summary  of  Chapter  14 
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22.  (|z|  + z)  dz  clockwise  around  the  unit  circle. 

d n 


27.  I (z2  + z 2)  dz  from  0 to  2 + 2 shortest  path. 


23.  z e dz  counterclockwise  around  |z|  =7 r. 
■'c 


24.  Re  z dz  from  0 to  3 + 27;  along  y = jr  . 

■'C 

f tan  7Tz 

25.  <iz  clockwise  around  z — 1|  =0.1. 

Jc(z-  l)2 

26.  I (z2  + z2)  dz  from  z = 0 horizontally  to  z = 2,  then 


Ln  z 


, dz  counterclockwise  around  |z  — l|  = g. 


28. 


29. 


30.  sin  z dz  from  0 to  (1  + ;). 

■'c 


'c  (z  - 2;) 


2 1 

1 1 dz  clockwise  around 

z + 2;  z + 4; 


lz  - 1 = 2.5. 


vertically  upward  to  2 + 2 i. 


SUMMARY  OF  CH  APTER  14 

Complex  Integration 


The  complex  line  integral  of  a function  f(z)  taken  over  a path  C is  denoted  by 


(1) 


f(z)  dz  or,  if  C is  closed,  also  by 


o f(z)  (Sec.  14.1). 


Jc  JC 

lifiz)  is  analytic  in  a simply  connected  domain  D,  then  we  can  evaluate  (1)  as  in 
calculus  by  indefinite  integration  and  substitution  of  limits,  that  is. 


(2) 


m dz  = F(Zl)  - F(zo)  \F\z)  = f(z)] 

c 


for  every  path  C in  D from  a point  zo t0  a point  Z\  (see  Sec.  14.1).  These  assumptions 
imply  independence  of  path,  that  is,  (2)  depends  only  on  z0  and  z,\  (and  on  f(z), 
of  course)  but  not  on  the  choice  of  C (Sec.  14.2).  The  existence  of  an  F{z)  such  that 
F (z)  = f(z)  is  proved  in  Sec.  14.2  by  Cauchy’s  integral  theorem  (see  below). 

A general  method  of  integration,  not  restricted  to  analytic  functions,  uses  the 
equation  z = z(t)  of  C,  where  fll/i  b. 


(3) 


f(.z)  dz 
c 


b 

mt))z(t ) dt 


Cauchy’s  integral  theorem  is  the  most  important  theorem  in  this  chapter.  It  states 
that  if  f(z)  is  analytic  in  a simply  connected  domain  D,  then  for  every  closed  path 
C in  D (Sec.  14.2), 


(4) 


< > f(z)  dz  = 0. 
•'c 
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Under  the  same  assumptions  and  for  any  zo  in  D and  closed  path  C in  D containing 
z0  in  its  interior  we  also  have  Cauchy’s  integral  formula 


(5) 


= h * 


m 

l - Jo 


dz. 


Furthermore,  under  these  assumptions /(z)  has  derivatives  of  all  orders  in  D that  are 
themselves  analytic  functions  in  D and  (Sec.  14.4) 


(6) 


fn\zo)  = — 

277/ 


f(z) 


c (z  - z o) 


n+1 


dz 


(n  = 1,  2,  ■ ■ • 


This  implies  Morera’s  theorem  (the  converse  of  Cauchy’s  integral  theorem)  and 
Cauchy’s  inequality  (Sec.  14.4),  which  in  turn  implies  Liouville’s  theorem  that  an 
entire  function  that  is  bounded  in  the  whole  complex  plane  must  be  constant. 


m 


CHAPTER 


Power  Series,  Taylor  Series 


In  Chapter  14,  we  evaluated  complex  integrals  directly  by  using  Cauchy’s  integral  formula, 
which  was  derived  from  the  famous  Cauchy  integral  theorem.  We  now  shift  from  the 
approach  of  Cauchy  and  Goursat  to  another  approach  of  evaluating  complex  integrals, 
that  is,  evaluating  them  by  residue  integration.  This  approach,  discussed  in  Chapter  16, 
first  requires  a thorough  understanding  of  power  series  and,  in  particular,  Taylor  series. 
(To  develop  the  theory  of  residue  integration,  we  still  use  Cauchy’s  integral  theorem!) 

In  this  chapter,  we  focus  on  complex  power  series  and  in  particular  Taylor  series.  They 
are  analogs  of  real  power  series  and  Taylor  series  in  calculus.  Section  15.1  discusses 
convergence  tests  for  complex  series,  which  are  quite  similar  to  those  for  real  series.  Thus, 
if  you  are  familiar  with  convergence  tests  from  calculus,  you  may  use  Sec.  15.1  as  a 
reference  section.  The  main  results  of  this  chapter  are  that  complex  power  series  represent 
analytic  functions,  as  shown  in  Sec.  15.3,  and  that,  conversely,  every  analytic  function 
can  be  represented  by  power  series,  called  a Taylor  series,  as  shown  in  Sec.  15.4.  The  last 
section  (15.5)  on  uniform  convergence  is  optional. 

Prerequisite:  Chaps.  13,  14. 

Sections  that  may  be  omitted  in  a shorter  course:  15.1,  15.5. 

References  and  Answers  to  Problems:  App.  1 Part  D,  App.  2. 


15.i  Sequences,  Series,  Convergence  Tests 

The  basic  concepts  for  complex  sequences  and  series  and  tests  for  convergence  and 
divergence  are  very  similar  to  those  concepts  in  (real)  calculus.  Thus  if  you  feel  at  home 
with  real  sequences  and  series  and  want  to  take  for  granted  that  the  ratio  test  also  holds 
in  complex,  skip  this  section  and  go  to  Section  15.2. 

Sequences 

The  basic  definitions  are  as  in  calculus.  An  infinite  sequence  or,  briefly,  a sequence,  is 
obtained  by  assigning  to  each  positive  integer  n a number  zn,  called  a term  of  the  sequence, 
and  is  written 

Zi,  Z2,  or  Ui>  Zz,  ■'•}  or  briefly  {zn}- 

We  may  also  write  z 0,  zi,  ■ ■ • or  "2,  z 3,  • ■ ■ or  start  with  some  other  integer  if  convenient. 
A real  sequence  is  one  whose  terms  are  real. 
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EXAMPLE  1 


EXAMPLE  2 


THEOREM  1 


PROOF 


Convergence.  A convergent  sequence  Zi,  z2,  ■ ■ • is  one  that  has  a limit  c,  written 
lim  zn  = c or  simply  zn  —*  c. 

ft,— >00 

By  definition  of  limit  this  means  that  for  every  e > 0 we  can  find  an  N such  that 

(1)  | zn  — c|  < e for  all  n > N\ 

geometrically,  all  terms  zn  with  n > N lie  in  the  open  disk  of  radius  e and  center  c (Fig.  361) 
and  only  finitely  many  terms  do  not  lie  in  that  disk.  [For  a real  sequence,  (1)  gives  an  open 
interval  of  length  2e  and  real  midpoint  c on  the  real  line  as  shown  in  Fig.  362.] 

A divergent  sequence  is  one  that  does  not  converge. 


Fig.  361.  Convergent  complex  sequence  Fig.  362.  Convergent  real  sequence 


Convergent  and  Divergent  Sequences 

The  sequence  {in/n}  = { i,  — j,  —i/3,  |,  ■ ■ ■ } is  convergent  with  limit  0. 

The  sequence  {/”}  = [i,  — 1,  — i,  1,  ■ ■ • } is  divergent,  and  so  is  { zn  I with  zn  = ( 1 + i)n. 

Sequences  of  the  Real  and  the  Imaginary  Parts 

The  sequence  [zn]  with  zn  = xn  + iyn  = 1 — 1 /n2  + ;'( 2 + 4 /«)  is  6 i,  § + 4 i,  | + 10//3,  + 3i,  ■ ■ ■ . 

(Sketch  it.)  It  converges  with  the  limit  c = 1 + 2 i.  Observe  that  \xf,  f has  the  limit  1 = Re  c and  [yn]  has 
the  limit  2 = Imc.  This  is  typical.  It  illustrates  the  following  theorem  by  which  the  convergence  of  a 
complex  sequence  can  be  referred  back  to  that  of  the  two  real  sequences  of  the  real  parts  and  the  imaginary 
parts. 


Sequences  of  the  Real  and  the  Imaginary  Parts 

A sequence  Zi,  z2,  " ' , Zn,  •••£>/  complex  numbers  zn  = xn  + iyn  ( where  n = 1, 
2,  • • • ) converges  to  c = a + ib  if  and  only  if  the  sequence  of  the  real  parts  X\,  x2,  ■ ■ • 
converges  to  a and  the  sequence  of  the  imaginary  parts  yi,  v2,  ■ • • converges  to  b. 


Convergence  zn^  c = a + ib  implies  convergence  xn  a and  yrl  b because  if 
\zn  ~ c | < e,  then  zn  lies  within  the  circle  of  radius  e about  c = a + ib,  so  that  (Fig.  363a) 

\xn  - a\  < e,  I yn  - b\  < e. 

Conversely,  if  xn  —*  a and  yn  b as  n °°,  then  for  a given  e > 0 we  can  choose 
N so  large  that,  for  every  n > N, 

Un  - a\  < \yn  ~ b | C |. 


SEC.  15.1  Sequences,  Series,  Convergence  Tests 
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Fig.  363.  Proof  of  Theorem  1 

These  two  inequalities  imply  that  zn  = xn  + iyn  lies  in  a square  with  center  c and  side 
e.  Hence,  zn  must  lie  within  a circle  of  radius  e with  center  c (Fig.  363b). 


Series 

Given  a sequence  Zi,  Z2>'"  > zm,  • ■ ■ , we  may  form  the  sequence  of  the  sums 
Sl  = Zl,  S2  = Z!  + Z2,  $3  = Zl  + ^2  + Z3> 


and  in  general 

(2)  sn  = Zl  + z2  + ■■■  + Zn  (n  = 1,  2, • • - 

Here  sn  is  called  the  nth  partial  sum  of  the  infinite  series  or  series 

oo 

(3)  ^ Zm  = Zi  + Z2  + ■■■■ 

m=  1 

The  zi,  Z2, ' ■ • are  called  the  terms  of  the  series.  (Our  usual  summation  letter  is  n,  unless 
we  need  n for  another  purpose,  as  here,  and  we  then  use  m as  the  summation  letter.) 

A convergent  series  is  one  whose  sequence  of  partial  sums  converges,  say, 

oo 

lim  sn  = s.  Then  we  write  5 = 'V  zm  = zi  + z2  + 

n—>  °° 

m= 1 

and  call  s the  sum  or  value  of  the  series.  A series  that  is  not  convergent  is  called  a divergent 
series. 

If  we  omit  the  terms  of  sn  from  (3),  there  remains 

(4)  Rn  = Zn+l  + Zn+2  + Zn+ 3 + ' ' ' ■ 

This  is  called  the  remainder  of  the  series  (3)  after  the  term  zn-  Clearly,  if  (3)  converges 
and  has  the  sum  s,  then 

s = sn  + Rn,  thus  Rn  = s — sn. 

Now  sn  s by  the  definition  of  convergence;  hence  Rn  —>  0.  In  applications,  when  s is 
unknown  and  we  compute  an  approximation  sn  of  ,v,  then  Rn  is  the  error,  and  Rn  —>  0 
means  that  we  can  make  |/?m|  as  small  as  we  please,  by  choosing  n large  enough. 
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THEOREM  2 


THEOREM  3 


PROOF 


THEOREM  4 


An  application  of  Theorem  1 to  the  partial  sums  immediately  relates  the  convergence 
of  a complex  series  to  that  of  the  two  series  of  its  real  parts  and  of  its  imaginary  parts: 


Real  and  Imaginary  Parts 

A series  (3)  with  zm  — xm  + iym  converges  and  has  the  sum  s = u + iv  if  and 
only  if  x i + x2  + • ■ ■ converges  and  has  the  sum  u and  Vi  + yz  + ‘ ' converges 
and  has  the  sum  u. 


Tests  for  Convergence  and  Divergence  of  Series 

Convergence  tests  in  complex  are  practically  the  same  as  in  calculus.  We  apply  them 
before  we  use  a series,  to  make  sure  that  the  series  converges. 

Divergence  can  often  be  shown  very  simply  as  follows. 


Divergence 

If  a series  z\  + Z2  + ‘ ' ' converges,  then  lim  zm  = 0.  Hence  if  this  does  not  hold, 

m— >&> 

the  senes  diverges. 


If  Z\  + Z2  + ' ' ’ converges,  with  the  sum  s,  then,  since  zm  = sm  — sm_i, 

lim  Zm  = lim  ( sm  — sm- 1)  = lim  sm  — lim  sm-\  = s — s = 0. 

m— >°o  m— >0°  m— »°°  m— 

CAUTION!  zm  — > 0 i s necessary  for  convergence  but  not  sufficient , as  we  see  from  the 
harmonic  series  1 + 5 + ^ + 5 + ' ' ' , which  satisfies  this  condition  but  diverges,  as  is 
shown  in  calculus  (see,  for  example,  Ref.  [GenRefll]  in  App.  1). 

The  practical  difficulty  in  proving  convergence  is  that,  in  most  cases,  the  sum  of  a series 
is  unknown.  Cauchy  overcame  this  by  showing  that  a series  converges  if  and  only  if  its 
partial  sums  eventually  get  close  to  each  other: 


Cauchy’s  Convergence  Principle  for  Series 

A series  Zi  + Z2  + ' ' ' is  convergent  if  and  only  if  for  every  given  e > 0 [no  matter 
how  small ) we  can  find  an  N ( which  depends  on  e,  in  general ) such  that 

(5)  lzn+1  + Zn+2  + •■•  + Zn+pl  < e for  every  n>  N and  p = 1,2,--- 


The  somewhat  involved  proof  is  left  optional  (see  App.  4). 

Absolute  Convergence.  A series  Zi  + Z2  + ’ ’ ‘ is  called  absolutely  convergent  if  the 
series  of  the  absolute  values  of  the  terms 

00 

2 \zm\  = kll  + U2I  + • ’ • 

m=  1 


is  convergent. 
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EXAMPLE  3 

Ifzi  + z 2 + ' ' ' converges  but  |zi|  + |z2l  + • ■ • diverges,  then  the  series  zi  + Z 2 + ■■  ■ 
is  called,  more  precisely,  conditionally  convergent. 

A Conditionally  Convergent  Series 

The  series  1—  § + g — j + — • • • converges,  but  only  conditionally  since  the  harmonic  series  diverges,  as 
mentioned  above  (after  Theorem  3). 

If  a series  is  absolutely  convergent,  it  is  convergent. 

This  follows  readily  from  Cauchy’s  principle  (see  Prob.  29).  This  principle  also  yields 
the  following  general  convergence  test. 

THEOREM  5 

Comparison  Test 

If  a series  z\  + Z2  + ' ' ' is  given  and  we  can  find  a convergent  series  + b2  + 
with  nonnegative  real  terms  such  that  |zjJ  = b\,  |z2l  = ^2,  ■ ■ • , then  the  given  series 
converges,  even  absolutely. 

PROOF 

By  Cauchy’s  principle,  since  bi  + b2  + ■ ■ ■ converges,  for  any  given  e > 0 we  can  find 
an  N such  that 

bn+ 1 + ■ ■ ■ + bn+p  < e for  every  n > N and  p = 1,  2,  • ■ ■ . 

From  this  and  | = &i,  |z2l  = b2,  ■ ■ ■ we  conclude  that  for  those  n and  p, 

l^n+rl  T • * • T |z^+p|  = bn+ 1 T ' ' ' T bn+p  <1  €. 

Hence,  again  by  Cauchy’s  principle,  IzjJ  + |z2l  + • ■ • converges,  so  that  zi  + Z2  + ' ‘ ’ is 
absolutely  convergent. 

A good  comparison  series  is  the  geometric  series,  which  behaves  as  follows. 

THEOREM  6 

Geometric  Series 

The  geometric  series 

(6*)  ^ qm=l+  q + q2  + ■■■ 

m= 0 

converges  with  the  sum  1/(1  — q)  if  |t/|  < 1 and  diverges  if  \q\  = 1. 

PROOF 

If  |g|  § 1,  then  q m 1 and  Theorem  3 implies  divergence. 

Now  let  |t/|  < 1.  The  nth  partial  sum  is 

Sn  = 1 + ^ + '''  + t/n. 

From  this, 

1 n . „n+ 1 

qsn  = 


q + ■ ■ • + qn  + q 


676 


CHAP.  15  Power  Series,  Taylor  Series 


THEOREM  7 


PROOF 


On  subtraction,  most  terms  on  the  right  cancel  in  pairs,  and  we  are  left  with 
Sn  - qsn  = (1  - q)sn  = 1 - q n + 1. 


Now  1 — q A 0 since  q A 1 , and  we  may  solve  for  sn,  finding 


(6) 


i - q 


n+ 1 


n+ 1 


Since  |r/|  < 1,  the  last  term  approaches  zero  as  n— »°°.  Hence  if  \q\ 
convergent  and  has  the  sum  1/(1  — q).  This  completes  the  proof. 


< 1,  the  series  is 


Ratio  Test 

This  is  the  most  important  test  in  our  further  work.  We  get  it  by  taking  the  geometric 
series  as  comparison  series  b\  + bz  + ■ ■ ■ in  Theorem  5: 


Ratio  Test 

If  a series  Zi  + Z2  + ' ' ' with  zn  ^ 0 (n  = 1,  2,  • • • ) has  the  property  that  for  every 
n greater  than  some  N, 


(7) 


Zn+ 1 

Z-n 


^ q < 1 


(n  > N) 


( where  q < 1 is  fixed),  this  series  converges  absolutely.  If  for  every  n > N, 

Zn+l 


(8) 


1 


( n > N), 


the  series  diverges. 


If  (8)  holds,  then  \zn+i\  = \zn I for  n > N,  so  that  divergence  of  the  series  follows  from 
Theorem  3. 

If  (7)  holds,  then  |zn+il  = Iz^l  q for  n > N,  in  particular, 

UiV+2l  = IziV+ll?.  IZJV+3I  = \?-N  + 2\q  = Uw+ll^2>  etc-> 

and  in  general,  Iz^v+pl  = l~2V+il?P_1-  Since  q < 1,  we  obtain  from  this  and  Theorem  6 

Utv+il  + Izjv+21  + Ujv+3l  + = Uiv+il  (1  + q + q2  + ■■•)  = \zn+ ll  • 

1 - q 


Absolute  convergence  of  zi  + Z2  + ’ ' ' now  follows  from  Theorem  5. 
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THEOREM  8 


PROOF 


The  inequality  (7)  implies  \zn+i/zn\  < 1-  but  this  does  not  imply  con- 
vergence, as  we  see  from  the  harmonic  series,  which  satisfies  zn+ l/zn  = n/(n  + 1)  < 1 for 
all  n but  diverges. 

If  the  sequence  of  the  ratios  in  (7)  and  (8)  converges,  we  get  the  more  convenient 


Ratio  Test 

Zn+1 

If  a series  Zi  + Z2  + ‘ ' ' with  zn  =£  0 (n  = 1,  2,  • • ■ ) is  such  that  lim 

= u 

then: 

^ n 

(a)  If  L < 1,  the  series  converges  absolutely. 

(b)  If  L > 1,  the  series  diverges. 

(c)  If  L = 1,  the  series  may  converge  or  diverge,  so  that  the  test  fails  and 

permits  no  conclusion. 

(a)  We  write  kn  = \zn+\/zn\  and  let  L = 1 — b < 1.  Then  by  the  definition  of  limit,  the 
kn  must  eventually  get  close  to  1 — b,  say,  kn  = <:/  = I — 2h  < I for  all  n greater  than 
some  N.  Convergence  of  z\  + Z2  + " ' now  follows  from  Theorem  7. 

(b)  Similarly,  for  L = 1 + c > 1 we  have  kn  § 1 + \c  > 1 for  all  n > N*  (sufficiently 
large),  which  implies  divergence  of  zi  + zz  + ' ' ' by  Theorem  7. 

(c)  The  harmonic  series  1 + \ + g + • ■ • has  zn+i/zn  = n/{n  +1),  hence  L = 1,  and 
diverges.  The  series 


1111 

1 -f-  — + — -f-  — T — 4- 
4 9 16  25 


has 


Zn+l  _ nZ 
Zn  in  + 1 )2  ’ 


hence  also  L = 1,  but  it  converges.  Convergence  follows  from  (Fig.  364) 


Sn 


1 + - + 
4 


■ • + 


::  1 + 


[n  dx 


= 2- 


•) 

n 


so  that  5!,  s2,  • ■ ■ is  a bounded  sequence  and  is  monotone  increasing  (since  the  terms  of 
the  series  are  all  positive);  both  properties  together  are  sufficient  for  the  convergence  of 
the  real  sequence  .sq,  ,v2,  • ■ • . (In  calculus  this  is  proved  by  the  so-called  integral  test , whose 
idea  we  have  used.) 


Fig.  J64.  Convergence  of  the  series  1 + ^ + | + ^ + --  - 


0 
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EXAMPLE  4 Ratio  Test 

Is  the  following  series  convergent  or  divergent?  (First  guess,  then  calculate.) 


i 

71  = 0 


(100  + 750" 

n\ 


1 + (100  + 750  + 


1 9 

— (100  + 750  + • • ■ 
2! 


Solution.  By  Theorem  8,  the  series  is  convergent,  since 


Zn+1 

Zn 


1 100  + 75«1”+1/(«  + 1)!  _ 1 100  + 75/'|  _ 125 
1 100  + 75/|7«!  n+  1 n+1 


L = 0. 


EXAMPL  Theorem  7 More  General  Than  Theorem  8 

Let  an  = if Tin  and  bn  = l/23n+1.  Is  the  following  series  convergent  or  divergent? 

1 /'  1 i 1 

CIq  + Z?o  + Cl\  + b-\  + • • • — i H 1 h I 1 h • • • 

2 8 16  64  128 


Solution.  The  ratios  of  the  absolute  values  of  successive  terms  are  Hence  convergence  follows 

from  Theorem  7.  Since  the  sequence  of  these  ratios  has  no  limit,  Theorem  8 is  not  applicable. 


Root  Test 

The  ratio  test  and  the  root  test  are  the  two  practically  most  important  tests.  The  ratio  test 
is  usually  simpler,  but  the  root  test  is  somewhat  more  general. 


THEOREM  9 


Root  Test 

If  a series  zi  + Zi  + ■ ■ ■ is  such  that  for  every  n greater  than  some  N, 

(9)  ^ J §9<1  (n  > N) 

{where  q < 1 is  fixed),  this  series  converges  absolutely.  If  for  infinitely  many  n, 

(10)  ^ 1, 
the  series  diverges. 


PROOF  If  (9)  holds,  then  \zn\  = qn  < 1 for  all  n > N.  Hence  the  series  |zi|  + \z2\  + 

converges  by  comparison  with  the  geometric  series,  so  that  the  series  zi  + Zz  + "• 
converges  absolutely.  If  (10)  holds,  then  \zn\  = 1 for  infinitely  many  n.  Divergence  of 
Ti  + Zz  + ' ' ' now  follows  from  Theorem  3. 


Equation  (9)  implies  ''v/|zrJ  < 1,  but  this  does  not  imply  convergence,  as 
we  see  from  the  harmonic  series,  which  satisfies  \ Zl/n  < 1 (for  n > 1)  but  diverges. 


SEC.  15.1  Sequences,  Series,  Convergence  Tests 


679 


THEOREM  10 


If  the  sequence  of  the  roots  in  (9)  and  (10)  converges,  we  more  conveniently  have 

Root  Test 

If  a series  zi  + Z2  + ' ' ' is  such  that  lim  = L,  then: 

n— 

(a)  The  series  converges  absolutely  if  L < 1. 

(b)  The  series  diverges  if  L > 1. 

(c)  If  L = 1,  the  test  fails;  that  is,  no  conclusion  is  possible. 


P^^O-Bt:EM==SFT— 15^1 


1-10 


SEQUENCES 


Is  the  given  sequence  zi,  Z2. ' ' ' , zn, ' ' ' bounded?  Con- 
vergent? Find  its  limit  points.  Show  your  work  in  detail. 


1.  Z„  = (1  + ifn/2n 
3.  zn  = u7T/(4  + 2m) 
5.  Zn  = (-1)"  + 10; 
7.  zn  = n2  + i/n2 
9.  Zn  = (3  + 30“” 


2.  Zn  = (3  + 4i)n/n\ 

4.  Zn  = (1  + 20” 

6.  zn  — (cos  mrit/n 
8.  Zn  = [(1  + 30/VlO] 
10.  zn  — sin  (|n7T)  + in 


11.  CAS  EXPERIMENT.  Sequences.  Write  a program 
for  graphing  complex  sequences.  Use  the  program  to 
discover  sequences  that  have  interesting  “geometric” 
properties,  e.g.,  lying  on  an  ellipse,  spiraling  to  its  limit, 
having  infinitely  many  limit  points,  etc. 


12.  Addition  of  sequences.  If  zi,  Z2, ' ' ‘ converges  with 
the  limit  l and  z*.  z 2,  ■ ■ ■ converges  with  the  limit  /*, 
show  that  Zi  + Z*,  Z2  Z2,  " ' is  convergent  with  the 
limit  l + l*. 


13.  Bounded  sequence.  Show  that  a complex  sequence 
is  bounded  if  and  only  if  the  two  corresponding 
sequences  of  the  real  parts  and  of  the  imaginary  parts 
are  bounded. 


14.  On  Theorem  1.  Illustrate  Theorem  1 by  an  example 
of  your  own. 

15.  On  Theorem  2.  Give  another  example  illustrating 
Theorem  2. 


16-25 


SERIES 


Is  the  given  series  convergent  or  divergent?  Give  a reason. 
Show  details. 


i6.  i 

n= 0 


(20  + 30Q” 
n\ 


18.  J 

n=l 


17-  2 


(-0” 

In  n 


19.  2 


20. 

21. 

22. 

23. 

24. 

25. 

26. 

27. 

28. 


29. 

30. 


2 

n=0 

i 

n= 0 


n + i 
In2  + 2 i 
(77  + 77i)2”+1 

(2  n + 1)! 


2 — 

,-iVS 

” (-!)"(!  + i)2n 

(2  ")I 


(3i)  n! 


n=l 


Significance  of  (7).  What  is  the  difference  between  (7) 
and  just  stating  \zn+i/zn\  < 1? 

On  Theorems  7 and  8.  Give  another  example  showing 
that  Theorem  7 is  more  general  than  Theorem  8. 

CAS  EXPERIMENT.  Series.  Write  a program  for 
computing  and  graphing  numeric  values  of  the  first  n 
partial  sums  of  a series  of  complex  numbers.  Use  the 
program  to  experiment  with  the  rapidity  of  convergence 
of  series  of  your  choice. 

Absolute  convergence.  Show  that  if  a series  converges 
absolutely,  it  is  convergent. 

Estimate  of  remainder.  Let  \z.n+\/zn\  = q < 1 , so 
that  the  series  7 1 + z2  + ' ' ' converges  by  the  ratio  test. 
Show  that  the  remainder  Rn  = zn+i  + zn+ 2 + ■ ■ ■ 
satisfies  the  inequality  |/?n|  S |zn+il/(l  — <?).  Using 
this,  find  how  many  terms  suffice  for  computing  the 
sum  s of  the  series 


n + i 


with  an  error  not  exceeding  0.05  and  compute  s to  this 
accuracy. 


2 
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15.2  Power  Series 


The  student  should  pay  close  attention  to  the  material  because  we  shall  show  how  power 
series  play  an  important  role  in  complex  analysis.  Indeed,  they  are  the  most  important  series 
in  complex  analysis  because  their  sums  are  analytic  functions  (Theorem  5,  Sec.  15.3),  and 
every  analytic  function  can  be  represented  by  power  series  (Theorem  1,  Sec.  15.4). 

A power  series  in  powers  of  z — Zo  is  a series  of  the  form 


(1)  2 an(z  ~ zo)n  = ao  + dfz  ~ zo)  + a2(z  ~ zof  + ' ’ • 

n= 0 


where  z is  a complex  variable,  c/0,  a1;  • • ■ are  complex  (or  real)  constants,  called  the 
coefficients  of  the  series,  and  zo  is  a complex  (or  real)  constant,  called  the  center  of  the 
series.  This  generalizes  real  power  series  of  calculus. 

If  Zo  = 0,  we  obtain  as  a particular  case  a power  series  in  powers  of  z: 


(2)  ^ anzn  = a0  + fljz  + a2zZ  + • ■ ■ . 

n= 0 


Convergence  Behavior  of  Power  Series 

Power  series  have  variable  terms  (functions  of  z),  but  if  we  fix  z,  then  all  the  concepts 
for  series  with  constant  terms  in  the  last  section  apply.  Usually  a series  with  variable 
terms  will  converge  for  some  z and  diverge  for  others.  For  a power  series  the  situation  is 
simple.  The  series  (1)  may  converge  in  a disk  with  center  zo  or  in  the  whole  z-plane  or 
only  at  zq.  We  illustrate  this  with  typical  examples  and  then  prove  it. 


EXAMPLE  Convergence  in  a Disk.  Geometric  Series 

The  geometric  series 


2 Zn  = 1 + z ‘ 


converges  absolutely  if  |z|  < 1 and  diverges  if  \z\  = 1 (see  Theorem  6 in  Sec.  15.1). 


EXAMPLE  2 Convergence  for  Every  z 

The  power  series  (which  will  be  the  Maclaurin  series  of  ez  in  Sec.  15.4) 

* zn  z2  z3 

y.  — = i + z + — + — + 
n 1 2'  3' 

n=0  n ■ 

is  absolutely  convergent  for  every  z-  In  fact,  by  the  ratio  test,  for  any  fixed  z. 


Zn+1/(r  +1)! 

Zn/n\ 


\z\ 

n + 1 


0 as  n — > oo. 


SEC.  15.2  Power  Series 


681 


EXAM  Convergence  Only  at  the  Center.  (Useless  Series) 

The  following  power  series  converges  only  at  z = 0,  but  diverges  for  every  z # 0,  as  we  shall  show. 


2 n\zn  = 1 + z + 2z2  + 6z3  + ■■■ 

71=0 


In  fact,  from  the  ratio  test  we  have 


(n  + l)!zn 


= ("  + l)lzl 


(z  fixed  and  ¥=0). 


THEOREM  1 


Convergence  of  a Power  Series 

(a)  Every  power  series  (1)  converges  at  the  center  z o- 

(b)  //( 1)  converges  at  a point  z = Zi  ¥=  zo,  it  converges  absolutely  for  every  z 
closer  to  zo  than  zi,  that  is,  \z  — Zo I < Izi  — Zol  • See  Fig-  365. 

(c)  If  { 1)  diverges  at  z = Zi,  it  diverges  for  every  z farther  away  from  zo  than 
Z2-  See  Fig.  365. 


y 


'*n<  Divergent 


\ ' 
i i 

/ t 
/ 

/ 


X 


Fig.  365.  Theroem  1 


PROOF  (a)  For  z = Zo  the  series  reduces  to  the  single  term  ao- 

(b)  Convergence  at  z — Z\  gives  by  Theorem  3 in  Sec.  15.1  cin(z\  — Zo)n  ^ 0 as  n 
This  implies  boundedness  in  absolute  value, 

I an(z\  ~ Zo)n\  < M for  every  n = 0,  1,  • • • . 


Multiplying  and  dividing  an(z  ~ Zo)n  by  (zt  — zo)n  we  obtain  from  this 


_ Zo)U\ 


Z o) 


( Z - Zo 
\Zl  - ^0 


g M 


z ~ Zo 


n 


Cl  Zo 


Summation  over  n gives 


(3) 


2 I an(z  ~ z0f : 

n= 1 


n=  1 


Z ~ Zp 
Zl  — Zo 


Now  our  assumption  |z  — z0|  < |zi  — Zol  implies  that  |(z  - Zo)/(zi  — Zo) I < 1-  Hence 
the  series  on  the  right  side  of  (3)  is  a converging  geometric  series  (see  Theorem  6 in 
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EXAMPLE  4 


Sec.  15.1).  Absolute  convergence  of  (1)  as  stated  in  (b)  now  follows  by  the  comparison 
test  in  Sec.  15.1. 

(c)  If  this  were  false,  we  would  have  convergence  at  a z3  farther  away  from  zo  than  z2- 
This  would  imply  convergence  at  z2,  by  (b),  a contradiction  to  our  assumption  of  divergence 

atz2- 


Radius  of  Convergence  of  a Power  Series 

Convergence  for  every  z (the  nicest  case.  Example  2)  or  for  no  z A Zo  (the  useless  case, 
Example  3)  needs  no  further  discussion,  and  we  put  these  cases  aside  for  a moment.  We 
consider  the  smallest  circle  with  center  z0  that  includes  all  the  points  at  which  a given 
power  series  (1)  converges.  Let  R denote  its  radius.  The  circle 

|z  - Zol  = R (Fig-  366) 


is  called  the  circle  of  convergence  and  its  radius  R the  radius  of  convergence  of  ( 1 ).  Theorem 
1 then  implies  convergence  everywhere  within  that  circle,  that  is,  for  all  z for  which 

(4)  |z  - zol  < R 

(the  open  disk  with  center  zo  and  radius  R).  Also,  since  R is  as  small  as  possible,  the  series 
(1)  diverges  for  all  z for  which 

(5)  |z  - z0l  > R- 

No  general  statements  can  be  made  about  the  convergence  of  a power  series  (1)  on  the 
circle  of  convergence  itself.  The  series  (1)  may  converge  at  some  or  all  or  none  of  the 
points.  Details  will  not  be  important  to  us.  Hence  a simple  example  may  just  give  us 
the  idea. 


Fig.  366.  Ci  rcle  of  convergence 


Behavior  on  the  Circle  of  Convergence 

On  the  circle  of  convergence  (radius  R = 1 in  all  three  series), 
'Hzn/n2  converges  everywhere  since  X \/n2  converges, 

Xzn/rc  converges  at  —1  (by  Leibniz’s  test)  but  diverges  at  1, 
X zn  diverges  everywhere. 
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THEOREM  2 


PROOF 


EXAMPLE  5 


Notations  R = =»  and  R = 0.  To  incorporate  these  two  excluded  cases  in  the  present 
notation,  we  write 

R = oo  if  the  series  (1)  converges  for  all  z (as  in  Example  2), 

R = 0 if  (1)  converges  only  at  the  center  z = Zo  (as  in  Example  3). 

These  are  convenient  notations,  but  nothing  else. 

Real  Power  Series.  In  this  case  in  which  powers,  coefficients,  and  center  are  real, 
formula  (4)  gives  the  convergence  interval  \x  — x()  < R of  length  2 R on  the  real  line. 

Determination  of  the  Radius  of  Convergence  from  the  Coefficients.  For  this  important 
practical  task  we  can  use 


Radius  of  Convergence  R 

Suppose  that  the  sequence  \an+\/an 

1 ,n  = 

1,2,  converges  with  limit  Lr.  If 

L =0,  then  R = that  is,  the  power  series  (1)  converges  for  all  z.  If  L =£0 

( hence  L > 0),  then 

(6)  R =fc  = lim 

Cln 

(Cauchy-Hadamard  formula1). 

L n-^>  oo 

Qn+1 

If  \an+i/an\  — * oo,  then  R = 0 ( convergence  only  at  the  center  zq). 

For  (1)  the  ratio  of  the  terms  in  the  ratio  test  (Sec.  15.1)  is 


fln+l(z  Zo)n+1 

cin+l 

an(z  - Zo)n 

an 

The  limit  is 


L = L*\z  ~ zol- 


Let  L*  =£  0,  thus  L*  > 0.  We  have  convergence  if  L = L*\z  — Sol  < 1-  thus 
|z  — zol  < 1 /L*,  and  divergence  if  |z  — zol  > 1 /L*.  By  (4)  and  (5)  this  shows  that  1 /L* 
is  the  convergence  radius  and  proves  (6). 

If  L*  = 0,  then  L = 0 for  every  z,  which  gives  convergence  for  all  z by  the  ratio  test. 
If  \an+i/an\  — » oo,  then  \an  + \/an\ |z  — Zol  > 1 for  any  z ¥=  Zo  and  all  sufficiently  large 
n.  This  implies  divergence  for  all  z =£  zo  by  the  ratio  test  (Theorem  7,  Sec.  15.1). 


Formula  (6)  will  not  help  if  L*  does  not  exist,  but  extensions  of  Theorem  2 are  still 
possible,  as  we  discuss  in  Example  6 below. 


Radius  of  Convergence 


By  (6)  the  radius  of  convergence  of  the  power  series 


~ (2n)! 
„=o  (n\f 


(z  - 3 if-  is 


R = lim 


\(2  »!) 

/ (2 n + 2)!  ' 

= lim 

n— *oo 

(2m!) 

((n  + l)!)2 

_(n!)2/ 

((«  + DO2 . 

(2  n + 2)! 

(n!)2 

= lim 


(n  + 1 r 


(2  n + 2)(2n  + 1) 


The  series  converges  in  the  open  disk  \z  — 3/|  <5  of  radius  \ and  center  3 i. 


1 

4 


^^Named  after  the  French  mathematicians  A.  L.  CAUCHY  (see  Sec.  2.5)  and  JACQUES  HADAMARD 
(1865-1963).  Hadamard  made  basic  contributions  to  the  theory  of  power  series  and  devoted  his  lifework  to 
partial  differential  equations. 
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Extension  of  Theorem  2 


Find  the  radius  of  convergence  R of  the  power  series 


i 

n= 0 


i + (-i)n  + 


i 

2n 


3 + -z  + 
2 


+ •••. 


Solution.  The  sequence  of  the  ratios  g,  2(2  + 5),  1 / (8(2  + 5)),  • • • does  not  converge,  so  that  Theorem  2 is 
of  no  help.  It  can  be  shown  that 

(6*)  R = 1 it,  L = lim  V\aJ. 

n— 

This  still  does  not  help  here,  since  (V*  |«n|)  does  not  converge  because  X'  | an  = ' \/\/2n  = | for  odd  n,  whereas 
for  even  n we  have 


y/\aj  = </2+  1/2"  -»  1 as  n -*<*>, 
so  that  a.t,  has  the  two  limit  points  | and  1 . It  can  further  be  shown  that 
(6**)  R=l/l,  l the  greatest  limit  point  of  the  sequence 

Here  1=1,  so  that  R = 1.  Answer.  The  series  converges  for  |z|  < 1. 


Summary.  Power  series  converge  in  an  open  circular  disk  or  some  even  for  every  z (or 
some  only  at  the  center,  but  they  are  useless);  for  the  radius  of  convergence,  see  (6)  or 
Example  6. 

Except  for  the  useless  ones,  power  series  have  sums  that  are  analytic  functions  (as  we 
show  in  the  next  section);  this  accounts  for  their  importance  in  complex  analysis. 


F~TOBlTE7^~StT~1^^7 


1.  Power  series.  Are  1/z  + z + zz  + ■ ■ ■ andz  + z3^2  + 
z2  + z3  + ■ • • power  series?  Explain. 

2.  Radius  of  convergence.  What  is  it?  Its  role?  What 
motivates  its  name?  How  can  you  find  it? 

3.  Convergence.  What  are  the  only  basically  different 
possibilities  for  the  convergence  of  a power  series? 

4.  On  Examples  1-3.  Extend  them  to  power  series  in 
powers  of  z — 4 + 3iri.  Extend  Example  1 to  the  case 
of  radius  of  convergence  6. 

5.  Powers  z2n.  Show  that  if  tanzn  has  radius  of 
convergence  R (assumed  finite),  then  1,anz2n  has 
radius  of  convergence  V/?. 


6-18 


RADIUS  OF  CONVERGENCE 


Find  the  center  and  the  radius  of  convergence. 


6.  J 4n(z  + If 

n= 0 


7-  2 

n= 0 


(-if 

(2m)! 


8.  2 - 7 (z  - m')" 

n\ 

n= 0 


io.  2 


(z  - 20” 


9-  i^’^-O2” 

n= 0 


2 - i 


n.  2 t 


+ 5/ 


00  , , xU 

12.  Sff 


13.  ^ 16 n(z  + ifn 


-sp  ( 1)  2ra 

14  ^ 22>!)2" 

n= 0 

-if.  V (2/0 1 n 

6‘  2,  2n(n\f  <- 


(2«)l 

15‘  2 4>!f(Z“2° 

n=0 

00  r*n 

, n V z 2n+l 

^ ^ n(n  + 1)  z 


y 2(~  1)"  2n+l 

V7r(2n  + 1)m  ! ~ 


19.  CAS  PROJECT.  Radius  of  Convergence.  Write  a 
program  for  computing  R from  (6),  (6*),  or  (6**),  in 
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this  order,  depending  on  the  existence  of  the  limits 
needed.  Test  the  program  on  some  series  of  your  choice 
such  that  all  three  formulas  (6),  (6*),  and  (6**)  will 
come  up. 

20.  TEAM  PROJECT.  Radius  of  Convergence. 

(a)  Understanding  (6).  Formula  (6)  for  R contains 
\an/an+i\,  not  |aTC+1/aK|.  How  could  you  memorize 
this  by  using  a qualitative  argument? 

(b)  Change  of  coefficients.  What  happens  to  R 
(0  < R < oo)  if  you  (i)  multiply  all  an  by  k =£  0, 


(ii)  multiply  all  an  by  kn  ^ 0,  (iii)  replace  an  by 
1 /anl  Can  you  think  of  an  application  of  this? 

(c)  Understanding  Example  6,  which  extends 
Theorem  2 to  nonconvergent  cases  of  an/an+i . 
Do  you  understand  the  principle  of  “mixing”  by 
which  Example  6 was  obtained?  Make  up  further 
examples. 

(d)  Understanding  (b)  and  (c)  in  Theorem  1.  Does 
there  exist  a power  series  in  powers  of  z that  converges 
at  z = 30  + 101  and  diverges  at  z = 31  — 6 it  Give 
reason. 


15.3  Functions  Given  by  Power  Series 

Here,  our  main  goal  is  to  show  that  power  series  represent  analytic  functions.  This  fact 
(Theorem  5)  and  the  fact  that  power  series  behave  nicely  under  addition,  multiplication, 
differentiation,  and  integration  accounts  for  their  usefulness. 

To  simplify  the  formulas  in  this  section,  we  take  zo  = 0 and  write 


(1)  2 anZl ■ 

n= 0 

There  is  no  loss  of  generality  because  a series  in  powers  of  z — Zo  with  any  zo  can  always 
be  reduced  to  the  form  (1)  if  we  set  z — zo  = z. 

Terminology  and  Notation.  If  any  given  power  series  (1)  has  a nonzero  radius  of 
convergence  R (thus  R > 0),  its  sum  is  a function  of  z,  say  /(z).  Then  we  write 

oo 

(2)  /(z)  = 2 anzn  = a0  + fliz  + a2z2  + ■•■  (|z|  < R). 

n= 0 

We  say  that/(z)  is  represented  by  the  power  series  or  that  it  is  developed  in  the  power 
series.  For  instance,  the  geometric  series  represents  the  function  /(z)  = 1/(1  — z)  in  the 
interior  of  the  unit  circle  |z|  = 1.  (See  Theorem  6 in  Sec.  15.1.) 

Uniqueness  of  a Power  Series  Representation.  This  is  our  next  goal.  It  means  that  a 
function  f(z)  cannot  be  represented  by  two  different  power  series  with  the  same  center. 
We  claim  that  if  /(z)  can  at  all  be  developed  in  a power  series  with  center  zo,  the 
development  is  unique.  This  important  fact  is  frequently  used  in  complex  analysis  (as  well 
as  in  calculus).  We  shall  prove  it  in  Theorem  2.  The  proof  will  follow  from 


THEOREM  1 


Continuity  of  the  Sum  of  a Power  Series 

If  a function  f(z)  can  be  represented  by  a power  series  (2)  with  radius  of  convergence 
R > 0,  then  f(z)  is  continuous  at  z = 0. 
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PROOF  From  (2)  with  2 = 0 we  have  /( 0)  = a0.  Hence  by  the  definition  of  continuity  we 
must  show  that  limz^0  f(z)  = /( 0)  = a0.  That  is,  we  must  show  that  for  a given  e > 0 
there  is  a 8 > 0 such  that  |z|  < 8 implies  |/(z)  — flo  I < e.  Now  (2)  converges  abso- 
lutely for  |z|  g r with  any  r such  that  0 < r < R,  by  Theorem  1 in  Sec.  15.2.  Hence 
the  series 


2 I an\rn  1 

n=  1 


i 

r 


2 I an\rn 


n=  1 


converges.  Let  S # 0 be  its  sum.  (.S'  = 0 is  trivial.)  Then  for  0 < |z|  r. 


I f(z)  ~ a0 1 


X1  n 

2j  anZ 

71=  1 


Ul  X lflnllzl”  1 = Ul  2 lanlrm  1 = \z\S 
n= 1 n=l 


and  |z|5  < e when  Izl  < 5,  where  5 > 0 is  less  than  r and  less  than  e/S.  Hence 
|z|S  < SS  < (e/S)S  = e.  This  proves  the  theorem.  ■ 


From  this  theorem  we  can  now  readily  obtain  the  desired  uniqueness  theorem  (again 
assuming  z0  = 0 without  loss  of  generality): 


THEOREM  2 


Identity  Theorem  for  Power  Series.  Uniqueness 

Let  the  power  series  a0  + uqz  + a2ZZ  + • ■ • and  b0  + b±z  + b2Z2  + • ■ • both  be 
convergent  for  |z|  < R,  where  R is  positive,  and  let  them  both  have  the  same  sum 
for  all  these  z.  Then  the  series  are  identical,  that  is,  a0  = b0,  a\  = b\,  a2  = b2,  • • ■ . 

Hence  if  a function  f(z)  can  be  represented  by  a power  series  with  any  center  z0> 
this  representation  is  unique. 


PROOF  We  proceed  by  induction.  By  assumption, 

a0  + a\z  + a2z2  + • ■ • = b0  + b\z  + b2z2  + ■■■  (|z|  < R). 

The  sums  of  these  two  power  series  are  continuous  at  z = 0,  by  Theorem  1 . Hence  if  we 
consider  Izl  > 0 and  let  z — » 0 on  both  sides,  we  see  that  a0  = b&.  the  assertion  is  true 
for  n = 0.  Now  assume  that  an  = bn  for  n = 0,  1,  • • ■ , m.  Then  on  both  sides  we  may 
omit  the  terms  that  are  equal  and  divide  the  result  by  zm+1  (=A  0);  this  gives 

®m+ 1 T ■ 2 Z T flm+3Z  T bm  + 1 + bm+2Z  T ^m+3Z  T ' ‘ . 

Similarly  as  before  by  letting  2~>0  we  conclude  from  this  that  am+\  = bmA  \ . This 
completes  the  proof. 


Operations  on  Power  Series 

Interesting  in  itself,  this  discussion  will  serve  as  a preparation  for  our  main  goal,  namely, 
to  show  that  functions  represented  by  power  series  are  analytic. 
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Termwise  addition  or  subtraction  of  two  power  series  with  radii  of  convergence  Ri  and 
R2  yields  a power  series  with  radius  of  convergence  at  least  equal  to  the  smaller  of  R\ 
and  R2.  Proof.  Add  (or  subtract)  the  partial  sums  sn  and  s%  term  by  term  and  use 
lim  (sn  ± = lim  sn  ± lim  sJl,. 

Termwise  multiplication  of  two  power  series 

oo 

f(z)  = 2 akZk  = «0  + a\z  + • ■ • 

k= 0 

and 

oo 

g(z)  = 2 bmZm  = b0  + bxz  + ■ ■ ■ 

m= 0 

means  the  multiplication  of  each  term  of  the  first  series  by  each  term  of  the  second  series 
and  the  collection  of  like  powers  of  z.  This  gives  a power  series,  which  is  called  the 
Cauchy  product  of  the  two  series  and  is  given  by 


a0b0  + (ao*i  + ai  bo)z  + (flo^2  + fli&i  + a2b0)zZ  + • ■ ■ 

oo 

= 2 ( aoK  + a\bn—\  + ■ ■ • + anb0)zn. 

n= 0 

We  mention  without  proof  that  this  power  series  converges  absolutely  for  each  z within 
the  smaller  circle  of  convergence  of  the  two  given  series  and  has  the  sum  ,v(z)  = f(z)g(z). 
For  a proof,  see  [D5]  listed  in  App.  1. 


Termwise  differentiation  and  integration  of  power  series  is  permissible,  as  we  show 
next.  We  call  derived  series  of  the  power  series  (1)  the  power  series  obtained  from  (1) 
by  termwise  differentiation,  that  is, 

oo 

(3)  2 = «i  + 2 a2z  + 3 a3z2  + • • • . 

n= 1 


THEOREM  3 


Termwise  Differentiation  of  a Power  Series 

The  derived  series  of  a power  series  has  the  same  radius  of  convergence  as  the 
original  series. 


PROOF  This  follows  from  (6)  in  Sec.  15.2  because 


lim 


= lim 


( n + l)|flm+1|  n + 1 n- 


lim 


®n+l 


lim 

n— » oo 


dn+ 1 


or,  if  the  limit  does  not  exist,  from  (6**)  in  Sec.  15.2  by  noting  that  vn—>  1 as  n ■ 
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EXAMPLE  Application  of  Theorem  3 

Find  the  radius  of  convergence  R of  the  following  series  by  applying  Theorem  3. 

i(”V  = z2  + 3z3  + 6z4  + 10z5 *  + ---. 

n= 2 '2' 

Solution.  Differentiate  the  geometric  series  twice  term  by  term  and  multiply  the  result  by  zz/2.  This  yields 
the  given  series.  Hence  R = 1 by  Theorem  3. 


THEOREM  4 


Termwise  Integration  of  Power  Series 

The  power  series 


2 

n= 0 


Gn  _n+l 

n + 1 4 


a0z  + 


obtained  by  integrating  the  series  ao  + aiz  + a2Z2  + ■ ■ ■ term  by  term  has  the  same 
radius  of  convergence  as  the  original  series. 


The  proof  is  similar  to  that  of  Theorem  3. 

With  the  help  of  Theorem  3,  we  establish  the  main  result  in  this  section. 


Power  Series  Represent  Analytic  Functions 


THEOREM  5 


Analytic  Functions.  Their  Derivatives 

A power  series  with  a nonzero  radius  of  convergence  R represents  an  analytic 
function  at  every  point  interior  to  its  circle  of  convergence.  The  derivatives  of  this 
function  are  obtained  by  differentiating  the  original  series  term  by  term.  All  the 
series  thus  obtained  have  the  same  radius  of  convergence  as  the  original  series. 
Hence,  by  the  first  statement,  each  of  them  represents  an  analytic  function. 


PROOF  (a)  We  consider  any  power  series  (1)  with  positive  radius  of  convergence  R.  Let/(z)  be 
its  sum  and/i(z)  the  sum  of  its  derived  series;  thus 


(4)  f(z)  = 2 anZn  and  fi(z)  = 2 nanZn  * 

n= 0 n= 1 

We  show  that  /(z)  is  analytic  and  has  the  derivative  f\{z)  in  the  interior  of  the  circle  of 

convergence.  We  do  this  by  proving  that  for  any  fixed  z with  z < R and  Az— >0  the 
difference  quotient  [f(z  + Az)  — /(z)]/Az  approaches ^(z).  By  termwise  addition  we  first 
have  from  (4) 


(5) 


f(z  + Az)  - /(z) 
Az 


- /l(z)  = 2 an 

n= 2 


■(z  + A z)n  -Z7' 

Az 


— nz 


n—  1 


Note  that  the  summation  starts  with  2,  since  the  constant  term  drops  out  in  taking  the 
difference /(z  + Az)  — /(z),  and  so  does  the  linear  term  when  we  subtract  /i(z)  from  the 
difference  quotient. 
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(b)  We  claim  that  the  series  in  (5)  can  be  written 


(6) 


2 On  Az[(z  + A z)n  2 + 2z(z  + Az)re  3 + • • • + (n  — 2 )zn  3(z  + Az) 


n = 2 

+ (n  - l)zn_2]. 


The  somewhat  technical  proof  of  this  is  given  in  App.  4. 

(c)  We  consider  (6).  The  brackets  contain  n — 1 terms,  and  the  largest  coefficient  is 
n — 1.  Since  («  — l)2  n(n  — 1),  we  see  that  for  |z|  = Ro  and  |z  + Az|  R0,  R0  < R, 
the  absolute  value  of  this  series  (6)  cannot  exceed 


(7)  I Az|  2 \an\n(n  — \)Rq~2. 

n= 2 

This  series  with  an  instead  of  an  is  the  second  derived  series  of  (2)  at  z = Ro  and 
converges  absolutely  by  Theorem  3 of  this  section  and  Theorem  1 of  Sec.  15.2.  Hence 
our  present  series  (7)  converges.  Let  the  sum  of  (7)  (without  the  factor  | Az|)  be  K(R0). 
Since  (6)  is  the  right  side  of  (5),  our  present  result  is 


/(z  + Az)  - /(z) 

Az 


- fi(z) 


lAzl^o). 


Letting  Az  ^0  and  noting  that  Ro  (<  R)  is  arbitrary,  we  conclude  that/(z)  is  analytic  at 
any  point  interior  to  the  circle  of  convergence  and  its  derivative  is  represented  by  the  derived 
series.  From  this  the  statements  about  the  higher  derivatives  follow  by  induction. 

Summary.  The  results  in  this  section  show  that  power  series  are  about  as  nice  as  we 
could  hope  for:  we  can  differentiate  and  integrate  them  term  by  term  (Theorems  3 and  4). 
Theorem  5 accounts  for  the  great  importance  of  power  series  in  complex  analysis:  the 
sum  of  such  a series  (with  a positive  radius  of  convergence)  is  an  analytic  function  and 
has  derivatives  of  all  orders,  which  thus  in  turn  are  analytic  functions.  But  this  is  only 
part  of  the  story.  In  the  next  section  we  show  that,  conversely,  every  given  analytic  function 
/(z)  can  be  represented  by  power  series,  called  Taylor  series  and  being  the  complex  analog 
of  the  real  Taylor  series  of  calculus. 


1.  Relation  to  Calculus.  Material  in  this  section  gener- 
alizes calculus.  Give  details. 

2.  Termwise  addition.  Write  out  the  details  of  the  proof 
on  termwise  addition  and  subtraction  of  power  series. 

3.  On  Theorem  3.  Prove  that  X'  n — > 1 as  n — » as 
claimed. 

4.  Cauchy  product.  Show  that  (1  - z)-2  = ^ (n  + I )zn 

n= 0 

(a)  by  using  the  Cauchy  product,  (b)  by  differentiating 
a suitable  series. 


5-15 


RADIUS  OF  CONVERGENCE 
BY  DIFFERENTIATION  OR  INTEGRATION 


Find  the  radius  of  convergence  in  two  ways:  (a)  directly  by 
the  Cauchy-Hadamard  formula  in  Sec.  15.2,  and  (b)  from  a 
series  of  simpler  terms  by  using  Theorem  3 or  Theorem  4. 

00  / i\  00/1  \7l  / \2n+l 

, n(-n  - , „.vn 

5-  Zj  — 2T — (z  - 20 


^ (-If  / Z 
“ 2 n + 1 V27T 


7-  S^(z  + 202 

n=  1 


n(n  +1) 
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9-  2 


(-2)”  2n 

z 


n(n  + 1 ){n  + 2) 


10.  2 

n = k 

11.  i 


m 

3 nn(n  + 1 ) 


- (z  + 2)2 


12-  2 


2n(2n  - 1) 


13.  2 

n= 0 L 

14-  2 


n 

n + k 
k 

n + m 
m 


~4nn(n-l).  „ 

1S-  2,  7T (z  “ ;) 
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APPLICATIONS 

OF  THE  IDENTITY  THEOREM 


State  clearly  and  explicitly  where  and  how  you  are  using 
Theorem  2. 

16.  Even  functions.  If  f(z)  in  (2)  is  even  (i.e., 
/(— z)  = /(z)),  show  that  an  = 0 for  odd  n.  Give 
examples. 


17.  Odd  function.  If/(z)  in  (2)  is  orM  (i.e.,  /(— z)  = — /(z)), 
show  that  an  — 0 for  even  n.  Give  examples. 

18.  Binomial  coefficients.  Using  (1  + z)p(l  + z)q  = 
(1  + z)p  + q,  obtain  the  basic  relation 


19.  Find  applications  of  Theorem  2 in  differential  equa- 
tions and  elsewhere. 

20.  TEAM  PROJECT.  Fibonacci  numbers.2  (a)  The 

Fibonacci  numbers  are  recursively  defined  by 

flO  r/j  1,  CVyi+\  — l tf  n 1, 2,  . 

Find  the  limit  of  the  sequence  ( an  + i/an ). 

(b)  Fibonacci’s  rabbit  problem.  Compute  a list  of 
at,  ■ ■ ■ , ai2-  Show  that  a12  = 233  is  the  number 
of  pairs  of  rabbits  after  12  months  if  initially  there 
is  1 pair  and  each  pair  generates  1 pair  per  month, 
beginning  in  the  second  month  of  existence  (no  deaths 
occurring). 

(c)  Generating  function.  Show  that  the  generating 
function  of  the  Fibonacci  numbers  is  f{z)  = 
1/(1  — z ~ z2)',  that  is,  if  a power  series  (1)  represents 
this/(z),  its  coefficients  must  be  the  Fibonacci  numbers 
and  conversely.  Hint.  Start  from/(z)(l  — z ~ z2)  = 1 
and  use  Theorem  2. 


15.^  Taylor  and  Maclaurin  Series 

The  Taylor  series3  of  a function /(z),  the  complex  analog  of  the  real  Taylor  series  is 

(1)  f(z)  = 2 an(z  ~ Zo)n  where  an  = — /(n)(z0) 

n\ 


or,  by  (1),  Sec.  14.4, 


(2) 


Cln. 


2iri 


f(z*) 


(z*  - Zof 


iT  dz*. 


In  (2)  we  integrate  counterclockwise  around  a simple  closed  path  C that  contains  zo  in  its 
interior  and  is  such  that/(z)  is  analytic  in  a domain  containing  C and  every  point  inside  C. 
A Maclaurin  series3  is  a Taylor  series  with  center  zq  = 0. 


2LEONARDO  OF  PISA,  called  FIBONACCI  (=  son  of  Bonaccio),  about  1180-1250,  Italian  mathematician, 
credited  with  the  first  renaissance  of  mathematics  on  Christian  soil. 

3BROOK  TAYLOR  (1685-1731),  English  mathematician  who  introduced  real  Taylor  series.  COLIN 
MACLAURIN  (1698-1746),  Scots  mathematician,  professor  at  Edinburgh. 
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THEOREM  1 


PROOF 


The  remainder  of  the  Taylor  series  (1)  after  the  term  an(z  — Zo)n  is 


(3) 


Rn(.z)  = 


(z  ~ zo ) 


n+1 


277/ 


Jc 


f(z*) 

(Z*  - z0 )n+\z*  - z) 


dz* 


(proof  below).  Writing  out  the  corresponding  partial  sum  of  (1),  we  thus  have 


/(z)  =/(z0)  + 


z-zo  . 


(4) 


+ 


1! 

(z  - Zo)" 


fXz  o)  + 


(z  - Zo) 


2! 


~f  (zo)  + 


fn\zo)  + Rn(z). 


This  is  called  Taylor’s  formula  with  remainder. 

We  see  that  Taylor  series  are  power  series.  From  the  last  section  we  know  that  power 
series  represent  analytic  functions.  And  we  now  show  that  every  analytic  function  can  be 
represented  by  power  series,  namely,  by  Taylor  series  (with  various  centers).  This  makes 
Taylor  series  very  important  in  complex  analysis.  Indeed,  they  are  more  fundamental  in 
complex  analysis  than  their  real  counterparts  are  in  calculus. 


Taylor’s  Theorem 

Let  f(z)  be  analytic  in  a domain  D,  and  let  z = Zo  be  any  point  in  D.  Then  there 
exists  precisely  one  Taylor  series  (1)  with  center  z0  that  represents  f(z).  This 
representation  is  valid  in  the  largest  open  disk  with  center  zo  hi  which  f(z)  is  analytic. 
The  remainders  Rn(z)  of  ( 1)  can  be  represented  in  the  form  (3).  The  coefficients 
satisfy  the  inequality 


(5) 


an  ^ 


M 


where  M is  the  maximum  of  \f(z)\  on  a circle  |z  — zol  = r in  D whose  interior  is 
also  in  D. 


The  key  tool  is  Cauchy’s  integral  formula  in  Sec.  14.3;  writing  z and  z*  instead  of  zo  and 
Z (so  that  z*  is  the  variable  of  integration),  we  have 


(6) 


f(z*) 
Z*  - z 


dz*. 


z lies  inside  C,  for  which  we  take  a circle  of  radius  r with  center  zo  and  interior  in  D 
(Fig.  367).  We  develop  l/(z*  — z)  in  (6)  in  powers  of  z — Zo-  By  a standard  algebraic 
manipulation  (worth  remembering!)  we  first  have 


(7) 


_J 1 

Z*  - Z Z*  - Zo  _ (z  - Zo) 


1 


(z*  - Zo) 
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Fig.  367.  Cauchy  formula  (6) 


For  later  use  we  note  that  since  z*  is  on  C while  z is  inside  C,  we  have 


(7*) 


< 1. 


To  (7)  we  now  apply  the  sum  formula  for  a finite  geometric  sum 


(8*) 


1 + q + ■■■  + qn 


1 - qn+1 
1 -q 


q 


n+ 1 


1 - q 


(Fig.  367). 


(q  * 1), 


which  we  use  in  the  form  (take  the  last  term  to  the  other  side  and  interchange  sides) 

n+1 

(8)  = 1 + q + • • • T qn  + . 

1 — q 1 — q 

Applying  this  with  q = (z  — Zo)/(z*  ~ Zo)  to  the  right  side  of  (7),  we  get 


1 


Z*  - z 


We  insert  this  into  (6).  Powers  of  z ~ Zo  do  not  depend  on  the  variable  of  integration  z*, 
so  that  we  may  take  them  out  from  under  the  integral  sign.  This  yields 


/(z)  = -’t  l f(z*}  dz*  + 

27 Tl  ^ Z*  - Zo  27 Tl 


JC 


f(z*)  _ 
(z*  - Z0): 


+ 


(z  - Zof 

277; 


■ dz * + 


f(z*) 


Jc 


(z*  - Zof 


PX  dz*  + Rn{z ) 


with  Rn(z)  given  by  (3).  The  integrals  are  those  in  (2)  related  to  the  derivatives,  so  that 
we  have  proved  the  Taylor  formula  (4). 

Since  analytic  functions  have  derivatives  of  all  orders,  we  can  take  n in  (4)  as  large  as 
we  please.  If  we  let  n approach  infinity,  we  obtain  (1).  Clearly,  (1)  will  converge  and 
represent /(z)  if  and  only  if 


lim  Rn(z)  = 0. 

n— »°o 


(9) 
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THEOREM  2 


PROOF 


We  prove  (9)  as  follows.  Since  z*  lies  on  C,  whereas  z lies  inside  C (Fig.  367),  we  have 
|z*  — z|  > 0.  Since /(z)  is  analytic  inside  and  on  C,  it  is  bounded,  and  so  is  the  function 
/(z*)/(z*  - z),  say, 


/(z*) 
Z*  - z 


::  M 


for  all  z*  on  C.  Also,  C has  the  radius  r = |z*  — zol  and  the  length  277 r.  Hence  by  the 
ML-inequality  (Sec.  14.1)  we  obtain  from  (3) 


= 


lz  - Zol 

277 


|n+l 


(10) 


Jc 


/(z*) 

(z*  - zo)n+\z*  - z ) 


dz* 


\n+l 


|Z  - Zol'”  ~ 1 

~ M 2777  = M 

Lit  7I.+1 

r 


z - z o 

r 


n+ 1 


Now  |z  — zol  < r because  z lies  inside  C.  Thus  |z  — zol/r  < 1,  so  that  the  right  side 
approaches  0 as  n — > 00 . This  proves  that  the  Taylor  series  converges  and  has  the  sum/(z). 
Uniqueness  follows  from  Theorem  2 in  the  last  section.  Finally,  (5)  follows  from  cin  in 
(1)  and  the  Cauchy  inequality  in  Sec.  14.4.  This  proves  Taylor’s  theorem. 


Accuracy  of  Approximation.  We  can  achieve  any  preassinged  accuracy  in  approxi- 
mating /(z)  by  a partial  sum  of  (1)  by  choosing  n large  enough.  This  is  the  practical  use 
of  formula  (9). 


Singularity,  Radius  of  Convergence.  On  the  circle  of  convergence  of  (1)  there  is  at 
least  one  singular  point  of  /(z),  that  is,  a point  z = c at  which  /(z)  is  not  analytic 
(but  such  that  every  disk  with  center  c contains  points  at  which  /(z)  is  analytic).  We 
also  say  that  /(z)  is  singular  at  c or  has  a singularity  at  c.  Hence  the  radius  of  con- 
vergence R of  (1)  is  usually  equal  to  the  distance  from  zo  to  the  nearest  singular  point 
of  /(z). 

(Sometimes  R can  be  greater  than  that  distance:  Ln  z is  singular  on  the  negative  real 
axis,  whose  distance  from  zq  = — 1 + i is  1,  but  the  Taylor  series  of  Ln  z with  center 
Zo  = — 1 + i has  radius  of  convergence  V2.) 


Power  Series  as  Taylor  Series 

Taylor  series  are  power  series — of  course!  Conversely,  we  have 


Relation  to  the  Previous  Section 

A power  series  with  a nonzero  radius  of  convergence  is  the  Taylor  series  of  its  sum. 


Given  the  power  series 

/(z)  = flo  + fll(z  “ Zo)  + fl2 (z  - Zo)2  + «3(z  - Zo)3  + ‘ ’ ■ 
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EXAMPLE  2 


Then  /(z0)  = Oq.  By  Theorem  5 in  Sec.  15.3  we  obtain 

f'(z)  = fli  + 2 a2(z  ~ Jo)  + 3fl3(z  - Zof  + • ' • , thus  f'{zQ)  = ax 

f"(z ) = 2a2  + 3 • 2(z  - zo)  + ' ' • , thus  f"(z0)  = 2 \a2 

and  in  general  f^izo)  = n\an.  With  these  coefficients  the  given  series  becomes  the  Taylor 
series  of  f{z)  with  center  ~0. 

Comparison  with  Real  Functions.  One  surprising  property  of  complex  analytic 
functions  is  that  they  have  derivatives  of  all  orders,  and  now  we  have  discovered  the  other 
surprising  property  that  they  can  always  be  represented  by  power  series  of  the  form  (1). 
This  is  not  true  in  general  for  real  functions;  there  are  real  functions  that  have  derivatives 
of  all  orders  but  cannot  be  represented  by  a power  series.  (Example: /(x)  = exp  ( — 1/x2) 
if  x ~t~  0 and  /( 0)  = 0;  this  function  cannot  be  represented  by  a Maclaurin  series  in  an 
open  disk  with  center  0 because  all  its  derivatives  at  0 are  zero.) 

Important  Special  Taylor  Series 

These  are  as  in  calculus,  with  x replaced  by  complex  z.  Can  you  see  why?  ( Answer . The 
coefficient  formulas  are  the  same.) 

Geometric  Series 

Let/(j)  = 1/(1  — z).  Then  we  have  f<n\z)  = n\/(  1 — z)n+1,fn\ 0)  = «!.  Hence  the  Maclaurin  expansion  of 
1/(1  — z)  is  the  geometric  series 


(11)  — ' — = 2 z.n  = 1 + Z + z2  + "• 

1 Z n= 0 

f(z ) is  singular  at  z — 1;  this  point  lies  on  the  circle  of  convergence. 


(\z\  < 1). 


Exponential  Function 

We  know  that  the  exponential  function  ez  (Sec.  13.5)  is  analytic  for  all  z,  and  ( ez )'  = ez.  Hence  from  (1)  with 
Zo  = 0 we  obtain  the  Maclaurin  series 


n 


(12) 


„ z"  z2 

ez=  y — = 1 + z + — ■ 

^ n 1 2' 

n= 0 


This  series  is  also  obtained  if  we  replace  x in  the  familiar  Maclaurin  series  of  ex  by  z- 

Furthermore,  by  setting  z — iy  in  (12)  and  separating  the  series  into  the  real  and  imaginary  parts  (see  Theorem 
2,  Sec.  15.1)  we  obtain 


00  (jv\n  00  v2k  00  v2fc  + 1 

**=  2 — = 2 (-Dfc—  + /2  (-Dfc^ 

„“o  n\  (2 k)\  ±n  (2k  + 1)! 


k= 0 


Since  the  series  on  the  right  are  the  familiar  Maclaurin  series  of  the  real  functions  cos  y and  sin  y,  this  shows 
that  we  have  rediscovered  the  Euler  formula 


(13) 


i sin  y. 


Indeed,  one  may  use  (12)  for  defining  ez  and  derive  from  (12)  the  basic  properties  of  ez.  For  instance,  the 
differentiation  formula  ( ez)'  = ez  follows  readily  from  (12)  by  termwise  differentiation. 
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EXAMPLE  3 


EXAMPLE  4 


EXAMPLE  5 


Trigonometric  and  Hyperbolic  Functions 

By  substituting  (12)  into  (1)  of  Sec.  13.6  we  obtain 


(14) 


2 n 


COS  Z = 


2 (-D" 

n= 0 


(2n)! 


+ ... 


sin  z = 


Et-D1 

n= 0 


2n+l 

Z 

(2  n + 1)! 


+ •••. 


When  z — x these  are  the  familiar  Maclaurin  series  of  the  real  functions  cos  x and  sin  x.  Similarly,  by  substituting 
(12)  into  (11),  Sec.  13.6,  we  obtain 


(15) 


cosh  z = E = 1 

»T0  (2*0! 

oo  2n+l 

sinh  z = 

„=o  (2»  + D! 


Logarithm 

From  (1)  it  follows  that 


(16) 


Ln(l  + z)  = z - — + — 
2 3 


.2  t3 

-Ln  ( 1 - z)  = Ln = z + — + — - 

1 - z 2 3 


Replacing  z by  — z and  multiplying  both  sides  by  —1,  we  get 

(17) 

By  adding  both  series  we  obtain 

1 + Z ( Z3  Z5 

(18)  Ln = 2 1 z H 1 V ■ 

1 - Z V 3 5 


(Izl  < 1). 


(Id  < 1). 


(Izl  < 1).  ■ 


Practical  Methods 

The  following  examples  show  ways  of  obtaining  Taylor  series  more  quickly  than  by  the 
use  of  the  coefficient  formulas.  Regardless  of  the  method  used,  the  result  will  be  the  same. 
This  follows  from  the  uniqueness  (see  Theorem  1). 

Substitution 

Find  the  Maclaurin  series  of/(z)  = 1/(1  + z2). 

Solution.  By  substituting  — ;2  for  z in  (11)  we  obtain 

(19)  — 1 — = y~—  = 2 (-z2)n=  2 {-\)nz2rl=  1 - z2  + z4  - z6  + •••  (Izl  < 1).  ■ 

1 + z 1 - (— Z ) n= 0 71=0 
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EXAMPLE  6 


EXAMPLE  7 


EXAMPLE  8 


Integration 

Find  the  Maclaurin  series  of  j'(z)  = arctan  z. 

Solution.  We  have  f\z)  = 1/(1  + z2).  Integrating  (19)  term  by  term  and  using /(0)  = 0 we  get 

“ (-1)™  73  75 

arctan  z = 2 L Z2"+1  = Z ~ ~ + ~ - + ■ ■ ■ (kl  < 1); 

»-o  2n  + 1 35 

this  series  represents  the  principal  value  of  w = » + iv  = arctan  z defined  as  that  value  for  which 
\u\  < 7t/2.  I 


Development  by  Using  the  Geometric  Series 

Develop  l/(c  — z)  in  powers  of  z — Zo.  where  c — Zo  ^ 0. 

Solution.  This  was  done  in  the  proof  of  Theorem  1,  where  c = z*.  The  beginning  was  simple  algebra  and 
then  the  use  of  (11)  with  z replaced  by  (z  — Zo)/(c  — Zo): 


1 


1 


C - Z C - Zo  - (z  - Zo) 


(c  - Zo)  1 ~ 


“7^77  2 brr 


z - zo\  c - Z 0 _n  \C  - Zo 
C - Zo 


Z Zo  f z Zo 


1 + 

c - Zo  v C - Zo  vc  “ Zo 


This  series  converges  for 


z Zo 
c ~ Zo 


< 1,  that  is,  |z  - zol  < |c  - Zol- 


Binomial  Series,  Reduction  by  Partial  Fractions 

Find  the  Taylor  series  of  the  following  function  with  center  zq  = 1. 


/(z)  = 


2z2  + 9z  + 5 
z3  + z2  - 8z  - 12 


Solution.  We  develop /(z)  in  partial  fractions  and  the  first  fraction  in  a binomial  series 


(20) 


= (l  + zTm  = 2 

(l  + z)m 


—m 

n 


= \ — mz 


m(m  +1)  _ m(m  + 1 )(m  + 2)  _ 

z2 z3  + ■ 

2!  3! 


with  m = 2 and  the  second  fraction  in  a geometric  series,  and  then  add  the  two  series  term  by  term.  This  gives 


1 2 

/(Z)  = o + 


(z  + 2)2  z - 3 [3  + (z  - l)]2  2 - (z  - 1)  9 V[1  + |(z  - 1 )fJ  1 - |(z  - D 


= 72 


-2  \/z  - 1 


- 2 

71=0 


Z — 1 
2 


= 2 

71=0 


(-!)>  + 1)  _ 

^71+2  2n 


(z  - If 


8 31  23  o 275  . 

= (z  — 1) (z  - l)2 (z  — l)3 . 

9 54  108  1944 


We  see  that  the  first  series  converges  for  | z ~ 1 1 <3  and  the  second  for  | z ~ 1 1 <2.  This  had  to  be  expected 
because  l/(z  + 2)  is  singular  at  —2  and  2/{z  ~ 3)  at  3,  and  these  points  have  distance  3 and  2,  respectively, 
from  the  center  zo  = 1-  Hence  the  whole  series  converges  for  | z ~ 1 1 <2. 
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FRQB1-EZM=S^FT— 1^4 


1.  Calculus.  Which  of  the  series  in  this  section  have  you 
discussed  in  calculus?  What  is  new? 

2.  On  Examples  5 and  6.  Give  all  the  details  in  the 
derivation  of  the  series  in  those  examples. 


3-10 


MACLAURIN  SERIES 


Find  the  Maclaurin  series  and  its  radius  of  convergence. 


3.  sin  2z 
1 


5. 


2 + z4 


7. 

COS2  \z 

f z 

9. 

exp  ( 

—t 

2 


dt 


4. 


6. 


z + 2 

, 2 
1 - Z 

1 

1 + 3 iz 


8.  sin2  z 


10.  exp  (z  ) exp  (— t ) dt 


11-14 


HIGHER  TRANSCENDENTAL 
FUNCTIONS 


Find  the  Maclaurin  series  by  termwise  integrating  the 
integrand.  (The  integrals  cannot  be  evaluated  by  the  usual 
methods  of  calculus.  They  define  the  error  function  erf  z, 
sine  integral  Si(z),  and  Fresnel  integrals4  S(z)  and  C(z), 
which  occur  in  statistics,  heat  conduction,  optics,  and  other 
applications.  These  are  special  so-called  higher  transcen- 
dental functions.) 

dt  12.  C(z)  = f cos  t2  dt 
■'o 

13.  erf  z = — I e~l  dt  14.  Si(z)  = [ — — dt 

Vtt  )0  )0  t 

15.  CAS  Project,  sec,  tan.  (a)  Euler  numbers.  The 

Maclaurin  series 


11.  S(z)  = 


sin  t 


(21) 


sec  z = E o — 


+ ■ ■ ■ 


defines  the  Euler  numbers  E2n.  Show  that  Eg  = 

E2  = — 1,  £4  = 5,  Eg  — —61.  Write  a program  that 
computes  the  E2n  from  the  coefficient  formula  in  (1) 
or  extracts  them  as  a list  from  the  series.  (For  tables 
see  Ref.  [GenRefl],  p.  810,  listed  in  App.  1.) 

(b)  Bernoulli  numbers.  The  Maclaurin  series 


(22) 


z 


ez  - 1 


— 1 + Bjz  + 


+ ■ • • 


defines  the  Bernoulli  numbers  Bn.  Using  undetermined 
coefficients,  show  that 

(23)  Sl=  “2-  fi2  = S’  fi3  = 0’ 

E4  = 30  , Bg  = 0,  Bg  = 42  , ' ' ' • 

Write  a program  for  computing  Bn. 

(c)  Tangent.  Using  (1),  (2),  Sec.  13.6,  and  (22),  show 
that  tan  z has  the  following  Maclaurin  series  and 
calculate  from  it  a table  of  Bg,  • • • , B2q'. 


(24)  tan  z = 


2 i 

e2iz  - 1 


= 2 (-if 


4i 

e4iz  - 1 
22n(22n  - l) 


(2m)! 


B2nz 


16.  Inverse  sine.  Developing  l/\/ 1 — zz  and  integrating, 
show  that 


, 1 \ z 

arcsin  z = Z + I — I E 


1 • 3\z 


2-4/5 


1 • 3 • 5\z  , , 

2 • 4 • 6 j 7 + " < 


Show  that  this  series  represents  the  principal  value  of 
arcsin  z (defined  in  Team  Project  30,  Sec.  13.7). 

17.  TEAM  PROJECT.  Properties  from  Maclaurin 
Series.  Clearly,  from  series  we  can  compute  function 
values.  In  this  project  we  show  that  properties  of 
functions  can  often  be  discovered  from  their  Taylor  or 
Maclaurin  series.  Using  suitable  series,  prove  the 
following. 

(a)  The  formulas  for  the  derivatives  of  ez,  cos  z,  sin  z, 
cosh  z,  sinh  z.  and  Ln  ( 1 + z) 

(b)  2(elz  + e~lz)  = cosz 

(c)  sin  z # 0 for  all  pure  imaginary  z = iy  + 0 


18-25 


TAYLOR  SERIES 


Find  the  Taylor  series  with  center  zo  and  its  radius  of 
convergence. 

18.  1/z,  zo  = i 19.  1/(1  - z),  z0  = i 

20.  cos2z,  Zo  = 'tr/2  21.  sinz,  z 0 = 7r/ 2 

22.  cosh  (z  — 7 ti),  zo  = 7 Ti 


23.  l/(z  + if,  Zo  = i 


24.  e 


ztz—2) 


Zo  = 1 


25.  sinh  (2z  - ().  Zo  = i/2 


^AUGUSTIN  FRESNEL  (1788-1827),  French  physicist  and  engineer,  known  for  his  work  in  optics. 
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15.!  Uniform  Convergence.  Optional 

We  know  that  power  series  are  absolutely  convergent  (Sec.  15.2,  Theorem  1)  and,  as 
another  basic  property,  we  now  show  that  they  are  uniformly  convergent.  Since  uniform 
convergence  is  of  general  importance,  for  instance,  in  connection  with  termwise  integration 
of  series,  we  shall  discuss  it  quite  thoroughly. 

To  define  uniform  convergence,  we  consider  a series  whose  terms  are  any  complex 
functions /0(z),/i(z),  ■ • ■ 


(1)  2 frniz)  = /o(z)  + fl(z)  + f2(z)  + ■■■. 

m= 0 

(This  includes  power  series  as  a special  case  in  which  fm(z)  = am(z  — Zo)m-)  We  assume 
that  the  series  (1)  converges  for  all  z in  some  region  G.  We  call  its  sum  s(z)  and  its  nth 
partial  sum  .vn(z);  thus 


Sn(z)  = /o(z)  + A(Z)  + ' ' ' + fn(z). 

Convergence  in  G means  the  following.  If  we  pick  a z = Zi  in  G,  then,  by  the  definition 
of  convergence  at  z i,  for  given  e > Owe  can  find  an  NJe)  such  that 

U(zi)  - .tn(zi)  < e for  all  n > NJe). 

If  we  pick  a z 2 in  G,  keeping  e as  before,  we  can  find  an  N2(e)  such  that 

U(Z2>  - sn(z2)  I < e for  all  n > NJe), 

and  so  on.  Hence,  given  an  e > 0,  to  each  z in  G there  corresponds  a number  NJe).  This 
number  tells  us  how  many  terms  we  need  (what  sn  we  need)  at  a z to  make  |s(z)  — sn(z)| 
smaller  than  e.  Thus  this  number  NJe)  measures  the  speed  of  convergence. 

Small  Nz(e)  means  rapid  convergence,  large  Nz(e)  means  slow  convergence  at  the  point 
Z considered.  Now,  if  we  can  find  an  N(e)  larger  than  all  these  NJe)  for  all  z in  G,  we 
say  that  the  convergence  of  the  series  (1)  in  G is  uniform.  Hence  this  basic  concept  is 
defined  as  follows. 


DEFINITION 


Uniform  Convergence 

A series  (1)  with  sum  .v  (z.)  is  called  uniformly  convergent  in  a region  G if  for  every 
e > 0 we  can  find  an  N = N(e),  not  depending  on  z,  such  that 

|s(z)  — sn(z) | < e for  all  n > N(e)  and  all  z.  in  G. 

Uniformity  of  convergence  is  thus  a property  that  always  refers  to  an  infinite  set  in 
the  z-plane,  that  is,  a set  consisting  of  infinitely  many  points. 


EXAMPLE  Geometric  Series 

Show  that  the  geometric  series  1 + z + z2  + ■ • • is  (a)  uniformly  convergent  in  any  closed  disk  z|  = r < 1 , 
(b)  not  uniformly  convergent  in  its  whole  disk  of  convergence  |z|  < 1. 


SEC.  15.5  Uniform  Convergence.  Optional 


699 


THEOREM  1 


PROOF 


Solution,  (a)  For  z in  that  closed  disk  we  have  |l  — z|  S 1 — r (sketch  it).  This  implies  that 
1/1 1 — z\  = 1/(1  — r).  Hence  (remember  (8)  in  Sec.  15.4  with  q = z) 


Uz)  “ sn(z)  I 


oc 

n+l 

Z 

1 - z 

m=n+ 1 

r 


n+1 


1 - r ' 


Since  r < 1,  we  can  make  the  right  side  as  small  as  we  want  by  choosing  n large  enough,  and  since  the  right 
side  does  not  depend  on  z (in  the  closed  disk  considered),  this  means  that  the  convergence  is  uniform. 

(b)  For  given  real  K (no  matter  how  large)  and  n we  can  always  find  a z in  the  disk  \z\  < 1 such  that 


u 


n+l 


> K , 


simply  by  taking  z close  enough  to  1.  Hence  no  single  N(e)  will  suffice  to  make  | .s  (z)  — sn(z)  smaller  than  a 
given  € > 0 throughout  the  whole  disk.  By  definition,  this  shows  that  the  convergence  of  the  geometric  series 
in  z < 1 is  not  uniform. 


This  example  suggests  that  for  a power  series,  the  uniformity  of  convergence  may  at  most 
be  disturbed  near  the  circle  of  convergence.  This  is  true: 


Uniform  Convergence  of  Power  Series 

A power  series 

(2)  2 am(z  ~ z0)m 

m= 0 

with  a nonzero  radius  of  convergence  R is  uniformly  convergent  in  every  circular 
disk  |z  — Zol  = r of  radius  r < R. 


For  |z  — z0l  = r tinh  any  positive  integers  n and  p we  have 

(3)  \an+1(z  - z0)n+1  + ' ' ■ + an+p(z  - z0)n+p I =§  \an+1\rn+1  + ■ ■ • + \an+p\rn+p. 

Now  (2)  converges  absolutely  if  |z  — z0l  = r < R (by  Theorem  1 in  Sec.  15.2).  Hence 
it  follows  from  the  Cauchy  convergence  principle  (Sec.  15.1)  that,  an  e > 0 being  given, 
we  can  find  an  N(e ) such  that 

|flm+i|rn+1  + ■ ■ • + \an+p\rn+v  < e for  n > N(e)  and  p=  1,2, 

From  this  and  (3)  we  obtain 


\an+1(z  ~ z0)n+1  + ■ ■ • + an+p(z  ~ z0)n+p I < e 

for  all  z in  the  disk  z — z<)\  = r,  every  n > N (e),  and  every  p = 1,  2,  • • • . Since  N(e)  is 
independent  of  z,  this  shows  uniform  convergence,  and  the  theorem  is  proved. 

Thus  we  have  established  uniform  convergence  of  power  series,  the  basic  concern  of  this 
section.  We  now  shift  from  power  series  to  arbitary  series  of  variable  terms  and  examine 
uniform  convergence  in  this  more  general  setting.  This  will  give  a deeper  understanding 
of  uniform  convergence. 
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THEOREM  2 


PROOF 


Properties  of  Uniformly  Convergent  Series 

Uniform  convergence  derives  its  main  importance  from  two  facts: 

1.  If  a series  of  continuous  terms  is  uniformly  convergent,  its  sum  is  also  continuous 
(Theorem  2,  below). 

2.  Under  the  same  assumptions,  termwise  integration  is  permissible  (Theorem  3). 
This  raises  two  questions: 

1.  How  can  a converging  series  of  continuous  terms  manage  to  have  a discontinuous 
sum?  (Example  2) 

2.  How  can  something  go  wrong  in  termwise  integration?  (Example  3) 

Another  natural  question  is: 

3.  What  is  the  relation  between  absolute  convergence  and  uniform  convergence?  The 
surprising  answer:  none.  (Example  5) 

These  are  the  ideas  we  shall  discuss. 

If  we  add  finitely  many  continuous  functions,  we  get  a continuous  function  as  their  sum. 
Example  2 will  show  that  this  is  no  longer  true  for  an  infinite  series,  even  if  it  converges 
absolutely.  However,  if  it  converges  uniformly,  this  cannot  happen,  as  follows. 


Continuity  of  the  Sum 

Let  the  series 

oo 

2 frniz)  =Mz ) +/i(z)  + ••• 
m= 0 

be  uniformly  convergent  in  a region  G.  Let  F(z)  be  its  sum.  Then  if  each  termfm(z) 
is  continuous  at  a point  z\  in  G,  the  function  F(z)  is  continuous  at  z.\. 


Let  sn(z ) be  the  nth  partial  sum  of  the  series  and  Rn(z)  the  corresponding  remainder: 

sn  = fo  + fl  + • ' • + /n>  Rn  = fn+ 1 + fn  + 2 + ‘ ' ‘ • 

Since  the  series  converges  uniformly,  for  a given  e > 0 we  can  find  an  N = N(e)  such  that 

\Rn(z)\  < ^ for  all  z in  G. 

Since  sN(z)  is  a sum  of  finitely  many  functions  that  are  continuous  at  z i,  this  sum  is 
continuous  at  Z\.  Therefore,  we  can  find  a 8 > 0 such  that 

sN(z)  - i’jvfei)!  < ^ for  all  z in  G for  which  z — zj  <5. 

Using  F = Sjv  + rn  and  the  triangle  inequality  (Sec.  13.2),  for  these  z we  thus  obtain 

I F(z)  - F(zi)\  = kv(z)  + Rn(z.)  ~ [sjvOii)  + RN(Z!)]\ 

= Ujv(z)  - sN(zi)\  + |/?jv(z)|  + \Rn(zi)\  <f  + f + f = e- 


This  implies  that  F{z)  is  continuous  at  zi,  and  the  theorem  is  proved. 
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EXAMPLE  2 


Series  of  Continuous  Terms  with  a Discontinuous  Sum 

Consider  the  series 

2 2 2 
, x x x 

x2  H 1 1 f • • ■ 

1 + x2  (1  + x2)2  (1  + x2)3 

This  is  a geometric  series  with  q = 1/(1  + x2)  times  a factor  x2.  Its  nth  partial  sum  is 


(x  real). 


Sn(x)  = X 


1 + 


1 + X 


2 (1  + x2)2 


(1  + x2)n 


We  now  use  the  trick  by  which  one  finds  the  sum  of  a geometric  series,  namely,  we  multiply  5n(x)  by 
-q  = -1/(1  + x2). 


i + jr 


: 


■ + •••  + ■ 


A + x* 


(1  + x2)n  (1  + rT 


Adding  this  to  the  previous  formula,  simplifying  on  the  left,  and  canceling  most  terms  on  the  right,  we  obtain 


1 + 


- sn(x)  = x 


1 - 


thus 


1 

(1  + x2)n+1 


sjx)  = 1 + x - - 


The  exciting  Fig.  368  “explains”  what  is  going  on.  We  see  that  if  x =£  0,  the  sum  is 

s (x)  = lim  5n(x)  = 1 + x2, 

n— »oo 

but  for  x = 0 we  have  ,sn(0)  =1  — 1=0  for  all  n , hence  5(0)  = 0.  So  we  have  the  surprising  fact  that  the  sum 
is  discontinuous  (at  x = 0),  although  all  the  terms  are  continuous  and  the  series  converges  even  absolutely  (its 
terms  are  nonnegative,  thus  equal  to  their  absolute  value!). 

Theorem  2 now  tells  us  that  the  convergence  cannot  be  uniform  in  an  interval  containing  x = 0.  We  can  also 
verify  this  directly.  Indeed,  for  x =£  0 the  remainder  has  the  absolute  value 


|ftn(x)|  = \s(x)  - 5n(x)|  = — - 

(1  + x2)n 

and  we  see  that  for  a given  e (<1)  we  cannot  find  an  N depending  only  on  e such  that  |/?n|  < e for  all  n > N(e) 
and  all  x,  say,  in  the  interval  0 ^ x ^ 1 . 


Termwise  Integration 

This  is  our  second  topic  in  connection  with  uniform  convergence,  and  we  begin  with  an 
example  to  become  aware  of  the  danger  of  just  blindly  integrating  term-by-term. 
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Series  for  Which  Termwise  Integration  Is  Not  Permissible 

Let  um(x ) = mxe_ml  and  consider  the  series 

2 /nW  where  /mW  = »mW  “ “m-lW 

ra=0 

in  the  interval  0 ^ x ^ 1 . The  nth  partial  sum  is 

Sn  U0  if 2 Mi  "T  T Mn  Un—  1 Wn  Wo 

Hence  the  series  has  the  sum  F(x ) = lim  sn(x)  = lim  wn(x)  = 0 (0  = x = 1).  From  this  we  obtain 


F(x)  dx  = 0. 


On  the  other  hand,  by  integrating  term  by  term  and  using  A + fz  + * * ■ + fn  = sn,  we  have 

oo  rl  nr1  r1 

2 /m(*)  dx  = ljm  2 fm(x)  dx  = lim  $«(*)  dx. 

m=l  ^0  m=l  0 n -’o 

Now  sn  = Mn  and  the  expression  on  the  right  becomes 

r 1 rl 


lim  I un(x)  dx  = lim  I nxe  dx  = lim  — (1  — e n)  = , 

- — i - — 1 n—’-i-  2 2 


o 


but  not  0.  This  shows  that  the  series  under  consideration  cannot  be  integrated  term  by  term  from  x = 0 to 
x = I. 

The  series  in  Example  3 is  not  uniformly  convergent  in  the  interval  of  integration,  and 
we  shall  now  prove  that  in  the  case  of  a uniformly  convergent  series  of  continuous 
functions  we  may  integrate  term  by  term. 


THEOREM  3 


Termwise  Integration 

Let 


F{Z)  = 2 fmiz)  = MZ)  + fl(z)  + ■ • • 

m= 0 

be  a uniformly  convergent  series  of  continuous  functions  in  a region  G.  Let  C be 
any  path  in  G.  Then  the  series 


(4) 


fm(z)  dz  = 


m= 0 C 


fo(z)  dz  + 


Jc 


fi(z)  dz  + ■■ 


Jc 


is  convergent  and  has  the  sum 


F(z)  dz- 


PROOF  From  Theorem  2 it  follows  that  F(z)  is  continuous.  Let  sn(z)  be  the  nth  partial  sum  of  the 
given  series  and  Rn(z ) the  corresponding  remainder.  Then  F = sn  + Rn  and  by  integration. 


F(z)  dz  = sn(z)  dz  + Rn(z)  dz. 


Jc 


Jc 
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Let  L be  the  length  of  C.  Since  the  given  series  converges  uniformly,  for  every  given 
e > 0 we  can  find  a number  N such  that  Rn(z)  < e/L  for  all  n > N and  all  z in  G.  By 
applying  the  ;V/ /--inequality  (Sec.  14.1)  we  thus  obtain 


Rn(z)  dz 


< — L = e 
L 


Since  Rn  = F — sn,  this  means  that 


for  all  n > N. 


F{z ) dz  ~ 
Jc 


Sn(z)  dz 
c 


< e 


for  all  n > N. 


Hence,  the  series  (4)  converges  and  has  the  sum  indicated  in  the  theorem. 


Theorems  2 and  3 characterize  the  two  most  important  properties  of  uniformly  convergent 
series.  Also,  since  differentiation  and  integration  are  inverse  processes,  Theorem  3 implies 


THEOREM  4 


Termwise  Differentiation 

Let  the  series  /o(z)  + fi(z)  + fz(z)  + • ■ • be  convergent  in  a region  G and  let  F(z ) 
be  its  sum.  Suppose  that  the  series  f^iz)  + fi(z)  + fz(z)  + ■ • • converges  uniformly 
in  G and  its  terms  are  continuous  in  G.  Then 

F'(z)  = fo  (z)  + A' (z)  + fz  (z)  + • ■ • for  all  z in  G. 


Test  for  Uniform  Convergence 

Uniform  convergence  is  usually  proved  by  the  following  comparison  test. 


THEOREM  5 


Weierstrass5  M-Test  for  Uniform  Convergence 

Consider  a series  of  the  form  (1)  in  a region  G of  the  z-plane.  Suppose  that  one  can 
find  a convergent  series  of  constant  terms, 

(5)  Mq  + Mi  + M2  + • * * , 

such  that  |/m(z)|  = Mmfor  all  z in  G and  every  m = 0,  1,  • • • . Then  (1)  is  uniformly 
convergent  in  G. 


The  simple  proof  is  left  to  the  student  (Team  Project  18). 


5KARL  WEIERSTRASS  (1815-1897),  great  German  mathematician,  who  developed  complex  analysis  based 
on  the  concept  of  power  series  and  residue  integration.  (See  footnote  in  Section  13.4.)  He  put  analysis  on  a 
sound  theoretical  footing.  His  mathematical  rigor  is  so  legendary  that  one  speaks  Weierstrassian  rigor.  (See 
paper  by  Birkhoff  and  Kreyszig,  1984  in  footnote  in  Sec.  5.5;  Kreyszig,  E.,  On  the  Calculus,  of  Variations  and 
Its  Major  Influences  on  the  Mathematics  of  the  First  Half  of  Our  Century.  Part  II,  American  Mathematical 
Monthly  (1994),  101,  No.  9,  pp.  902-908).  Weierstrass  also  made  contributions  to  the  calculus  of  variations, 
approximation  theory,  and  differential  geometry.  He  obtained  the  concept  of  uniform  convergence  in  1841 
(published  1894,  sic!)’,  the  first  publication  on  the  concept  was  by  G.  G.  STOKES  (see  Sec  10.9)  in  1847. 
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EXAMPLE  4 


Weierstrass  M-Test 

Does  the  following  series  converge  uniformly  in  the  disk  |z|  = 1? 

^ zm  + 1 

m=1  m2  + cosh  m|z| 

Solution.  Uniform  convergence  follows  by  the  Weierstrass  M- test  and  the  convergence  of  2 1/m2  (see 
Sec.  15.1,  in  the  proof  of  Theorem  8)  because 


Zm  + 1 

m2  + cosh  m|z| 


lz]m  + 1 

2 


No  Relation  Between  Absolute 
and  Uniform  Convergence 

We  finally  show  the  surprising  fact  that  there  are  series  that  converge  absolutely  but  not 
uniformly,  and  others  that  converge  uniformly  but  not  absolutely,  so  that  there  is  no  relation 
between  the  two  concepts. 


E X A M P L No  Relation  Between  Absolute  and  Uniform  Convergence 

The  series  in  Example  2 converges  absolutely  but  not  uniformly,  as  we  have  shown.  On  the  other  hand,  the  series 


* (-i)m_i  i i i 

2 _ ^ 

m_i  x2  + in  x2  + 1 x2  + 2 x2  + 3 


(x  real) 


converges  uniformly  on  the  whole  real  line  but  not  absolutely. 

Proof.  By  the  familiar  Leibniz  test  of  calculus  (see  App.  A3. 3)  the  remainder  Rn  does  not  exceed  its  first 
term  in  absolute  value,  since  we  have  a series  of  alternating  terms  whose  absolute  values  form  a monotone 
decreasing  sequence  with  limit  zero.  Hence  given  e > 0,  for  all  x we  have 


Km  I ^ 


i 

x2  + n + 1 


1 

< < e 

n 


if  n > N(e) 


1 

e 


This  proves  uniform  convergence,  since  N(e)  does  not  depend  on  x. 
The  convergence  is  not  absolute  because  for  any  fixed  x we  have 


(-if1-1 

2 i 

x + m 


1 


x2  + m 
k 

> — 

m 


where  k is  a suitable  constant,  and  kH\/m  diverges. 


PRFDB1TE^M=S-ET-1^^ 


1.  CAS  EXPERIMENT.  Graphs  of  Partial  Sums,  (a) 
Fig.  368.  Produce  this  exciting  figure  using  your  CAS. 
Add  further  curves,  say,  those  of  ^256.  ii024.  etc.  on  the 
same  screen. 


(b)  Power  series.  Study  the  nonuniformity  of  con- 
vergence experimentally  by  graphing  partial  sums  near 
the  endpoints  of  the  convergence  interval  for  real 

z = x. 
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2-9  POWER  SERIES 

Where  does  the  power  series  converge  uniformly?  Give 
reason. 


2-  2 


n + 2 
In  3 


3-  2 ’ (*  + o2B 

n= 0 J 

^ 3™(1  - i)n 

4-  2j ; (z  ~ 0 


5-  2 [2  )(4z  + 20" 


6.  ^ 2"(tanh  n2)z2 
n= 0 

’•  i4(W< 


*■ 

n=l  v ' 

00 

9-  2^rrfe-20" 

n=i  2 n 


10-17 


UNIFORM  CONVERGENCE 

Prove  that  the  series  converges  uniformly  in  the  indicated 
region. 

oo  ,2  n 

10.  2 Ul  S io20 

n= 0 


11.  2 ~o’  Izl  S 1 


12-  2 


=1  n cosh  n\z\ 

” sin™  |z| 

13.  2 • allz 


Izl  S 1 


14-  2 2 S |z|  S 10 

00  c n\f 

15-  2 Id  £ 3 

w= 0 v 7 
” tanh™  |z| 

16.  V , all  z 

^ n(n  + 1) 


17.  2 Izl  = 0.56 


18.  TEAM  PROJECT.  Uniform  Convergence, 
(a)  Weierstrass  M-test.  Give  a proof. 


(b)  Termwise  differentiation.  Derive  Theorem  4 
from  Theorem  3. 

(c)  Subregions.  Prove  that  uniform  convergence  of  a 
series  in  a region  G implies  uniform  convergence  in 
any  portion  of  G.  Is  the  converse  true? 

(d)  Example  2.  Find  the  precise  region  of  convergence 
of  the  series  in  Example  2 with  x replaced  by  a complex 
variable  z. 

(e)  Figure  369.  Show  that*2  Xm= l (1  + x2)~m  = 1 
if  x ~t~  0 and  0 if  x = 0.  Verify  by  computation  that  the 
partial  sums  .yj,  .s2,  S3  look  as  shown  in  Fig.  369. 


Fig.  369.  Sum  s and  partial 
sums  in  Team  Project  18(e) 


19-20 


HEAT  EQUATION 


Show  that  (9)  in  Sec.  12.6  with  coefficients  (10)  is  a solution 
of  the  heat  equation  for  t > 0,  assuming  that  f(x)  is 
continuous  on  the  interval  OSiSf  and  has  one-sided 
derivatives  at  all  interior  points  of  that  interval.  Proceed  as 
follows. 


19.  Show  that  |fire|  is  bounded,  say  |fi„|  < K for  all  n. 
Conclude  that 

\un\  < Ke~x*to  if  (Sto>0 


and,  by  the  Weierstrass  test,  the  series  (9)  converges 
uniformly  with  respect  to  x and  t for  t S t0,  0 S x S L. 
Using  Theorem  2,  show  that  u (x,  t)  is  continuous  for 
t S f0  and  thus  satisfies  the  boundary  conditions  (2) 
for  t S t0. 

20.  Show  that  \dun/dt\  < A2A_e-A"t°  if  t = tg  and  the 
series  of  the  expressions  on  the  right  converges,  by 
the  ratio  test.  Conclude  from  this,  the  Weierstrass 
test,  and  Theorem  4 that  the  series  (9)  can  be 
differentiated  term  by  term  with  respect  to  t and  the 
resulting  series  has  the  sum  du/dt.  Show  that  (9)  can 
be  differentiated  twice  with  respect  to  x and  the 
resulting  series  has  the  sum  d2u/dx2.  Conclude  from 
this  and  the  result  to  Prob.  19  that  (9)  is  a solution 
of  the  heat  equation  for  all  t £ t0.  (The  proof  that  (9) 
satisfies  the  given  initial  condition  can  be  found  in 
Ref.  [CIO]  listed  in  App.  1.) 
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S T I O N S AND  PROBLEMS 


1.  What  is  convergence  test  for  series?  State  two  tests  from 
memory.  Give  examples. 

2.  What  is  a power  series?  Why  are  these  series  very 
important  in  complex  analysis? 

3.  What  is  absolute  convergence?  Conditional  convergence? 
Uniform  convergence? 

4.  What  do  you  know  about  convergence  of  power  series? 

5.  What  is  a Taylor  series?  Give  some  basic  examples. 

6.  What  do  you  know  about  adding  and  multiplying  power 
series? 

7.  Does  every  function  have  a Taylor  series  development? 
Explain. 

8.  Can  properties  of  functions  be  discovered  from 
Maclaurin  series?  Give  examples. 

9.  What  do  you  know  about  termwise  integration  of 
series? 

10.  How  did  we  obtain  Taylor’s  formula  from  Cauchy’s 
formula? 


11-15 


RADIUS  OF  CONVERGENCE 


Find  the  radius  of  convergence. 

ii.  2 * + 1 & + D" 

n2  + 1 


12-  2 


n — 1 


(Z  - 7 Ti)n 


°°  n(n  — 1) 

13-  2 & - 0" 

n= 2 J 


14.  2^fe-3/)2 


15.  2 


(-2)"' 
2 n 
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RADIUS  OF  CONVERGENCE 


Find  the  radius  of  convergence.  Try  to  identify  the  sum  of 
the  series  as  a familiar  function. 


-5-1  * 

i6.  y 

n 


17-  2 V 

~ n\ 


is.  y 


(-Dn 

(2  n + 1)! 


(ttz)2 


i9.  y 


(2/0! 


20.  y 


(3  + 40” 
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MACLAURIN  SERIES 


Find  the  Maclaurin  series  and  its  radius  of  convergence. 
Show  details. 

21.  (sinhz2)/z2  22.  1/(1  - z)3 

23.  cos2z  24.  1/(7 Tz  + 1) 

25.  — (exp/(— z2)  - 1 )/z2 
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TAYLOR  SERIES 


Find  the  Taylor  series  with  the  given  point  as  center  and  its 
radius  of  convergence. 

26.  z4,  i 

27.  cos  z,  g 7 r 

28.  1/z,  2 i 

29.  Lnz,  3 

30.  ez,  TTi 


SUMMARY  OF  CHAPTER  1 5 

Power  Series,  Taylor  Series 


Sequences,  series,  and  convergence  tests  are  discussed  in  Sec.  15.1.  A power  series 
is  of  the  form  (Sec.  15.2) 

oo 

(1)  2 an(z  - Zo)n  = «o  + ax(z  - z0)  + Ci2(z  ~ z0f  + • ■ • ; 

n= 0 

"o  is  its  center.  The  series  (1)  converges  for  z — z0l  < R and  diverges  for 
\z  ~ z0l  > R,  where  R is  the  radius  of  convergence.  Some  power  series  converge 


Summary  of  Chapter  15 
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for  all  z (then  we  write  R = °°).  In  exceptional  cases  a power  series  may  converge 
only  at  the  center;  such  a series  is  practically  useless.  Also,  R = lim  |a„/a„+1| 
if  this  limit  exists.  The  series  (1)  converges  absolutely  (Sec.  15.2)  and  uniformly 
(Sec.  15.5)  in  every  closed  disk  |z  — z0|  = r < R(R  > 0).  It  represents  an  analytic 
function  /(z)  for  z — z 0 < R-  The  derivatives  f(z),f  (z),  ■ ■ ■ are  obtained  by 
termwise  differentiation  of  (1),  and  these  series  have  the  same  radius  of  convergence 
R as  (1).  See  Sec.  15.3. 

Conversely,  every  analytic  function/(z)  can  be  represented  by  power  series.  These 
Taylor  series  of /(z)  are  of  the  form  (Sec.  15.4) 

(2)  f(z)  = 2 ^ fnXz0)(z  - zo)n  (lz  - zo)l  < R\ 

n\ 

n= 0 

as  in  calculus.  They  converge  for  all  z in  the  open  disk  with  center  zo  and  radius 
generally  equal  to  the  distance  from  zo  to  the  nearest  singularity  of  /(z)  (point  at 
which /(z)  ceases  to  be  analytic  as  defined  in  Sec.  15.4).  If/(z)  is  entire  (analytic 
for  all  z;  see  Sec.  13.5),  then  (2)  converges  for  all  z-  The  functions  ez,  cos  z,  sin  z, 
etc.  have  Maclaurin  series,  that  is,  Taylor  series  with  center  0,  similar  to  those  in 
calculus  (Sec.  15.4). 


CHAPTER 


Laurent  Series. 
Residue  Integration 


The  main  purpose  of  this  chapter  is  to  learn  about  another  powerful  method  for  evaluating 
complex  integrals  and  certain  real  integrals.  It  is  called  residue  integration.  Recall  that 
the  first  method  of  evaluating  complex  integrals  consisted  of  directly  applying  Cauchy’s 
integral  formula  of  Sec.  14.3.  Then  we  learned  about  Taylor  series  (Chap.  15)  and  will 
now  generalize  Taylor  series.  The  beauty  of  residue  integration,  the  second  method  of 
integration,  is  that  it  brings  together  a lot  of  the  previous  material. 

Laurent  series  generalize  Taylor  series.  Indeed,  whereas  a Taylor  series  has  positive 
integer  powers  (and  a constant  term)  and  converges  in  a disk,  a Laurent  series  (Sec.  16.1) 
is  a series  of  positive  and  negative  integer  powers  of  z ~ Zo  and  converges  in  an  annulus 
(a  circular  ring)  with  center  zo-  Hence,  by  a Laurent  series,  we  can  represent  a given 
function  f(z)  that  is  analytic  in  an  annulus  and  may  have  singularities  outside  the  ring  as 
well  as  in  the  “hole”  of  the  annulus. 

We  know  that  for  a given  function  the  Taylor  series  with  a given  center  z0  is  unique. 
We  shall  see  that,  in  contrast,  a function /(z)  can  have  several  Laurent  series  with  the 
same  center  z.q  and  valid  in  several  concentric  annuli.  The  most  important  of  these  series 
is  the  one  that  converges  for  0<  |z  — zol  < R,  that  is,  everywhere  near  the  center  zo 
except  at  zo  itself,  where  zo  is  a singular  point  of  /(z).  The  series  (or  finite  sum)  of  the 
negative  powers  of  this  Laurent  series  is  called  the  principal  part  of  the  singularity  of 
/(z)  at  zo.  and  is  used  to  classify  this  singularity  (Sec.  16.2).  The  coefficient  of  the  power 
l/(z  — Zo)  °f  this  series  is  called  the  residue  of/(z)  at  zo-  Residues  are  used  in  an  elegant 
and  powerful  integration  method,  called  residue  integration,  for  complex  contour  integrals 
(Sec.  16.3)  as  well  as  for  certain  complicated  real  integrals  (Sec.  16.4). 

Prerequisite:  Chaps.  13,  14,  Sec.  15.2. 

Sections  that  may  be  omitted  in  a shorter  course:  16.2,  16.4. 

References  and  Answers  to  Problems:  App.  1 Part  D,  App.  2. 


16.1  Laurent  Series 


Laurent  series  generalize  Taylor  series.  If,  in  an  application,  we  want  to  develop  a function 
/(z)  in  powers  of  z — zo  when/(z)  is  singular  at  zo  (as  defined  in  Sec.  15.4),  we  cannot 
use  a Taylor  series.  Instead  we  can  use  a new  kind  of  series,  called  Laurent  series,1 


1PIERRE  ALPHONSE  LAURENT  (1813-1854),  French  military  engineer  and  mathematician,  published  the 
theorem  in  1843. 
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consisting  of  positive  integer  powers  of  z ~ Zo  (and  a constant)  as  well  as  negative  integer 
powers  of  z ~ Zol  this  is  the  new  feature. 

Laurent  series  are  also  used  for  classifying  singularities  (Sec.  16.2)  and  in  a powerful 
integration  method  (“residue  integration,”  Sec.  16.3). 

A Laurent  series  of  f(z)  converges  in  an  annulus  (in  the  “hole”  of  which /(z)  may  have 
singularities),  as  follows. 


THEOREM  1 


Laurent’s  Theorem 

Let  f(z)  be  analytic  in  a domain  containing  two  concentric  circles  C i and  C2  with 
center  z o and  the  annulus  between  them  (blue  in  Fig.  370).  Then  f(z)  can  be 
represented  by  the  Laurent  series 


(1) 


/(z)  = 2 an(z  ~ Z0)n  + 2 _ n 

n= 0 n= 1 ^ 

= <70  + «i(z  - zo)  + Ci2(z  - Zo)2  + ■■■ 
b\  b2 

• • ■ H h « ' * ’ 

z - Zo  (z  - Zo) 


consisting  of  nonnegative  and  negative  powers.  The  coefficients  of  this  Laurent  series 
are  given  by  the  integrals 


(2)  an 


1 

277/ 


f /(Z*) 
Jc  ( 7 * - Z0f 


, dz:\ 


bn  = — t (Z*  - Zo) 

277 ; 


n—  1 


f(z*)  dz*. 


taken  counterclockwise  around  any  simple  closed  path  C that  lies  in  the  annulus 
and  encircles  the  inner  circle,  as  in  Fig.  370.  [The  variable  of  integration  is  denoted 
by  z*  since  z is  used  in  (1).] 

This  series  converges  and  represents  f(z)  in  the  enlarged  open  annulus  obtained 
from  the  given  annulus  by  continuously  increasing  the  outer  circle  C i and  decreasing 
C-2  until  each  of  the  two  circles  reaches  a point  where  f(z)  is  singular. 

In  the  important  special  case  that  zo  is  the  only  singular  point  off(z)  inside  C2, 
this  circle  can  be  shrunk  to  the  point  z o,  giving  convergence  in  a disk  except  at  the 
center.  In  this  case  the  series  (or  finite  sum)  of  the  negative  powers  of  ( 1)  is  called 
the  principal  part  of  f(z)  at  zo  [or  of  that  Laurent  series  (1)]. 


Fig.  370.  Laurent’s  theorem 
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PROOF 


COMMENT  Obviously,  instead  of  (1),  (2)  we  may  write  (denoting  bn  by  «_„) 


d') 


co 

f(z)  = 2 - Zo)" 

n=  — co 


where  all  the  coefficients  are  now  given  by  a single  integral  formula,  namely, 


(2') 


an 


1 

277( 


/(z*) 


Jc 


(z* 


Zo ) 


n+1 


dz* 


(, n = 0,  ±1,  ±2,  • • • ). 


Let  us  now  prove  Laurent’s  theorem. 


(a)  The  nonnegative  powers  are  those  of  a Taylor  series. 

To  see  this,  we  use  Cauchy’s  integral  formula  (3)  in  Sec.  14.3  with  z*  (instead  of  z)  as 
the  variable  of  integration  and  z instead  of  zo-  Let  g(z)  and  h(z)  denote  the  functions 
represented  by  the  two  terms  in  (3),  Sec.  14.3.  Then 


(3) 


f{z)  = g(z)  + Kz)  = zr~.  <> 


/(z*) 


2vi  J z*  ~ z 
^1 


dz* 


1 

27 n 


f(z*) 


dz*. 


Here  z is  any  point  in  the  given  annulus  and  we  integrate  counterclockwise  over  both  C i 
and  C2,  so  that  the  minus  sign  appears  since  in  (3)  of  Sec.  14.3  the  integration  over  C2 
is  taken  clockwise.  We  transform  each  of  these  two  integrals  as  in  Sec.  15.4.  The  first 
integral  is  precisely  as  in  Sec.  15.4.  Hence  we  get  exactly  the  same  result,  namely,  the 
Taylor  series  of  g(z). 


(4) 


g(z) 


1 

27 Ti 


I /fe*) 

d>  


dz*  = 2 an(z  - Zo)n 

n= 0 


with  coefficients  [see  (2),  Sec.  15.4,  counterclockwise  integration] 


(5) 


1 

2rri  . 


f(z*) 


Cl 


(z*  - 'n+1 


zof 


dz*. 


Here  we  can  replace  C\  by  C (see  Fig.  370),  by  the  principle  of  deformation  of  path,  since 
Zo,  the  point  where  the  integrand  in  (5)  is  not  analytic,  is  not  a point  of  the  annulus.  This 
proves  the  formula  for  the  an  in  (2). 

(b)  The  negative  powers  in  (1)  and  the  formula  for  bn  in  (2)  are  obtained  if  we  consider 
/i(z).  It  consists  of  the  second  integral  times  — l/(277t)  in  (3).  Since  z lies  in  the  annulus, 
it  lies  in  the  exterior  of  the  path  C2.  Hence  the  situation  differs  from  that  for  the  first 
integral.  The  essential  point  is  that  instead  of  [see  (7*)  in  Sec.  15.4] 


(6)  (a) 


we  now  have 


(b) 


< 1. 


Consequently,  we  must  develop  the  expression  l/(z*  — z)  in  the  integrand  of  the  second 
integral  in  (3)  in  powers  of  (z*  — Zo)/(z  — z 0)  (instead  of  the  reciprocal  of  this)  to  get  a 
convergent  series.  We  find 
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1 _ 1 

Z*  - Z Z*  - Zo  ~ (z  - Zo) 


(z  - Zo) 


Compare  this  for  a moment  with  (7)  in  Sec.  15.4,  to  really  understand  the  difference.  Then 
go  on  and  apply  formula  (8),  Sec.  15.4,  for  a finite  geometric  sum,  obtaining 


Multiplication  by  —f(z*)/27Ti  and  integration  over  C2  on  both  sides  now  yield 


1 /(z*) 

h(z ) = — - — : ( > — dz* 


277/  J Z*  — Z 


— (—■I 

277 / (Z  Zq 


<>  f(z*)dz*  + 


1 


(z  - Zo)  J 


<>  (z*  - zo)f(z*)dz*  + 


1 


(z  - Zof 


C2 
n—  Is 


» (z*  - zo)n-Lf(z*)dz* 

JC2 

1 0 (z*  - Zo  )nf(z*)  + R'Uz) 

(z  - Zo)  Jc,  ) 


+ 


with  the  last  term  on  the  right  given  by 


(7) 


Rn(z) 


1 

277/(z  - z0)n+1 


/ * \7l+ 1 

O — f(z*)  dz*. 

c2  z - z* 


As  before,  we  can  integrate  over  C instead  of  C2  in  the  integrals  on  the  right.  We  see  that 
on  the  right,  the  power  l/(z  — z0)m  is  multiplied  by  bn  as  given  in  (2).  This  establishes 
Laurent’s  theorem,  provided 

(8)  lim  R*(z)  = 0. 

n— 

(c)  Convergence  proof  of  (8).  Very  often  (1 ) will  have  only  finitely  many  negative  powers. 
Then  there  is  nothing  to  be  proved.  Otherwise,  we  begin  by  noting  that /(z*)/(z  — z*)  in  (7) 
is  bounded  in  absolute  value,  say, 


/(z*) 

z - z* 


< M 


for  all  z*  on  C2 


because  /(z*)  is  analytic  in  the  annulus  and  on  C2,  and  z*  lies  on  C2  and  z outside,  so 
that  z — z*  A 0.  From  this  and  the  ML-inequality  (Sec.  14.1)  applied  to  (7)  we  get  the 
inequality  ( L = 2tt r2  = length  of  C2,  r2  = |z*  — ZqI  = radius  of  C2  = const) 
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EXAMPLE  1 


EXAMPLE  2 


K(z)l 


2tt\z  ~ z0 


n+1 


r%+1ML 


ML  ( r2  \n+1 
277  V \z  ~ Zol  ' 


From  (6b)  we  see  that  the  expression  on  the  right  approaches  zero  as  n approaches  infinity. 
This  proves  (8).  The  representation  (1)  with  coefficients  (2)  is  now  established  in  the  given 
annulus. 

(d)  Convergence  of  (1)  in  the  enlarged  annulus.  The  first  series  in  (1)  is  a Taylor 
series  [representing  g(z)];  hence  it  converges  in  the  disk  I)  with  center  z0  whose  radius 
equals  the  distance  of  the  singularity  (or  singularities)  closest  to  zo ■ Also,  g(z)  must  be 
singular  at  all  points  outside  C i where  f(z)  is  singular. 

The  second  series  in  (1),  representing  h(z),  is  a power  series  in  Z = I /(z  — Zo)-  Let  the 
given  annulus  be  r2  < z — Zo  < where  r\  and  r2  are  the  radii  of  C\  and  C2,  respectively 
(Fig.  370).  This  corresponds  to  l/r2  > |z|  > 1 j r\ . Flence  this  power  series  in  Z must 
converge  at  least  in  the  disk  |z|  < l/r2.  This  corresponds  to  the  exterior  |z  — Zol  > r2  of 
C2,  so  that  h(z)  is  analytic  for  all  z outside  C2.  Also,  h(z)  must  be  singular  inside  C2 
where /(z)  is  singular,  and  the  series  of  the  negative  powers  of  (1)  converges  for  all  z 
in  the  exterior  E of  the  circle  with  center  zo  and  radius  equal  to  the  maximum  distance 
from  z0  to  the  singularities  of  /(z)  inside  C2.  The  domain  common  to  D and  E is  the 
enlarged  open  annulus  characterized  near  the  end  of  Laurent’s  theorem,  whose  proof 
is  now  complete. 

Uniqueness.  The  Laurent  series  of  a given  analytic  function  f(z)  in  its  annulus  of 
convergence  is  unique  (see  Team  Project  18).  However,  /(z)  may  have  different  Laurent 
series  in  two  annuli  with  the  same  center ; see  the  examples  below.  The  uniqueness  is 
essential.  As  for  a Taylor  series,  to  obtain  the  coefficients  of  Laurent  series,  we  do  not 
generally  use  the  integral  formulas  (2);  instead,  we  use  various  other  methods,  some  of 
which  we  shall  illustrate  in  our  examples.  If  a Laurent  series  has  been  found  by  any  such 
process,  the  uniqueness  guarantees  that  it  must  be  the  Laurent  series  of  the  given  function 
in  the  given  annulus. 


Use  of  Maclaurin  Series 

Find  the  Laurent  series  of  z~5  sin  z with  center  0. 
Solution.  By  (14),  Sec.  15  .4,  we  obtain 


z 5 sin  z = 


i 

n= 0 


(-D"  4 

{In  + 1)!  " 


111  1 . 

~A O “I Z 

z 6z  120  5040 


(Id  > 0). 


Here  the  “annulus”  of  convergence  is  the  whole  complex  plane  without  the  origin  and  the  principal  part  of  the 
series  at  0 is  z~  ~ b Z~  • 


Substitution 

Find  the  Laurent  series  of  with  center  0. 

Solution.  From  (12)  in  Sec.  15.4  with  z replaced  by  l/z  we  obtain  a Laurent  series  whose  principal  part  is 
an  infinite  series, 


1 1 1 

— I 1 -= 

2 3!z  4!z1 2 


(Id  > 0).  ■ 
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EXAMPLE  3 


EXAMPLE  4 


EXAMPLE  5 


Development  of  1/(1  — z) 

Develop  1/(1  — z)  (a)  in  nonnegative  powers  of  z,  (b)  in  negative  powers  of  z. 

Solution. 

(valid  if  |z|  < 1). 

(valid  if  |z|  > 1).  ■ 


(a) 

(b) 


1 - z 


-2i" 


1 — Z z(l  — z ) 


-I  “I 

tt=-2  ir  = 


Laurent  Expansions  in  Different  Concentric  Annuli 

Find  all  Laurent  series  of  l/(z3  — z4)  with  center  0. 
Solution.  Multiplying  by  1/z3,  we  get  from  Example  3 


(I) 

(II) 


iz"-3 

71=0 


1 

- + 1 + z + 
z 


-i 

71=0 


1 

71  + 4 

Z 


(0  < Izl  < 1), 

(Id  > i).  ■ 


Use  of  Partial  Fractions 

—2  z + 3 

Find  all  Taylor  and  Laurent  series  of/(z)  = — with  center  0. 

z2  — 3z  + 2 


Solution.  In  terms  of  partial  fractions. 


/(z)  = - 


1 


1 


z - 1 z - 2 

(a)  and  (b)  in  Example  3 take  care  of  the  first  fraction.  For  the  second  fraction. 


(c) 


(d) 


(I)  From  (a)  and  (c),  valid  for  |zl  < 1 (see  Fig.  371), 


/(z) 


i 

71=0 


(Izl  < 2), 


(Izl  >2). 


Fig.  371.  Regions  of  convergence  in  Example  5 
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(II)  From  (c)  and  (b),  valid  for  1 < |z|  < 2, 


/(z) 


71=0  4 


1 

- + 
2 


(III)  From  (d)  and  (b),  valid  for  |z|  >2, 


1 1 


z 


" „ 1 2 3 5 9 

m — — 2 (2  +1)  ^n+1  ■■■■ 

n=0  *•  Z Z Z Z 

rf  /Tz)  in  Laurent’s  theorem  is  analytic  inside  C2,  the  coefficients  bn  in  (2)  are  zero  by 
Cauchy’s  integral  theorem,  so  that  the  Laurent  series  reduces  to  a Taylor  series.  Examples 
3(a)  and  5(1)  illustrate  this. 
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LAURENT  SERIES  NEAR  A SINGULARITY 
AT  0 


Expand  the  function  in  a Laurent  series  that  converges  for 
0 < |z|  < R and  determine  the  precise  region  of  conver- 
gence. Show  the  details  of  your  work. 

cosz  exp  (— 1/z2) 


3. 


exp  z 


4. 


sin  7 tz 


5. 


6. 


sinh  2z 
2 


a 1 
7.  z cosh  — 


17.  CAS  PROJECT.  Partial  Fractions.  Write  a program 
for  obtaining  Laurent  series  by  the  use  of  partial 
fractions.  Using  the  program,  verify  the  calculations  in 
Example  5 of  the  text.  Apply  the  program  to  two  other 
functions  of  your  choice. 

18.  TEAM  PROJECT.  Laurent  Series,  (a)  Uniqueness. 

Prove  that  the  Laurent  expansion  of  a given  analytic 
function  in  a given  annulus  is  unique. 

(b)  Accumulation  of  singularities.  Does  tan  (1/z) 
have  a Laurent  series  that  converges  in  a region 
0 < |z|  < R?  (Give  a reason.) 


(c)  Integrals.  Expand  the  following  functions  in  a 
Laurent  series  that  converges  for  |z|  >0: 
t 


1 


e1  - 1 


- dt. 


1 


sin  t 


-dt. 


9-16  LAURENT  SERIES  NEAR  A SINGULARITY 
AT  z0 

Find  the  Laurent  series  that  converges  for  0 < |z  - zol  < R 
and  determine  the  precise  region  of  convergence.  Show  details. 


9. 

e 

Zo  = 1 

10. 

(z-  l)2’ 

2 

11. 

Z 

ZO  = 77' 

12. 

(z  - 7 n')4 

13. 

1 

14. 

z (z  ~ 1) 

co  i 

15. 

COS  z 

O 

II 

3 

(z  - Ttf  ’ 

16. 

sin  z 

O 

ll 

=1 

/_  1 \3 

z2  ~ 3i 
(z-  3)2' 

1 

z2(z  - 0 
ea* 


Zo  = 3 


Zo  = 1 


z — b’ 


zo  = b 


19-25  TAYLOR  AND  LAURENT  SERIES 

Find  all  Taylor  and  Laurent  series  with  center  zo-  Determine 
the  precise  regions  of  convergence.  Show  details. 

19.  — z0  = 0 20.  i z0  = 1 

1 - z2  z 


sin  z , 

21.  , , ZO  = -gTT 

z + 

22.  zo  = i 

z 

sinh  z 

24.  7,  Zo  = 1 


23. 


1 - z* 


Zo  = 0 


(z  - 4«' 


25. 


(z  - If 
z3  ~ 2 iz2 
(z  ~ if  ’ 


Zo  = l 
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16.2 

Roughly,  a singular  point  of  an  analytic  function  /(z)  is  a zo  at  which  /(z)  ceases  to  be 
analytic,  and  a zero  is  a z at  which  /(z)  = 0.  Precise  definitions  follow  below.  In  this 
section  we  show  that  Laurent  series  can  be  used  for  classifying  singularities  and  Taylor 
series  for  discussing  zeros. 

Singularities  were  defined  in  Sec.  15.4,  as  we  shall  now  recall  and  extend.  We  also 
remember  that,  by  definition,  a function  is  a single-valued  relation,  as  was  emphasized 
in  Sec.  13.3. 

We  say  that  a function /(z)  is  singular  or  has  a singularity  at  a point  z = zo  if/(z)  is  not 
analytic  (perhaps  not  even  defined)  at  z = zo.  but  every  neighborhood  of  z = zo  contains 
points  at  which /(z)  is  analytic.  We  also  say  that  z = Zo  is  a singular  point  of /(z). 

We  call  z = Zo  an  isolated  singularity  of  /(z)  if  z = Zo  has  a neighborhood  without 
further  singularities  of /(z).  Example:  tan  z has  isolated  singularities  at  ±77/2,  ±37t/2,  etc.; 
tan  (1/z)  has  a nonisolated  singularity  at  0.  (Explain!) 

Isolated  singularities  of  /(z)  at  z = Zo  can  be  classified  by  the  Laurent  series 

OC  00  jy 

(1)  /(z)  = 2 an (z  - Zo)n  + 2 — w (Sec.  16.1) 

n= 0 n=l  (Z 

valid  in  the  immediate  neighborhood  of  the  singular  point  z = zo,  except  at  zo  itself,  that 
is,  in  a region  of  the  form 


Singularities  and  Zeros.  Infinity 


0 < |z  - Zol  < R- 

The  sum  of  the  first  series  is  analytic  at  z = z0,  as  we  know  from  the  last  section.  The 
second  series,  containing  the  negative  powers,  is  called  the  principal  part  of  (1),  as  we 
remember  from  the  last  section.  If  it  has  only  finitely  many  terms,  it  is  of  the  form 


(2) 


b\  bm 

Z - Zo  (z  - Zo) 


( bm  + 0). 


Then  the  singularity  of /(z)  at  z = z0  is  called  a pole,  and  m is  called  its  order.  Poles  of 
the  first  order  are  also  known  as  simple  poles. 

If  the  principal  part  of  (1)  has  infinitely  many  terms,  we  say  that/(z)  has  at  z = z0  an 
isolated  essential  singularity. 

We  leave  aside  nonisolated  singularities. 

Poles.  Essential  Singularities 

The  function 


m = 


z(z  - 2 )5  (z  - 2? 


has  a simple  pole  at  z — 0 and  a pole  of  fifth  order  at  z — 2.  Examples  of  functions  having  an  isolated  essential 
singularity  at  z — 0 are 


“1  11 


,V*  = 
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EXAMPLE  2 


THEOREM  1 


EXAMPLE  3 


THEOREM  2 


and 


. 1 _ ^ ("If  _ 1 11 

Sin  z (2n  + l)\z2n  + 1 z 3 !z3  5 !z5 

Section  16.1  provides  further  examples.  In  that  section.  Example  1 shows  that  z~ 5 sin  z has  a fourth-order 
pole  at  0.  Furthermore,  Example  4 shows  that  1 /(z  — z4)  has  a third-order  pole  at  0 and  a Laurent  series  with 
infinitely  many  negative  powers.  This  is  no  contradiction,  since  this  series  is  valid  for  |z|  > 1;  it  merely  tells 
us  that  in  classifying  singularities  it  is  quite  important  to  consider  the  Laurent  series  valid  in  the  immediate 
neighborhood  of  a singular  point.  In  Example  4 this  is  the  series  (I),  which  has  three  negative  powers. 


The  classification  of  singularities  into  poles  and  essential  singularities  is  not  merely  a formal 
matter,  because  the  behavior  of  an  analytic  function  in  a neighborhood  of  an  essential 
singularity  is  entirely  different  from  that  in  the  neighborhood  of  a pole. 

Behavior  Near  a Pole 

/(z)  = 1/z2  has  a pole  at  z = 0,  and  |/(z)|  — * °°  as  z 0 in  any  manner.  This  illustrates  the  following 
theorem. 


Poles 

Iffiz)  is  analytic  and  has  a pole  at  z = Zo>  then  \f(z)  \ —^^asz^Zoin  any  manner. 


The  proof  is  left  as  an  exercise  (see  Prob.  24). 

Behavior  Near  an  Essential  Singularity 

The  function /(z)  = c1/z  has  an  essential  singularity  at  z = 0.  It  has  no  limit  for  approach  along  the  imaginary 
axis;  it  becomes  infinite  if  z — » 0 through  positive  real  values,  but  it  approaches  zero  if  z — » 0 through  negative  real 
values.  It  takes  on  any  given  value  c = c0eza  A 0 in  an  arbitrarily  small  e-neighborhood  of  z = 0.  To  see  the 
latter,  we  set  z = re10,  and  then  obtain  the  following  complex  equation  for  r and  0,  which  we  must  solve: 

gl/z  _ £(cos  e-i  sin  0)/r  _ £ gia 

Equating  the  absolute  values  and  the  arguments,  we  have  e<cos  = c0,  that  is 

cos  6 = rlnc0,  and  —sin  0 = ar 

2 *2  2 2 2 2 

respectively.  From  these  two  equations  and  cos  6 + sin  6 = r (In  cq)  + a r = 1 we  obtain  the  formulas 

2 1 a 

r = ^ and  tan  6 = . 

(In  c0)z  + or  In  c0 

Hence  r can  be  made  arbitrarily  small  by  adding  multiples  of  277  to  a,  leaving  c unaltered.  This  illustrates  the 
very  famous  Picard's  theorem  (with  z = 0 as  the  exceptional  value). 


Picard's  Theorem 

If  f(z)  is  analytic  and  has  an  isolated  essential  singularity  at  a point  zq,  it  takes  on 
every  value , with  at  most  one  exceptional  value , in  an  arbitrarily  small  e -neighborhood 
ofzo- 


For  the  rather  complicated  proof,  see  Ref.  [D4],  vol.  2,  p.  258.  For  historical  information 
on  Picard,  see  footnote  9 in  Problem  Set  1.7. 
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EXAMPLE  4 


THEOREM  3 


PROOF 


THEOREM  4 


Removable  Singularities.  We  say  that  a function /(z)  has  a removable  singularity  at 
z = Zo  if  f(z)  is  not  analytic  at  z = z0.  but  can  be  made  analytic  there  by  assigning  a 
suitable  value  f(zo)-  Such  singularities  are  of  no  interest  since  they  can  be  removed  as 
just  indicated.  Example:  f(z)  = (sin  z)/z  becomes  analytic  at  z = 0 if  we  define /(0)  = 1. 

Zeros  of  Analytic  Functions 

A zero  of  an  analytic  function /(z)  in  a domain  I)  is  a 7 = 7()  in  I)  such  that  f(zo)  = 0. 
A zero  has  order  n if  not  only  /but  also  the  derivatives  1 1 are  all  0 at  z = zo 

but/<>l)(zo)  A 0.  A first-order  zero  is  also  called  a simple  zero.  For  a second-order  zero, 
f(z 0)  =f'(z0 ) = 0 but/"(z0)  A 0.  And  so  on. 

Zeros 

The  function  1 + z2  has  simple  zeros  at  ±i.  The  function  (1  — z4)2  has  second-order  zeros  at  ±1  and  ±i.  The 
function  (z  — ay  has  a third-order  zero  at  z — a.  The  function  ez  has  no  zeros  (see  Sec.  13.5).  The  function  sin  z 
has  simple  zeros  at  0,  ±77,  ±277,  • • ■ , and  sin2  z has  second-order  zeros  at  these  points.  The  function  1 — cos  z has 
second-order  zeros  at  0,  ±277,  ±477,  • • • , and  the  function  (1  — cos  zy  has  fourth-order  zeros  at  these  points. 


Taylor  Series  at  a Zero.  At  an  rcth-order  zero  z = Zo  of  /(z),  the  derivatives  f'izo),  • • • , 
/(n-D(z0)  are  zero,  by  definition.  Hence  the  first  few  coefficients  a0,  • • • , an_i  of  the  Taylor 
series  (1),  Sec.  15.4,  are  zero,  too,  whereas  an  A 0,  so  that  this  series  takes  the  form 

(3)  f(z)  = an(z  ~ z0)n  + an+1(z  - z0)n+1  + ■■■ 

= (z  ~ z0)n  [an  + an+i(z  ~ z0)  + an+2(z  ~ Zof  + • • • ] (an  + 0). 

This  is  characteristic  of  such  a zero,  because,  if  f(z)  has  such  a Taylor  series,  it  has  an 
nth-order  zero  at  z = zo,  as  follows  by  differentiation. 

Whereas  nonisolated  singularities  may  occur,  for  zeros  we  have 


Zeros 

The  zeros  of  an  analytic  function  f(z)  (#  0)  are  isolated;  that  is,  each  of  them  has 
a neighborhood  that  contains  no  further  zeros  off(z). 


The  factor  (z  — z.of'  in  (3)  is  zero  only  at  z = zo-  The  power  series  in  the  brackets  [ • ■ • ] 
represents  an  analytic  function  (by  Theorem  5 in  Sec.  15.3),  call  it  g(z).  Now 
g(z o)  = an  T 0,  since  an  analytic  function  is  continuous,  and  because  of  this  continuity, 
also  g(z)  A 0 in  some  neighborhood  of  z = z o-  Hence  the  same  holds  of  f(z). 

This  theorem  is  illustrated  by  the  functions  in  Example  4. 

Poles  are  often  caused  by  zeros  in  the  denominator.  ( Example : tan  z has  poles  where 
cos  z is  zero.)  This  is  a major  reason  for  the  importance  of  zeros.  The  key  to  the  connection 
is  the  following  theorem,  whose  proof  follows  from  (3)  (see  Team  Project  12). 


Poles  and  Zeros 

Let  f(z)  be  analytic  at  z = Zo  and  have  a zero  of  nth  order  at  z = Zo-  Then  l//(z) 
has  a pole  of  nth  order  at  z = Zo',  and  so  does  h(z)/f{z),  provided  h{z)  is  analytic 
at  z = z o and  h(zo)  A 0. 


718 


CHAP.  16  Laurent  Series.  Residue  Integration 


EXAMPLE  5 


N 


Riemann  Sphere.  Point  at  Infinity 

When  we  want  to  study  complex  functions  for  large  | z I , the  complex  plane  will  generally 
become  rather  inconvenient.  Then  it  may  be  better  to  use  a representation  of  complex  numbers 
on  the  so-called  Riemann  sphere.  This  is  a sphere  S of  diameter  1 touching  the  complex 
"-plane  at  z = 0 (Fig.  372),  and  we  let  the  image  of  a point  P (a  number  z in  the  plane)  be 
the  intersection  P*  of  the  segment  PN  with  S,  where  N is  the  “North  Pole”  diametrically 
opposite  to  the  origin  in  the  plane.  Then  to  each  z there  corresponds  a point  on  S. 

Conversely,  each  point  on  S represents  a complex  number  z,  except  for  N,  which  does 
not  correspond  to  any  point  in  the  complex  plane.  This  suggests  that  we  introduce  an 
additional  point,  called  the  point  at  infinity  and  denoted  °°  (“infinity”)  and  let  its  image 
be  N.  The  complex  plane  together  with  °o  is  called  the  extended  complex  plane.  The 
complex  plane  is  often  called  the  finite  complex  plane,  for  distinction,  or  simply  the 
complex  plane  as  before.  The  sphere  S is  called  the  Riemann  sphere.  The  mapping  of 
the  extended  complex  plane  onto  the  sphere  is  known  as  a stereographic  projection. 
(What  is  the  image  of  the  Northern  Hemisphere?  Of  the  Western  Hemisphere?  Of  a straight 
line  through  the  origin?) 

Analytic  or  Singular  at  Infinity 

If  we  want  to  investigate  a function/(z)  for  large  | z \ , we  may  now  set  z = l/w  and  investigate 
f(z)  = f (l/w)  = g(w)  in  a neighborhood  of  w = 0.  We  define/Yz)  to  be  analytic  or  singular 
at  infinity  if  g(w)  is  analytic  or  singular,  respectively,  at  w = 0.  We  also  define 

(4)  g(0)  = lim  g(w) 

to— >0 

if  this  limit  exists. 

Furthermore,  we  say  that  f(z)  has  an  nth-order  zero  at  infinity  if /(l/w)  has  such  a zero 
at  w = 0.  Similarly  for  poles  and  essential  singularities. 

Functions  Analytic  or  Singular  at  Infinity.  Entire  and  Meromorphic  Functions 

The  function /(z)  = 1/z2  is  analytic  at  °°  since  g(w ) = /(l/w)  = w2  is  analytic  at  w = 0,  and  /(z ) has  a second- 
order  zero  at  oo.  The  function /(z)  = zd  is  singular  at  o°  and  has  a third-order  pole  there  since  the  function 
g(w)  = /(l/w)  = l/w3  has  such  a pole  at  w = 0.  The  function  ez  has  an  essential  singularity  at  °°  since  e^'w 
has  such  a singularity  at  w = 0.  Similarly,  cos  z and  sin  z have  an  essential  singularity  at 

Recall  that  an  entire  function  is  one  that  is  analytic  everywhere  in  the  (finite)  complex  plane.  Liouville’s 
theorem  (Sec.  14.4)  tells  us  that  the  only  bounded  entire  functions  are  the  constants,  hence  any  nonconstant 
entire  function  must  be  unbounded.  Hence  it  has  a singularity  at  °°,  a pole  if  it  is  a polynomial  or  an  essential 
singularity  if  it  is  not.  The  functions  just  considered  are  typical  in  this  respect. 
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An  analytic  function  whose  only  singularities  in  the  finite  plane  are  poles  is  called  a meromorphic  function. 
Examples  are  rational  functions  with  nonconstant  denominator,  tan  z,  cot  z,  sec  z,  and  esc  Z- 


In  this  section  we  used  Laurent  series  for  investigating  singularities.  In  the  next  section 
we  shall  use  these  series  for  an  elegant  integration  method. 


FRQBLFNT-S^F^ 


1-10 


ZEROS 


Determine  the  location  and  order  of  the  zeros. 


1.  sin4|z 

2.  (z4  - 81)3 

3.  (z  + 8 1 04 

4.  tan2  2z 

5.  z-2  sin2  7 Tz 

6.  cosh4  z 

7.  z4  + (1  - 8i)z2  - 8/ 

8.  (sin  z - l)3 

9.  sin  2z  cos  2z 

10.  (z2  - 8)3(exp  (z2)  - 1) 

11.  Zeros.  If  /(z)  is  analytic  and  has  a zero  of  order  n at 

z = zo>  show  that /2(z)  has  a zero  of  order  2 n at  zo- 

12.  TEAM  PROJECT.  Zeros,  (a)  Derivative.  Show  that 

if  /(z)  has  a zero  of  order  n > 1 at  z = Zo,  then  fXz) 
has  a zero  of  order  n — 1 at  zo- 


(b)  Poles  and  zeros.  Prove  Theorem  4. 

(c)  Isolated  ^-points.  Show  that  the  points  at  which 
a nonconstant  analytic  function  f(z)  has  a given  value 
k are  isolated. 


(d)  Identical  functions.  If/i(z)  and/2(z)  are  analytic 
in  a domain  D and  equal  at  a sequence  of  points  zn  in 
D that  converges  in  D,  show  that/i(z)  = /2(z)  in  D. 


13-22 


SINGULARITIES 


Determine  the  location  of  the  singularities,  including  those 
at  infinity.  For  poles  also  state  the  order.  Give  reasons. 


13. 


(z  + 2 if 


z ~ i 


z + 1 

(z  - if 


14.  ez 


(z  - if 


15.  z exp  (l/(z  — 1 — if)  16.  tan  7 Tz 


17.  cot4  z 


18.  z3  exp 


yZ  — 1 . 

19.  l/(ez  — e2z)  20.  1 / (cos  z — sin  z) 

21.  e1/<z-1)/(ez  - 1)  22.  (z  — 7r)_1  sin  z 

23.  Essential  singularity.  Discuss  e1'2  in  a similar  way  as 
e x!z  is  discussed  in  Example  3 of  the  text. 

24.  Poles.  Verify  Theorem  1 for/(z)  = z-3  — z-1.  Prove 
Theorem  1. 


25.  Riemann  sphere.  Assuming  that  we  let  the  image  of 
the  x-axis  be  the  meridians  0°  and  180°,  describe  and 
sketch  (or  graph)  the  images  of  the  following  regions 
on  the  Riemann  sphere:  (a)  |z|  > 100,  (b)  the  lower 
half-plane,  (c)  2 S |z|  S 2. 


16.  Residue  Integration  Method 

We  now  cover  a second  method  of  evaluating  complex  integrals.  Recall  that  we  solved 
complex  integrals  directly  by  Cauchy’s  integral  formula  in  Sec.  14.3.  In  Chapter  15  we 
learned  about  power  series  and  especially  Taylor  series.  We  generalized  Taylor  series  to 
Laurent  series  (Sec.  16.1)  and  investigated  singularities  and  zeroes  of  various  functions 
(Sec.  1 6.2).  Our  hard  work  has  paid  off  and  we  see  how  much  of  the  theoretical  groundwork 
comes  together  in  evaluating  complex  integrals  by  the  residue  method. 

The  purpose  of  Cauchy’s  residue  integration  method  is  the  evaluation  of  integrals 


° f(z)  dz 
Jc 

taken  around  a simple  closed  path  C.  The  idea  is  as  follows. 

If  f(z)  is  analytic  everywhere  on  C and  inside  C,  such  an  integral  is  zero  by  Cauchy’s 
integral  theorem  (Sec.  14.2),  and  we  are  done. 
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EXAMPLE  1 


EXAMPLE  2 


The  situation  changes  if  f(z)  has  a singularity  at  a point  z = Zo  inside  C but  is  otherwise 
analytic  on  C and  inside  C as  before.  Then  f(z)  has  a Laurent  series 


f(z)  = 2 an(z  ~ zo)n  + 

n= 0 


bl 

Z ~ Zo 


t>2 

H o 

(z  — Zo) 


+ • • • 


that  converges  for  all  points  near  z = Zo  (except  at  z = zo  itself),  in  some  domain  of  the 
formO  < | z — zo  I < R (sometimes  called  a deleted  neighborhood,  an  old-fashioned  term 
that  we  shall  not  use).  Now  comes  the  key  idea.  The  coefficient  b\  of  the  first  negative 
power  l/(z  — zo)  of  this  Laurent  series  is  given  by  the  integral  formula  (2)  in  Sec.  16.1 
with  n = 1,  namely. 


bi 


1 

27 Ti 


<>  /(z)  dz. 
c 


Now,  since  we  can  obtain  Laurent  series  by  various  methods,  without  using  the  integral 
formulas  for  the  coefficients  (see  the  examples  in  Sec.  16.1),  we  can  find  b\  by  one  of 
those  methods  and  then  use  the  formula  for  bi  for  evaluating  the  integral,  that  is, 


(1) 


° /(z)  dz  = 2TTibi. 
c 


Here  we  integrate  counterclockwise  around  a simple  closed  path  C that  contains  z = z o 
in  its  interior  (but  no  other  singular  points  of  f(z)  on  or  inside  C!). 

The  coefficient  b\  is  called  the  residue  of/(z)  at  z = zo  and  we  denote  it  by 

(2)  bx  = Res  /(z). 

z=z0 


Evaluation  of  an  Integral  by  Means  of  a Residue 

Integrate  the  function /(z)  = z_4sinz  counterclockwise  around  the  unit  circle  C. 
Solution.  From  (14)  in  Sec.  15.4  we  obtain  the  Laurent  series 


/(z)  = 


3!z 


5! 


z_ 

7! 


which  converges  for  |z|  > 0 (that  is,  for  all  z A 0).  This  series  shows  that/(z)  has  a pole  of  third  order  at  z = 0 
and  the  residue  N = —3!.  From  (1)  we  thus  obtain  the  answer 


1 dz  = hribi 


7 Ti 
3 


CAUTION  Use  the  Right  Laurent  Series! 

Integrate /(z)  = l/(z3  — Z4)  clockwise  around  the  circle  C:  |z|  = 2 ■ 

Solution,  z3  — z4  = Z3(l  — z)  shows  that/(z)  is  singular  at  z = 0 and  z = 1.  Now  z = 1 lies  outside  C. 
Hence  it  is  of  no  interest  here.  So  we  need  the  residue  of  /(z)  at  0.  We  find  it  from  the  Laurent  series  that 
converges  for  0 < |z|  < 1.  This  is  series  (I)  in  Example  4,  Sec.  16.1, 

1 _ 1 + 1 + 1 


(0  < |z|  < 1). 
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PROOF 


We  see  from  it  that  this  residue  is  1.  Clockwise  integration  thus  yields 

f dz 

<p  = — 277/ Res  /(z)  = —277!. 

Jc  z3  - z4  2=0 

CAUTION!  Had  we  used  the  wrong  series  (II)  in  Example  4,  Sec.  16.1, 


1 


1 1 1 


(Izl  > 1), 


we  would  have  obtained  the  wrong  answer,  0,  because  this  series  has  no  power  1/z. 


Formulas  for  Residues 

To  calculate  a residue  at  a pole,  we  need  not  produce  a whole  Laurent  series,  but,  more 
economically,  we  can  derive  formulas  for  residues  once  and  for  all. 

Simple  Poles  at  zo-  A first  formula  for  the  residue  at  a simple  pole  is 


(3)  Res  /(z ) = h = lim  ( z ~ zo)f(z). 

Z = Z0  Z *Zq 

A second  formula  for  the  residue  at  a simple  pole  is 


(4) 


p{z) 

Res  f(z)  = Res  — 
z=z0  z=z0  q\Z) 


p(z  o) 

q'(z0) ' 


(Proof  below). 


(Proof  below). 


In  (4)  we  assume  that/(z)  = p(z)/q(z)  with  p(zo)  A 0 and  q(z)  has  a simple  zero  at  zo, 
so  that/(z)  has  a simple  pole  at  zo  by  Theorem  4 in  Sec.  16.2. 

We  prove  (3).  For  a simple  pole  at  z = Zo  the  Laurent  series  (1),  Sec.  16.1,  is 

/(z)  = z _1.Q  + a0  + ax(z  ~ z0)  + a2(z  ~ Z0f  + ’ ' • (0  < |z  - z0|  < R). 

Here  hx  A 0.  (Why?)  Multiplying  both  sides  by  z ~ Z0  and  then  letting  z — » z o,  we  obtain 
the  formula  (3): 

lim  (z  - z0)/(z)  = h + lim  (z  - z0)[a0  + a,(z  - z0)  + • ■ • ] = bx 

z^z0  Z->Z0 

where  the  last  equality  follows  from  continuity  (Theorem  1,  Sec.  15.3). 

We  prove  (4).  The  Taylor  series  of  q(z)  at  a simple  zero  Zo  is 

(z  — Zo)2  „ 

q(z)  = (z  ~ z0)q  (zo)  + — — q (zo)  + • • • ■ 


Substituting  this  into  / = p/q  and  then  / into  (3)  gives 


Res  /(z) 

z=z0 


lim  (z  - zo) 

Z *Zq 


P(z) 

q(z ) 


(z  - Zo)p(z) 

(z  - Zo)[q\zo)  + (z  - Zo)q"(Zo)/2  + ■■■]' 


Z — Zo  cancels.  By  continuity,  the  limit  of  the  denominator  is  q'(zo ) and  (4)  follows. 
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PROOF 


EXAMPLE  4 


Residue  at  a Simple  Pole 

/(z)  = (9 z + /)/ (z3  + z)  has  a simple  pole  at  i because  z2  + 1 = (z  + /)(z  — /),  and  (3)  gives  the  residue 


9z  + i 9 z + i 

Res  — z = lim  (z  — i ) 


9z  + i 


10/ 


z=i  z(z  + l)  *-«  z(z  + /)(z  - 0 Lz(z  + 0 J2=<  “2 

By  (4)  with  p(/)  = 9/  + / and  r/Vz)  = 3z2  + 1 we  confirm  the  result, 


= -5 /. 


Res 


9z  + i 


9z  + i 


z(z"  + 1) 


10/ 

-2 


= —5/. 


Poles  of  Any  Order  at  Zo-  The  residue  of/(z)  at  an  mth-order  pole  at  zo  is 


(5) 


Res  f(z)  = — 

z=z0  (m 


ly.  .“S’.  { 


jin- 1 


dz 


m— 1 


(z  - zo)mf(z) 


In  particular,  for  a second-order  pole  (m  = 2), 

(5*)  Res  /(z)  = lim  { [(z  - z0)2/(z)]' }. 

Z=Zq  Z—>Zq 


We  prove  (5).  The  Laurent  series  of/(z)  converging  near  zo  (except  at  zo  itself)  is  (Sec.  16.2) 


f(z)  = 


bm—  1 


(z  - zo)m  (Z  - Zo)" 


+ • • • H i fl0  + fli(z  — Zo)  + 

Z - Zo 


where  bm  A 0.  The  residue  wanted  is  b\.  Multiplying  both  sides  by  (z  — Zo)m  gives 
(z  ~ z0)m/(z)  = bm  + fem_i(z  - Zo)  + • • • + h (z  ~ z0)m_1  + a0(z  ~ z0)m  + • • • ■ 

We  see  that  b\  is  now  the  coefficient  of  the  power  (z  — z0)m_1of  the  power  series  of 
g(z)  = (z  ~ Zo)mf(z).  Hence  Taylor’s  theorem  (Sec.  15.4)  gives  (5): 


bi  = 


Cm— 1)/  \ 

8 (Zo) 


(m  — 1)! 


1 


jm—  1 


(m  - 1)!  dzm ' 


rf  [(z  - zo)T(z)]. 


Residue  at  a Pole  of  Higher  Order 

f(z ) — 50z/(z3  + 2z2  — lz  + 4)  has  a pole  of  second  order  at  z = 1 because  the  denominator  equals 
(z  + 4)(z  — l)2  (verify!).  From  (5*)  we  obtain  the  residue 


Res  /(z)  = lim  — [(z  - l)2/(z)]  = lim  — 
2=1  2 *1  2 >1 
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PROOF 


Several  Singularities  Inside  the  Contour. 

Residue  Theorem 

Residue  integration  can  be  extended  from  the  case  of  a single  singularity  to  the  case  of 
several  singularities  within  the  contour  C.  This  is  the  purpose  of  the  residue  theorem.  The 
extension  is  surprisingly  simple. 


Residue  Theorem 

Let  f(z)  be  analytic  inside  a simple  closed  path  C and  on  C,  except  for  finitely  many 
singular  points  Zi,  Zz> ' ' ' > Zfc  inside  C.  Then  the  integral  off(z)  taken  counterclockwise 
around  C equals  27 ri  times  the  sum  of  the  residues  off(z)  at  zi,  • • • , zu- 


(6) 


k 

°/(z)  dz  = 277/ 2 Res  f(z). 

J„  , , Z=Z.7 


We  enclose  each  of  the  singular  points  Zj  in  a circle  Cj  with  radius  small  enough  that 
those  k circles  and  C are  all  separated  (Fig.  373  where  k = 3).  Then/(z)  is  analytic  in  the 
multiply  connected  domain  D bounded  by  C and  C i,  ■ ■ ■ , Cf  - and  on  the  entire  boundary 
of  D.  From  Cauchy’s  integral  theorem  we  thus  have 


(7) 


Jc 


Jc1 


< ) /(z)  dz  + <■  > f(z)  dz  + < > /(z)  dz,  + ■■■  + o /(z)  dz  = 0 


ck 


the  integral  along  C being  taken  counterclockwise  and  the  other  integrals  clockwise  (as  in 
Figs.  354  and  355,  Sec.  14.2).  We  take  the  integrals  over  C i,  • • • , Cfc  1°  the  right  and 
compensate  the  resulting  minus  sign  by  reversing  the  sense  of  integration.  Thus, 


(8) 

> /(z)  dz  = ( 

> f(z)  dz  + < 

> /(z)  dz  + • ■ ■ + ( > 

. 

c J 

Ci  J 

c2  c 

where  all  the  integrals  are  now  taken  counterclockwise.  By  (1)  and  (2), 


< > /(z)  dz  = 277/  Res  /(z), 

JCj 

so  that  (8)  gives  (6)  and  the  residue  theorem  is  proved. 


1 


Fig.  373.  Residue  theorem 
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EXAMPLE  7 


This  important  theorem  has  various  applications  in  connection  with  complex  and  real  integrals. 
Let  us  first  consider  some  complex  integrals.  (Real  integrals  follow  in  the  next  section.) 

Integration  by  the  Residue  Theorem.  Several  Contours 

Evaluate  the  following  integral  counterclockwise  around  any  simple  closed  path  such  that  (a)  0 and  1 are  inside 
C,  (b)  0 is  inside,  1 outside,  (c)  1 is  inside,  0 outside,  (d)  0 and  1 are  outside. 

f 4 — 3 z 

f 2 * 

Z — 7 


Solution.  The  integrand  has  simple  poles  at  0 and  1,  with  residues  [by  (3)] 


Res 


4 — 3 z 
o z(z  - 1) 


4 - 3z 


z - 1 


= “4, 


4 - 3z 

Res  — 7T 

2=1  z(z  - 1) 


4 - 3z 


= 1. 


[Confirm  this  by  (4).]  Answer:  (a)  2iri(— 4 + 1)  = — 677r,  (b)  — 87 Ti,  (c)  277!,  (d)  0. 


Another  Application  of  the  Residue  Theorem 

Integrate  (tanz)/(z2  — 1)  counterclockwise  around  the  circle  C:  |z|  = §. 

Solution,  tan  z is  not  analytic  at  ±tt/2,  ±37t/2,  • • • , but  all  these  points  lie  outside  the  contour  C.  Because 
of  the  denominator  z — 1 = (z  — l)(z  + 1)  the  given  function  has  simple  poles  at  ±1.  We  thus  obtain  from 
(4)  and  the  residue  theorem 


tan  z 


dz  = 2iri  Res 


tan  z 


Res 


tan  z 


= 2t Ti  - 


( tan  z tan  z 

i ( H 

V 2z  2=1  2z 

= 2tt;'  tan  1 = 9.7855 i. 


Poles  and  Essential  Singularities 

Evaluate  the  following  integral,  where  C is  the  ellipse  9x2  + y2  = 9 (counterclockwise,  sketch  it). 


K— 

'z4  — 16 


dz. 


Solution.  Since  z4  — 16  = 0 at  ±2 i and  ±2,  the  first  term  of  the  integrand  has  simple  poles  at  ±2i  inside 
C,  with  residues  [by  (4);  note  that  e2m  = 1] 


ze 

Res  —z 

2=2i  z4  — 16 


4z3 


1 

16 


Res  —z 

2=— 2i  z — 16 


4zd 


1 

16 


and  simple  poles  at  ±2,  which  lie  outside  C,  so  that  they  are  of  no  interest  here.  The  second  term  of  the  integrand 
has  an  essential  singularity  at  0,  with  residue  77/ 2 as  obtained  from 


= 


= z 1 + 


T7Z  1 


= Z + T7 


(\z\  > 0). 


Answer:  2iri{—  yq  ~ ib  + i77-2)  = 77’(77'2  — \)i  = 30.221/  by  the  residue  theorem. 
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FRQB1-E^M=SIET— 


1.  Verify  the  calculations  in  Example  3 and  find  the  other 
residues. 

2.  Verify  the  calculations  in  Example  4 and  find  the  other 
residue. 


3-12  RESIDUES 

Find  all  the  singularities  in  the  finite  plane  and  the 
corresponding  residues.  Show  the  details. 


3. 


sin  2z 


5. 


8 

1 + z2 


COS  z 

4-  — 

z 

6.  tan  z 


7.  cot  7TZ 

9.  1 

11. 


1 - tC 

ez 

(Z  - TTif 


8. 


10. 


77 

(zz  ~ l)2 
z4 

zz  - iz  + 2 


12.  e1/a_z) 


13.  CAS  PROJECT.  Residue  at  a Pole.  Write  a program 
for  calculating  the  residue  at  a pole  of  any  order  in  the 
finite  plane.  Use  it  for  solving  Probs.  5-10. 


14-25  RESIDUE  INTEGRATION 

Evaluate  (counterclockwise).  Show  the  details, 
z - 23 


14.  o 


z - 4z  - 5 


dz,  C:  |z  — 2 — i\  = 3.2 


15.  o tan27rzc?z,  C:  |z  — 0.2 1 = 0.2 
•'C 


16.  9 e x!z  dz,  C : the  unit  circle 


17. 


18. 


19. 


20. 


21. 


22. 


23. 


24. 


25. 


COS  z 


dz,  C:  |z  - 77-1/2 1 = 4.5 


z + 1 

C4- 2z3 


dz,  C:\z-  1=2 


sinh  z 


2z  - i 


dz 

t 2 i \3  ' 
c(z  +1) 


COS  77Z 


-dz,  C:  \z  ~ 2i\  = 2 


C:  |z  - i|  = 3 


C:  |z|  = | 


c z 


z2  sin  z 


- dz,  C the  unit  circle 


4z^  - 1 


30z  - 23z  + 5 
(2z  - 1)2(3z  - 1) ' 


C the  unit  circle 


exp  (— z ) 
sin  4z 


dz,  C:  |z|  = 1.5 


z cosh  7 Tz 
C z4  + 13z2  + 36 


dz,  |z|  = 77 


16.4  Residue  Integration  of  Real  Integrals 

Surprisingly,  residue  integration  can  also  be  used  to  evaluate  certain  classes  of  complicated 
real  integrals.  This  shows  an  advantage  of  complex  analysis  over  real  analysis  or  calculus. 

Integrals  of  Rational  Functions  of  cos  6 and  sin  d 

We  first  consider  integrals  of  the  type 


/ = 


r2ir 


(1) 


'0 


,F(cos  9,  sin  9)  d9 
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where  F( cos  0,  sin  0)  is  a real  rational  function  of  cos  0 and  sin  0 [for  example,  (sin2  0)/ 
(5  — 4 cos  0)]  and  is  finite  (does  not  become  infinite)  on  the  interval  of  integration.  Setting 
eie  = z,  we  obtain 


(2) 


a 1 / id  i —i0\  1 

cos  0 = - (e  + e ) = - 


sin  0 = — (eie  - = — 

2 i 2 i 


z + 


z ~ 


Since  F is  rational  in  cos  0 and  sin  0,  Eq.  (2)  shows  that  F is  now  a rational  function  of 
z,  say ,f(z).  Since  dz/dO  = ie10,  we  have  dO  = dz/iz  and  the  given  integral  takes  the  form 


(3) 


, dz 
J=»  fiz)  ~ 


IZ 


and,  as  0 ranges  from  0 to  2tt  in  (1),  the  variable  z = elS  ranges  counterclockwise  once 
around  the  unit  circle  |z|  = 1.  (Review  Sec.  13.5  if  necessary.) 

An  Integral  of  the  Type  (1) 


Show  by  the  present  method  that 


dd 


= 277. 


J0  V2  — cos  l 

Solution.  We  use  cos  6 = |(z  + l/z)  and  dd  = dz/iz • Then  the  integral  becomes 


dz/  iz 


V2--U  + 


dz 


— (z2  - 2 Viz  + 1) 

2 


2 

i Jc 


dz 


{z  - VI-  l)(z  - V2  + 1) 


We  see  that  the  integrand  has  a simple  pole  at  zi  — V2  + 1 outside  the  unit  circle  C,  so  that  it  is  of  no  interest 
here,  and  another  simple  pole  at  Z2  = V2  — 1 (where  z — V2  +1=0)  inside  C with  residue  [by  (3),  Sec.  16.3] 


Res  - 


*=*»  (z  — VI  — i)(z  — VI  + l) 


: = V§-1 


z-V2-  1 

1 

2 ' 


Answer:  2tt;(— 2/ /)( — |)  = 277".  (Here  — 2/ / is  the  factor  in  front  of  the  last  integral.) 

As  another  large  class,  let  us  consider  real  integrals  of  the  form 


(4) 


/( x)  dx. 


Such  an  integral,  whose  interval  of  integration  is  not  finite  is  called  an  improper  integral, 
and  it  has  the  meaning 


(5') 


fix)  dx  = lim 


fix)  dx  + lim 


f(x)  dx. 
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If  both  limits  exist,  we  may  couple  the  two  independent  passages  to  — 00  and  °°,  and  write 

rR 


(5) 


fix)  dx  = lim 

R — >cc 


fix)  dx. 


—R 


The  limit  in  (5)  is  called  the  Cauchy  principal  value  of  the  integral.  It  is  written 


pr.  v. 


f{x)  dx. 


It  may  exist  even  if  the  limits  in  (5’)  do  not.  Example: 


lim 

R—><x> 


r R / O 

( R2 

r2\ 

rb 

x dx  = lim  — 

— — = 0, 

but 

lim 

x dx  = oo 

V 2 

2 J 

n 

We  assume  that  the  function /(x)  in  (4)  is  a real  rational  function  whose  denominator 
is  different  from  zero  for  all  real  x and  is  of  degree  at  least  two  units  higher  than  the 
degree  of  the  numerator.  Then  the  limits  in  (5  ) exist,  and  we  may  start  from  (5).  We 
consider  the  corresponding  contour  integral 


(5*) 


n fiz)  dz 
c 


around  a path  C in  Fig.  374.  Since  fix)  is  rational,  fiz)  has  finitely  many  poles  in  the 
upper  half-plane,  and  if  we  choose  R large  enough,  then  C encloses  all  these  poles.  By 
the  residue  theorem  we  then  obtain 


° fiz)dz 
c 


fiz)  dz  + 
s 


R 

fix)  dx 

-R 


2 TTi  2 Res/(z) 


where  the  sum  consists  of  all  the  residues  of  fiz)  at  the  points  in  the  upper  half-plane  at 
which  f(z)  has  a pole.  From  this  we  have 


(6) 


rR 

fix)  dx  = 277/ 2 Res  f(z) 
-R 


fiz)  dz. 
s 


We  prove  that,  if  R — » the  value  of  the  integral  over  the  semicircle  S approaches 
zero.  If  we  set  z = Re 18 . then  S is  represented  by  R = const,  and  as  z ranges  along  .S',  the 
variable  0 ranges  from  0 to  77.  Since,  by  assumption,  the  degree  of  the  denominator  of 
fiz)  is  at  least  two  units  higher  than  the  degree  of  the  numerator,  we  have 

\fiz)\  < kl9  (Id  = R > Ro) 

Izl 


Fig.  374.  Path  C of  the  contour  integral  in  (5*) 
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for  sufficiently  large  constants  k and  Rq . By  the  ML-inequality  in  Sec.  14.1, 


f(z)  dz 
s 


k kir 

< — 7TR  = 


R 4 


R 


(R  > Ro)- 


Hence,  as  R approaches  infinity,  the  value  of  the  integral  over  S approaches  zero,  and  (5) 
and  (6)  yield  the  result 


(7) 


f(x)  dx  = 27 Ti^Resf(z) 


where  we  sum  over  all  the  residues  of  f(z)  at  the  poles  of  f(z)  in  the  upper  half-plane. 

An  Improper  Integral  from  0 to  °° 

Using  (7),  show  that 

f dx  7T 

1 1 + x4  2V2  ' 


Solution.  Indeed, /(z)  = 1/(1  + z4)  has  four  simple  poles  at  the  points  (make  a sketch) 


z 1 = 


z2  = 


z3  = 


z4  = e-^*. 


The  first  two  of  these  poles  lie  in  the  upper  half-plane  (Fig.  375).  From  (4)  in  the  last  section  we  find  the  residues 


Res  f(z)  = 

Z=Zi 


Res  /(z)  = 
2=22 


1 

1 ' 

id  +Z4)'J 

Z = Zl 

,4z3 

Z = Zi 

1 

' 1 

Lei  + z4)' 

Z = Z2 

_4z3 

2 = 22 

i g-S’r*/4 
4 

le-97ri/4 

4 


- — e 7ri/i 

4 


4 


(Here  we  used  e"'  = — 1 and  e 2m  = 1.)  By  (1)  in  Sec.  13.6  and  (7)  in  this  section, 


dx 

1 + x4 


T77/  _ e-’ri/4)  = _ 3/171 

4 4 


77  77 

■ 2 i sin  — = 77  sin  — = 
4 4 


77 

V2 
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Since  1/(1  + Jt4)  is  an  even  function,  we  thus  obtain,  as  asserted. 


dx  1 
1 + x4  ~~  2 


dx  IT 

1 + .V4  2V2  ' 


Fourier  Integrals 

The  method  of  evaluating  (4)  by  creating  a closed  contour  (Fig.  374)  and  “blowing  it  up” 
extends  to  integrals 


(8) 


f(x)  cos  sx  dx  and 


f(x)  sin  sx  dx 


(s  real) 


as  they  occur  in  connection  with  the  Fourier  integral  (Sec.  11.7). 

If/(x)  is  a rational  function  satisfying  the  assumption  on  the  degree  as  for  (4),  we  may 
consider  the  corresponding  integral 


( > f(z)elsz  dz  (s  real  and  positive) 

Jc 

over  the  contour  C in  Fig.  374.  Instead  of  (7)  we  now  get 


(9) 


f(x)eisx  dx 


CO 


277/ 2 Res  [f(z)eisz] 


(s>  0) 


where  we  sum  the  residues  of  f(z)elsz  at  its  poles  in  the  upper  half-plane.  Equating  the 
real  and  the  imaginary  parts  on  both  sides  of  (9),  we  have 


(10) 


f(x)  cos  sx  dx  = — 277  2 Im  Res  [ f(z)elsz  ] , 
/(x)  sin  sx  dx  = 277  ^ Re  Res  [ f{z)elsz  ]. 


(s  > 0) 


To  establish  (9),  we  must  show  [as  for  (4)]  that  the  value  of  the  integral  over  the 
semicircle  S in  Fig.  374  approaches  0 as  Now  .y  > 0 and  S lies  in  the  upper  half- 

plane jgO.  Hence 

\eisz\  = £is(x+iy)  = \£isx\  |g-»»|  = J . e~sy  g j (s  > 0>  y g Q). 

From  this  we  obtain  the  inequality  \f{z)elsz\  = |/(z)|  |e“z|  = l/(z)l  (s  > 0,  y = 0).This 
reduces  our  present  problem  to  that  for  (4).  Continuing  as  before  gives  (9)  and  (10). 

An  Application  of  (10) 

f°°  cos  sx  7 T f“  sin  sx 

Show  that  dx  = — e , dx  = 0 (s  > 0,  k > 0). 

k2  + v2  k k2  + x2 
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Solution.  In  fact,  eMZ/(i:2  + Z2)  has  only  one  pole  in  the  upper  half-plane,  namely,  a simple  pole  at  z.  = ik, 
and  from  (4)  in  Sec.  16.3  we  obtain 


Res 

z=ik 


e 


isz  - 


e 


—ks 


2 z 


- z=ik 


2 ik 


Thus 


. kT  + 


dx  = 2iri  - 


2ik 


Since  e^x  = cos  sx  + i sin  sx,  this  yields  the  above  results  [see  also  (15)  in  Sec.  11.7.] 


Another  Kind  of  Improper  Integral 

We  consider  an  improper  integral 


(11) 


B 

f(x)  dx 


JA 

whose  integrand  becomes  infinite  at  a point  a in  the  interval  of  integration. 


lim  |/(x)|  = oo. 
x—>a 

By  definition,  this  integral  (11)  means 


(12) 


fix)  dx  = lim 
J v ' e— »o 


f{x)  dx  + 


lim 

77 — >0 


fix)  dx 


Cl+77 


where  both  e and  17  approach  zero  independently  and  through  positive  values.  It  may 
happen  that  neither  of  these  two  limits  exists  if  e and  17  go  to  0 independently,  but  the 
limit 


03) 


lim 

e— *0 


rd  — e 

r B 

fix)  dx  + 

fix)  dx 

A J 

a+e 

exists.  This  is  called  the  Cauchy  principal  value  of  the  integral.  It  is  written 


pr.  v. 


fix)  dx. 


For  example, 


pr.  v. 


, 1 


dx 

,3 


■'-1  ■ 


lim 


r 1 


dx 

,3 


= 0; 


the  principal  value  exists,  although  the  integral  itself  has  no  meaning. 

In  the  case  of  simple  poles  on  the  real  axis  we  shall  obtain  a formula  for  the  principal 
value  of  an  integral  from  — °°  to  °°.  This  formula  will  result  from  the  following  theorem. 
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THEOREM  1 


PROOF 


Simple  Poles  on  the  Real  Axis 

Iff(z)  has  a simple  pole  at  z = 

lim 

r— >0 


: a on  the  real  axis,  then  (Fig.  376) 
/(z)  dz  = tri  Res  /(z). 

-p  2=a 

'-'2 


a — r a a + r x 

Fig.  376.  Theorem  1 


By  the  definition  of  a simple  pole  (Sec.  16.2)  the  integrand  /(z)  has  for  0 < z — a\  < R 
the  Laurent  series 

/(z)  = zrzrr.  + g(z),  bx  = Res  /(z). 

<•  u z=a 


Here  g(z)  is  analytic  on  the  semicircle  of  integration  (Fig.  376) 

C2'.  Z = a + relB,  0 =§  9 Si  77 

and  for  all  z between  C2  and  the  x-axis,  and  thus  bounded  on  C2,  say,  g(z)  = M.  By 
integration. 


f(z)  dz  = 


c2 


1 • ,a  , 

—7-  ire  d0  + 

ZlB 


re 


g(z)  dz  = bxTti  + 


g(z)  dz. 


c2 


c2 


The  second  integral  on  the  right  cannot  exceed  Mirr  in  absolute  value,  by  the 
ML-inequality  (Sec.  14.1),  and  ML  = Mtti — > 0 as  r ^ 0.  ■ 


Figure  377  shows  the  idea  of  applying  Theorem  1 to  obtain  the  principal  value  of  the 
integral  of  a rational  function /(x)  from  — °o  to  °°.  For  sufficiently  large  R the  integral  over 
the  entire  contour  in  Fig.  377  has  the  value  J given  by  2jri  times  the  sum  of  the  residues 
of  /(z)  at  the  singularities  in  the  upper  half-plane.  We  assume  that/(x)  satisfies  the  degree 
condition  imposed  in  connection  with  (4).  Then  the  value  of  the  integral  over  the  large 


Fig.  377.  Application  of  Theorem  1 


732 


CHAP.  16  Laurent  Series.  Residue  Integration 


EXAMPLE  4 


semicircle  S approaches  0 as  For  r— >0  the  integral  over  C2  (clockwise!) 

approaches  the  value 


K — —TTi  Res  f(z) 

z=a 

by  Theorem  1.  Together  this  shows  that  the  principal  value  P of  the  integral  from  — °°  to 

oo  plus  K equals  J ; hence  P = J — K = J + iri  Resz a f(z).  If  f(z)  has  several  simple 

poles  on  the  real  axis,  then  K will  be  — 7 Ti  times  the  sum  of  the  corresponding  residues. 
Hence  the  desired  formula  is 


(14) 


pr.  v. 


r 00 

f(x)  dx 


2 77;'2Res/(z)  + 7n'2Res/(z) 


where  the  first  sum  extends  over  all  poles  in  the  upper  half-plane  and  the  second  over  all 
poles  on  the  real  axis,  the  latter  being  simple  by  assumption. 

Poles  on  the  Real  Axis 

Find  the  principal  value 

r dx 

pr.  v.  . 

( x 2 - 3*  + 2)(x2  + 1) 


Solution.  Since 


x2  - 3jc  + 2 = (x  - 1)(*  - 2), 

the  integrand  f(x),  considered  for  complex  z,  has  simple  poles  at 


Z = 1,  Res  m = 

Z-l 


Z = 2,  Res  f(z)  = 

z=2 


1 


L(z  - 2 )(zz  + 1) . 


1 


(Z  - 1 )(z2  + 1)  - 


Z = i.  Res  f(z)  = 


1 


(z2  - 3z  + 2)(z  + 0 . 

1 _ 3 - i 

6 + 2 i 20 


and  at  z—  ~ i in  the  lower  half-plane,  which  is  of  no  interest  here.  From  (14)  we  get  the  answer 

dx 


pr.  v. 


(x2  - 3x  + 2){x2  + 1) 


= 27 Ti 


3 - i 

\ / 

1 

1\ 

77 

— 

+ 77! 

— 

+ - 

= 

V 20 

J V 

2 

5/ 

to 

More  integrals  of  the  kind  considered  in  this  section  are  included  in  the  problem  set.  Try 
also  your  CAS,  which  may  sometimes  give  you  false  results  on  complex  integrals. 
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-FRQBl=E^M=5^T— 1=6^3= 


1-9  INTEGRALS  INVOLVING  COSINE  AND  SINE 


Evaluate  the  following  integrals  and  show  the  details  of 
your  work. 


1. 


2 dd 


k — cos  6 


2. 


dd 


77  + 3 cos  6 


21. 


22. 


„(x  - l)(x2  + 4) 
dx 


- dx 


„ . 1 + sin  6 

3.  | dd 

3 + cos  0 

(2lr  2 n 

s.  f cos  e de 

5—4  cos  6 


7.  d6 

JQ  a — sin  d 


4. 


6. 


8. 


1 + 4 cos  6 
17  — 8 cos  0 


-dd 


2,1  ■ 2 n 
sin  e 

5 — 4 cos  e 


-de 


1 

8 — 2 sin  e 


de 


9. 


cos  e 


13  - 12  cos  26 


-de 


10-22 


IMPROPER  INTEGRALS: 

INFINITE  INTERVAL  OF  INTEGRATION 

Evaluate  the  following  integrals  and  show  details  of  your 
work. 


10. 


12. 


14. 


dx 


(1  + x2)3 


dx 


(x2  - 2x  + 5)2 


x2  + 1 
, x4  + 1 


dx 


16.  | 

, (x2  + l)2 
cos  4x 

, x4  + 5x2  + 4 
x 


18. 


11. 


13. 


15. 


17. 


dx  19. 


dx 


(1  + x2)2 


(x2  + l)(x2  + 4) 


dx 


. xb  + 1 
sin  3x 
, x4  + 1 


- dx 


dx 


dx 

, x4  — 1 


20. 


8 — x 


dx 


23-26 


IMPROPER  INTEGRALS: 

POLES  ON  THE  REAL  AXIS 

Find  the  Cauchy  principal  value  (showing  details): 


23. 


25. 


dx 


x + 5 


dx 


24. 


26. 


dx 


, x4  + 3x2  - 4 


, x4  - 1 


dx 


27.  CAS  EXPERIMENT.  Simple  Poles  on  the  Real 

Axis.  Experiment  with  integrals  fix)  dx, 

fix ) = [(x  — af){x  — a2 ) ■ ■ ■ (x  — af)\~  , Oj  real  and 
all  different,  k > 1.  Conjecture  that  the  principal  value 
of  these  integrals  is  0.  Try  to  prove  this  for  a special 
k,  say,  k = 3.  For  general  k. 

28.  TEAM  PROJECT.  Comments  on  Real  Integrals. 

(a)  Formula  (10)  follows  from  (9).  Give  the  details. 

(b)  Use  of  auxiliary  results.  Integrating  e around 
the  boundary  C of  the  rectangle  with  vertices  —a,  a, 
a + ib,  —a  + ib,  letting  a — * 00 , and  using 

V77 
2 ’ 


: dx  = 


show  that 


e x cos  2 bx  dx  = 


Vtt 


- b 2 


(This  integral  is  needed  in  heat  conduction  in  Sec. 
12.7.) 

(c)  Inspection.  Solve  Probs.  13  and  17  without 
calculation. 


SEH3gEE3EEEEErISEEEEBEEBHB^HEE  S T IONS  AND  PROBLEMS 


1.  What  is  a Laurent  series?  Its  principal  part?  Its  use? 
Give  simple  examples. 

2.  What  kind  of  singularities  did  we  discuss?  Give  defi- 
nitions and  examples. 

3.  What  is  the  residue?  Its  role  in  integration?  Explain 
methods  to  obtain  it. 


4.  Can  the  residue  at  a singularity  be  zero?  At  a simple 
pole?  Give  reason. 

5.  State  the  residue  theorem  and  the  idea  of  its  proof  from 
memory. 

6.  How  did  we  evaluate  real  integrals  by  residue  integration? 
How  did  we  obtain  the  closed  paths  needed? 
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7.  What  are  improper  integrals?  Their  principal  value? 
Why  did  they  occur  in  this  chapter? 

8.  What  do  you  know  about  zeros  of  analytic  functions? 
Give  examples. 

9.  What  is  the  extended  complex  plane?  The  Riemann 
sphere  R1  Sketch  z = 1 + i on  R. 

10.  What  is  an  entire  function?  Can  it  be  analytic  at 
infinity?  Explain  the  definitions. 


11-18 


COMPLEX  INTEGRALS 


Integrate  counterclockwise  around  C.  Show  the  details. 


11. 


sin  3z 


C:\z\  = 7 t 


12.  e2/z,  C:\z~  1 - i\  =2 


13. 


5z3 

z2  + 4 ' 


C:  z = 3 


14. 


5z3 

z2  + 4 


C:\z 


Trill 


15. 


25z2 

(z  - 5)2  ’ 


Cx\z-  5|  = 1 


16. 


15z  + 9 
z3  - 9z 


C:  |z|  = 4 


cos  z . . 

17.  — — , n = 0,  1,  2,  ■ ■ - , C:  |zl  = 1 


18.  cot  4z,  C:  |z|  = 


19-25 


REAL  INTEGRALS 


Evaluate  by  the  methods  of  this  chapter.  Show  details. 


19. 


de 


13  — 5 sin  0 


20. 


sin  0 


3 + cos  6 


de 


21. 


22. 


24. 


sin  6 


34  — 16  sin  0 
dx 


dO 


, 1 + 4x* 
dx 

, x2  — 4ix 


23. 


25. 


(1  + x2f 


dx 


- dx 


SUMMARY  OF  CHAPTER  16 

Laurent  Series.  Residue  Integration 


A Laurent  series  is  a series  of  the  form 

00  oo  , 

(1)  f(z)  = 2 Oniz  - Zo)n  + 2 (SeC-  16'1} 

n= 0 n=  1 ^ Zo' 

or,  more  briefly  written  [but  this  means  the  same  as  (1)!] 

(1*)  /(z)  = 2 an(z  - zo)n,  an  = — O ^ dz* 

n±x  2771  lc  (z*  - Zo)n  + 1 

where  n = 0,  ±1,  ±2,  ■ • • . This  series  converges  in  an  open  annulus  (ring)  A with 
center  z().  In  A the  function  f(z)  is  analytic.  At  points  not  in  A it  may  have 
singularities.  The  first  series  in  (1)  is  a power  series.  In  a given  annulus,  a Laurent 
series  of  /(z)  is  unique,  but/(z)  may  have  different  Laurent  series  in  different  annuli 
with  the  same  center. 

Of  particular  importance  is  the  Laurent  series  (1 ) that  converges  in  a neighborhood 
of  zo  except  at  zo  itself,  say,  for  0 < \z  - z0l  < R (R  > 0,  suitable).  The  series 


Summary  of  Chapter  16 


735 


(or  finite  sum)  of  the  negative  powers  in  this  Laurent  series  is  called  the  principal 
part  of f(z)  at  Zq-  The  coefficient  b:  of  l/(z  — Zo)  in  this  series  is  called  the  residue 
of  f(z)  at  zo  and  is  given  by  [see  (1)  and  (1*)] 


(2)  h = Res  f(z) 

Z-^-Zo 


1 

277/ 


<>  f(z*)dz*. 
c 


Thus 


<>  f(z*)dz*  = 277/ Res  f(z). 

L z=z° 


b\  can  be  used  for  integration  as  shown  in  (2)  because  it  can  be  found  from 

/ »m— 1 \ 

(3)  Res  f(z)  = — lim  — ^ [(z  - z0)mf(z)]  , (Sec.  16.3), 

provided /(z)  has  at  a pole  of  order  ///;  by  definition  this  means  that  principal 
part  has  l/(z  — Zo)m  as  its  highest  negative  power.  Thus  for  a simple  pole  (///  = 1), 


Res  f(z)  = lim  (z  ~ z0)f(z ); 

Z=Z0  Z-*Z0 


also, 


P(z) 

Res  

Z=z„  q{z) 


P(z  o) 
q'izo) 


If  the  principal  part  is  an  infinite  series,  the  singularity  of  f(z)  at  z0  is  called  an 
essential  singularity  (Sec.  16.2). 

Section  16.2  also  discusses  the  extended  complex  plane,  that  is,  the  complex  plane 
with  an  improper  point  °°  (“infinity”)  attached. 

Residue  integration  may  also  be  used  to  evaluate  certain  classes  of  complicated 
real  integrals  (Sec.  16.4). 


CHAPTER 


Conformal  Mapping 


Conformal  mappings  are  invaluable  to  the  engineer  and  physicist  as  an  aid  in  solving 
problems  in  potential  theory.  They  are  a standard  method  for  solving  boundary  value 
problems  in  two-dimensional  potential  theory  and  yield  rich  applications  in  electrostatics, 
heat  flow,  and  fluid  flow,  as  we  shall  see  in  Chapter  18. 

The  main  feature  of  conformal  mappings  is  that  they  are  angle-preserving  (except  at 
some  critical  points)  and  allow  a geometric  approach  to  complex  analysis.  More  details 
are  as  follows.  Consider  a complex  function  w = f(z ) defined  in  a domain  D of  the  ’-plane; 
then  to  each  point  in  D there  corresponds  a point  in  the  vv- plane.  In  this  way  we  obtain  a 
mapping  of  D onto  the  range  of  values  of  f(z)  in  the  w-plane.  In  Sec.  17.1  we  show  that 
if f(z)  is  an  analytic  function,  then  the  mapping  given  by  vv  = f(z)  is  a conformal  mapping, 
that  is,  it  preserves  angles,  except  at  points  where  the  derivative  / (z)  is  zero.  (Such  points 
are  called  critical  points.) 

Conformality  appeared  early  in  the  history  of  construction  of  maps  of  the  globe. 
Such  maps  can  be  either  “conformal,”  that  is,  give  directions  correctly,  or  “equiareal,” 
that  is,  give  areas  correctly  except  for  a scale  factor.  However,  the  maps  will  always 
be  distorted  because  they  cannot  have  both  properties,  as  can  be  proven,  see  [GenRef8] 
in  App.  1.  The  designer  of  accurate  maps  then  has  to  select  which  distortion  to  take 
into  account. 

Our  study  of  conformality  is  similar  to  the  approach  used  in  calculus  where  we  study 
properties  of  real  functions  y = f(x)  and  graph  them.  Here  we  study  the  properties  of  conformal 
mappings  (Secs.  17.1-17.4)  to  get  a deeper  understanding  of  the  properties  of  functions,  most 
notably  the  ones  discussed  in  Chap.  13.  Chapter  17  ends  with  an  introduction  to  Riemann 
surfaces,  an  ingenious  geometric  way  of  dealing  with  multivalued  complex  functions  such  as 
w = sqrt  (z)  and  vv  = In  z. 

So  far  we  have  covered  two  main  approaches  to  solving  problems  in  complex  analysis. 
The  first  one  was  solving  complex  integrals  by  Cauchy’s  integral  formula  and  was  broadly 
covered  by  material  in  Chaps.  13  and  14.  The  second  approach  was  to  use  Laurent  series 
and  solve  complex  integrals  by  residue  integration  in  Chaps.  15  and  16.  Now,  in  Chaps.  17 
and  18,  we  develop  a third  approach,  that  is,  the  geometric  approach  of  conformal  mapping 
to  solve  boundary  value  problems  in  complex  analysis. 

Prerequisite:  Chap.  13. 

Sections  that  may  be  omitted  in  a shorter  course:  17.3  and  17.5. 

References  and  Answers  to  Problems:  App.  1 Part  D,  App.  2. 
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1Z1  Geometry  of  Analytic  Functions: 

Conformal  Mapping 

We  shall  see  that  conformal  mappings  are  those  mappings  that  preserve  angles,  except  at 
critical  points,  and  that  these  mappings  are  defined  by  analytic  functions.  A critical  point 
occurs  wherever  the  derivative  of  such  a function  is  zero.  To  arrive  at  these  results,  we 
have  to  define  terms  more  precisely. 

A complex  function 

(1)  w = f(z)  = u(x,y ) + iv(x,y ) (z  = x + iy) 


of  a complex  variable  z gives  a mapping  of  its  domain  of  definition  D in  the  complex 
2-plane  into  the  complex  w-plane  or  onto  its  range  of  values  in  that  plane.1  For  any  point  zo 
in  D the  point  up  = f(zo)  is  called  the  image  of  ’o  with  respect  to/.  More  generally,  for 
the  points  of  a curve  C in  D the  image  points  form  the  image  of  C;  similarly  for  other 
point  sets  in  D.  Also,  instead  of  the  mapping  by  a function  w = f(z ) we  shall  say  more 
briefly  the  mapping  w = f(z). 


Mapping  w = f[x)  — z2 * 

Using  polar  forms  z = re10  and  w = Re"'1,  we  have  w = z2  = r2e2ie . Comparing  moduli  and  arguments  gives 
R = r and  c ft  = 26.  Hence  circles  r = tq  are  mapped  onto  circles  R = tq  and  rays  6 = do  onto  rays  </>  = 26 o. 
Figure  378  shows  this  for  the  region  1 ^ \z\  = 2>  '?r/6  = 6 ^ tt/ 3,  which  is  mapped  onto  the  region 
1 S |w|  = ir/3  ses  277-/3. 

In  Cartesian  coordinates  we  have  z = x + iy  and 

u = Re  (z2)  = x2  — y2,  v = Im  (z2)  = 2xy. 

Hence  vertical  lines  x = c = const  are  mapped  onto  u = c2  — y2,  v = 2 cy.  From  this  we  can  eliminate  y.  We 
obtain  y2  = c2  — u and  v2  = 4 c2y2.  Together, 

v2  = 4 c\c2  - u)  (Fig.  379). 


These  parabolas  open  to  the  left.  Similarly,  horizontal  lines  y = k = const  are  mapped  onto  parabolas  opening 
to  the  right, 

v2  = 4 k\k2  + u ) (Fig.  379).  ■ 


(2-plane)  (u/-plane) 

Fig.  378.  Mapping  w = z2.  Lines  |z|  = const,  argz  = const  and  their  images  in  the  w-plane 


1The  general  terminology  is  as  follows.  A mapping  of  a set  A into  a set  B is  called  surjective  or  a mapping  of 

A onto  B if  every  element  of  B is  the  image  of  at  least  one  element  of  A.  It  is  called  injective  or  one-to-one  if 

different  elements  of  A have  different  images  in  B.  Finally,  it  is  called  bijective  if  it  is  both  suijective  and  injective. 
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Fig.  379.  Images  of  x = const,  y = const  under  w = z2 


Conformal  Mapping 


THEOREM  1 


PROOF 


A mapping  w = f(z)  is  called  conformal  if  it  preserves  angles  between  oriented  curves  in 
magnitude  as  well  as  in  sense.  Figure  380  shows  what  this  means.  The  angle  a (0  a Si  tt) 
between  two  intersecting  curves  C\  and  C2  is  defined  to  be  the  angle  between  their  oriented 
tangents  at  the  intersection  point  zo-  And  conformality  means  that  the  images  C*  and  C\ 
of  C 1 and  C2  make  the  same  angle  as  the  curves  themselves  in  both  magnitude  and  direction. 


Conformality  of  Mapping  by  Analytic  Functions 

The  mapping  w = fit)  by  an  analytic  function  f is  conformal,  except  at  critical 
points,  that  is,  points  at  which  the  derivative  f is  zero. 

2 f 

w = z has  a critical  point  at  z = 0,  where/  (z)  = 2z  = 0 and  the  angles  are  doubled  (see 
Fig.  378),  so  that  conformality  fails. 

The  idea  of  proof  is  to  consider  a curve 


in  the  domain  of  /(z)  and  to  show  that  w = f(z)  rotates  all  tangents  at  a point  z0  (where 
/ (zo)  ^ 0)  through  the  same  angle.  Now  z(t)  = dz/dt  = x(t)  + iy(t)  is  tangent  to  C in 
(2)  because  this  is  the  limit  of  (zj  — zo)/A?  (which  has  the  direction  of  the  secant  zi  — Zo 


(2) 


C:  z(t)  = x(t)  + iy(t ) 


(z-plane) 


(w-plane) 


Fig.  380.  Curves  C|  and  C2  and  their  respective  images 
C*  and  C*  under  a conformal  mapping  w = f(z ) 
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in  Fig.  381)  as  zy  approaches  z o along  C.  The  image  C*  of  C is  w = f(z(t)).  By  the  chain 
rule,  w = f (z.(t))z(t).  Hence  the  tangent  direction  of  C*  is  given  by  the  argument  (use  (9) 
in  Sec.  13.2) 

(3)  arg  vv  = arg /'  + arg  z 

where  arg  z gives  the  tangent  direction  of  C.  This  shows  that  the  mapping  rotates  all 
directions  at  a point  zo  in  the  domain  of  analyticity  of/through  the  same  angle  arg / (zo), 
which  exists  as  long  as  f (z0)  i=  0.  But  this  means  conformality,  as  Fig.  381  illustrates 
for  an  angle  a between  two  curves,  whose  images  C*  and  C|  make  the  same  angle  (because 
of  the  rotation). 


Tangent 


In  the  remainder  of  this  section  and  in  the  next  ones  we  shall  consider  various  conformal 
mappings  that  are  of  practical  interest,  for  instance,  in  modeling  potential  problems. 

EXAMPLE  2 Conformality  of  w — zn 

The  mapping  w = zn,  n = 2,  3,  • • • , is  conformal,  except  at  z = 0,  where  w'  = nz”_1  = 0.  For  n 2 this  is 
shown  in  Fig.  378;  we  see  that  at  0 the  angles  are  doubled.  For  general  n the  angles  at  0 are  multiplied  by  a 
factor  n under  the  mapping.  Flence  the  sector  OSDS  7r/ti  is  mapped  by  zn  onto  the  upper  half-plane  [)g0 
(Fig.  382). 


EXAMPLE  3 


Mapping  w — z + 1/z.  Joukowski  Airfoil 

In  terms  of  polar  coordinates  this  mapping  is 

1 

w = u + iv  = r{ cos  6 + i sin  6)  + — (cos  6 — i sin  6). 
By  separating  the  real  and  imaginary  parts  we  thus  obtain 


u = a cos  8 , v = b sin  8 where  a 


b = 


r — 


1 

r ' 


II  2/2  2/2 

Hence  circles  \z\  — r = const  =£  1 are  mapped  onto  ellipses  x /a  + y /b  = 1.  The  circle  r = 1 is  mapped 
onto  the  segment  — 2 ^ u ^ 2 of  the  w-axis.  See  Fig.  383. 


740 


CHAP.  17  Conformal  Mapping 


EXAMPLE  4 


Now  the  derivative  of  w is 

, , 1 (z  + l)(z  - 1) 


which  is  0 at  z — ±1.  These  are  the  points  at  which  the  mapping  is  not  conformal.  The  two  circles  in  Fig.  384 
pass  through  z = ~ 1 . The  larger  is  mapped  onto  a Joukowski  airfoil.  The  dashed  circle  passes  through  both  — 1 
and  1 and  is  mapped  onto  a curved  segment. 

Another  interesting  application  of  w = z + l/z  (the  flow  around  a cylinder)  will  be  considered  in  Sec.  18.4. 


y v 


Fig.  384.  Joukowski  airfoil 


Conformality  of  w = ez 

From  (10)  in  Sec.  13.5  we  have  \ez\  = ex  and  Arg  z = y.  Hence  ez  maps  a vertical  straight  line  x = xq  ~ const 
onto  the  circle  \w\  = ex°  and  a horizontal  straight  line  y = yo  = const  onto  the  ray  arg  w = yo.  The  rectangle 
in  Fig.  385  is  mapped  onto  a region  bounded  by  circles  and  rays  as  shown. 

The  fundamental  region  — 77  < Arg  z = 77  of  ez  in  the  z-plane  is  mapped  bijectively  and  conformally  onto 
the  entire  w-plane  without  the  origin  w = 0 (because  ez  = 0 for  no  z).  Figure  386  shows  that  the  upper  half 
0 < y ^ 77  of  the  fundamental  region  is  mapped  onto  the  upper  half-plane  0 < arg  w ^ 77,  the  left  half  being 
mapped  inside  the  unit  disk  |w|  ^ 1 and  the  right  half  outside  (why?). 


y 

1 

0.5 

0 


0 1 x 


Fig.  385.  Mapping  by  w = ez 


y 


71 


0 X 


(z-plane) 


(u;-plane) 


Fig.  386.  Mapping  by  w = e' 
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EXAMPLE  5 Principle  of  Inverse  Mapping.  Mapping  w = Ln z 

Principle.  The  mapping  by  the  inverse  z =/-1(w)  of  w = f(z)  is  obtained  by  interchanging  the  roles  of  the 
Z-plane  and  the  w -plane  in  the  mapping  by  w = /(z). 

Now  the  principal  value  w = /(z)  = Ln  z of  the  natural  logarithm  has  the  inverse  z — /_1(w)  — ew.  From 
Example  4 (with  the  notations  z and  w interchanged!)  we  know  that/-1(w)  = ew  maps  the  fundamental  region 
of  the  exponential  function  onto  the  z-plane  without  z — 0 (because  ew  =£  0 for  every  w).  Hence  w = f(z)  = Ln  z 
maps  the  z-plane  without  the  origin  and  cut  along  the  negative  real  axis  (where  6 = Im  Ln  z jumps  by  27 r) 
conformally  onto  the  horizontal  strip  — tt  < v ^ 77  of  the  w-plane,  where  w = u + iv. 

Since  the  mapping  w = Ln  z + 277/  differs  from  w = Ln  z by  the  translation  277/  (vertically  upward),  this 
function  maps  the  z-plane  (cut  as  before  and  0 omitted)  onto  the  strip  77  < v ^ 377.  Similarly  for  each  of  the 
infinitely  many  mappings  w = ln  z — Ln  z ± 2/177/  ( n = 0,  1,  2,  • • • )•  The  corresponding  horizontal  strips  of  width 
277  (images  of  the  z-plane  under  these  mappings)  together  cover  the  whole  w-plane  without  overlapping. 


Magnification  Ratio.  By  the  definition  of  the  derivative  we  have 


(4) 


lim 

Z^Zq 


m -nz  o) 

Z~  Z0 


l/WI- 


Therefore,  the  mapping  w = f(z)  magnifies  (or  shortens)  the  lengths  of  short  lines  by 
approximately  the  factor  l/Lo)!-  The  image  of  a small  figure  conforms  to  the  original 
figure  in  the  sense  that  it  has  approximately  the  same  shape.  However,  since  / (z)  varies 
from  point  to  point,  a large  figure  may  have  an  image  whose  shape  is  quite  different  from 
that  of  the  original  figure. 

More  on  the  Condition  f'(z)  i1  0.  From  (4)  in  Sec.  13.4  and  the  Cauchy-Riemann 
equations  we  obtain 


(5') 


l/'(z)|2 


du  . dv 

2 

f du\2 

( dv \2  _ 

du  dv 

— + i — 
dx  dx 

= 

y + 

Kdx) 

dx  dy 

du  dv 
dy  dx 


that  is, 


(5) 


\Az)\z 


dll 

dx 

dv 

dx 


du 

dy  d(u,  v ) 

dv  d(x,  y)  ' 

dy 


This  determinant  is  the  so-called  Jacobian  (Sec.  10.3)  of  the  transformation  w = f(z) 
written  in  real  form  u = u(x,  y),  v = v(x,  y).  Hence  / (z0)  ¥=  0 implies  that  the  Jacobian 
is  not  0 at  z0-  This  condition  is  sufficient  that  the  mapping  w = f{z ) in  a sufficiently  small 
neighborhood  of  z0  is  one-to-one  or  injective  (different  points  have  different  images).  See 
Ref.  [GenRef4]  in  App.  1 . 


gRQB-L=EM=S^T—l^— 1 


1.  On  Fig.  378.  One  “rectangle”  and  its  image  are  colored. 
Identify  the  images  for  the  other  “rectangles.” 

2.  On  Example  1.  Verify  all  calculations. 

3.  Mapping  w = z3-  Draw  an  analog  of  Fig.  378  for 

3 

W = Z ■ 


4.  Conformality.  Why  do  the  images  of  the  straight  lines 
x = const  and  y = const  under  a mapping  by  an 
analytic  function  intersect  at  right  angles?  Same 
question  for  the  curves  |z|  = const  and  arg  z = const. 
Are  there  exceptional  points? 
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5.  Experiment  on  w = z.  Find  out  whether  vv  = z pre- 
serves angles  in  size  as  well  as  in  sense.  Try  to  prove 
your  result. 


6-9 


MAPPING  OF  CURVES 


Find  and  sketch  or  graph  the  images  of  the  given  curves 
under  the  given  mapping. 


6.  x =1,2,  3,  4,  y = 1,2,  3,  4,  w = z2 

7.  Rotation.  Curves  as  in  Prob.  6,  w = iz 


8.  Reflection  in  the  unit  circle.  |z|  = |,|,1,  2,  3, 
Arg  z = 0,  ±7t/4,  ±77/2,  ±3tt/2 


9.  Translation.  Curves  as  in  Prob.  6,  w = z + 2 + 2 


10.  CAS  EXPERIMENT.  Orthogonal  Nets.  Graph  the 
orthogonal  net  of  the  two  families  of  level  curves 
Re/(z)  = const  and  Im/(z)  = const,  where  (a)/(z)  = z4, 
(b)  /(z)  = 1/z,  (c)  /(z)  = 1/z2,  (d)  /(z)  = (z  + 0/ 
(1  + iz).  Why  do  these  curves  generally  intersect  at 
right  angles?  In  your  work,  experiment  to  get  the  best 
possible  graphs.  Also  do  the  same  for  other  functions 
of  your  own  choice.  Observe  and  record  shortcomings 
of  your  CAS  and  means  to  overcome  such  deficiencies. 


11-20  MAPPING  OF  REGIONS 

Sketch  or  graph  the  given  region  and  its  image  under  the 
given  mapping. 

11.  |z|  S |,  — 7t/8  < Arg  z < tt/8,  w = z2 

12.  1 < |z|  < 3,  0 < Arg  z < 7t/2,  w = z3 


13.  2 S Im  z S 5,  w = iz 

14.  x = 1,  w = 1/z 

15.  |z  - ||  S g,  w = 1/z 

16.  |z|  <|,  Imz  > 0,  w = 1/z 

17.  -Ln  2 S x S Ln  4,  w = ez 


18.  — 1 S x S 2,  —TT<y<iT,  w = ez 


19.  1 < |z|  < 4,  7t/4  < 6 S 37t/4,  w = Lnz 

20.  | S |z|  S 1,  0 S 6 < 7r /2,  tv  = Ln  z 


21-26  FAILURE  OF  CONFORMALITY 

Find  all  points  at  which  the  mapping  is  not  conformal.  Give 
reason. 

21.  A cubic  polynomial 

22.  z2  + 1/z2 

23. 


z + 2 


4z2  + 2 

24.  exp  (z5  - 80z) 

25.  cosh  z 

26.  sin  7rz 

27.  Magnification  of  Angles.  Let  /(z)  be  analytic  at  zo- 
Suppose  that/'(zo)  = 0,  ■ ■ • ,/<,£_1>(zo)  = 0-  Then  the 
mapping  w = /(z)  magnifies  angles  with  vertex  at  z0  by 
a factor  k.  Illustrate  this  with  examples  for  k = 2,  3,  4. 

28.  Prove  the  statement  in  Prob.  27  for  general  k = 1. 
2,  ■ ■ • . //mf.  Use  the  Taylor  series. 


29-35  MAGNIFICATION  RATIO,  JACOBIAN 

Find  the  magnification  ratio  M.  Describe  what  it  tells 
you  about  the  mapping.  Where  is  M = 1?  Find  the 
Jacobian  J. 

29.  w = |z2 

30.  tv  = z3 

31.  w = 1/z 

32.  tv  = 1/z2 

33.  w — ez 

z + 1 

34.  w = - 

2z  - 2 

35.  tv  = Ln  z 


17.2  Linear  Fractional  Transformations 
(Mobius  Transformations) 

Conformal  mappings  can  help  in  modeling  and  solving  boundary  value  problems  by  first 
mapping  regions  conformally  onto  another.  We  shall  explain  this  for  standard  regions 
(disks,  half-planes,  strips)  in  the  next  section.  For  this  it  is  useful  to  know  properties  of 
special  basic  mappings.  Accordingly,  let  us  begin  with  the  following  very  important  class. 

The  next  two  sections  discuss  linear  fractional  transformations.  The  reason  for  our 
thorough  study  is  that  such  transformations  are  useful  in  modeling  and  solving  boundary 
value  problems,  as  we  shall  see  in  Chapter  18.  The  task  is  to  get  a good  grasp  of  which 
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conformal  mappings  map  certain  regions  conformally  onto  each  other,  such  as,  say 
mapping  a disk  onto  a half-plane  (Sec.  17.3)  and  so  forth.  Indeed,  the  first  step  in  the 
modeling  process  of  solving  boundary  value  problems  is  to  identify  the  correct  conformal 
mapping  that  is  related  to  the  “geometry”  of  the  boundary  value  problem. 

The  following  class  of  conformal  mappings  is  very  important.  Linear  fractional 
transformations  (or  Mobius  transformations)  are  mappings 


(1) 


az  + b 
cz  + d 


(ad  — be  =/=  0) 


where  a,  b , c,  d are  complex  or  real  numbers.  Differentiation  gives 


(2) 


xv  = 


a(cz  + d)  — c(az  + b)  _ ad  — be 


(cz  + d)2 


(cz  + d)2 


This  motivates  our  requirement  ad  — be  ¥=  0.  It  implies  conformality  for  all  z.  and  excludes 
the  totally  uninteresting  case  xv  = 0 once  and  for  all.  Special  cases  of  (1)  are 


(3) 


xv  = z + b 

xv  = ciz  with  |a|  = 1 
xv  = az  + b 
xv  = l/z 


(Translations) 

(Rotations) 

(Linear  transformations) 
(Inversion  in  the  unit  circle). 


Properties  of  the  Inversion  w — 1/z  (Fig.  387) 

In  polar  forms  z = re16  and  xv  = Re'4'  the  inversion  w = l/z  is 

Rd4,  = — 7-  = —e~ie  and  gives  R = — , cj>  = —0. 
re  r r 

Hence  the  unit  circle  |z|  — r = 1 is  mapped  onto  the  unit  circle  |w|  = R = 1;  w = e4'  = e~w.  For  a general 
z the  image  w = l/z  can  be  found  geometrically  by  marking  |w|=i?  = l/ron  the  segment  from  0 to  z and 
then  reflecting  the  mark  in  the  real  axis.  (Make  a sketch.) 

Figure  387  shows  that  w = l/z  maps  horizontal  and  vertical  straight  lines  onto  circles  or  straight  lines.  Even 
the  following  is  true. 

xv  = l/z  maps  every  straight  line  or  circle  onto  a circle  or  straight  line. 


Fig.  387.  Mapping  (Inversion)  w = l/z 


744 


CHAP.  17  Conformal  Mapping 


THEOREM  1 


PROOF 


Proof.  Every  straight  line  or  circle  in  the  z-plane  can  be  written 

A(x2  + y2)  + Bx  + Cy  + D = 0 

A = 0 gives  a straight  line  and  A A 0 a circle.  In  terms  of  z and  Z this  equation  becomes 

Z + Z Z ~ Z 


(A.  B C,D  real). 


AZZ  + B 2 +C  2i 


D = 0. 


Now  w = \/z.  Substitution  of  z = I / w and  multiplication  by  ww  gives  the  equation 


W + W W — w 

A + B + C + Dww  = 0 

2 2 i 


or,  in  terms  of  u and  v. 


A + Bu  — Cv  + D(uz  + v2)  — 0. 


This  represents  a circle  (if  D A 0)  or  a straight  line  (if  D = 0)  in  the  w-plane. 


The  proof  in  this  example  suggests  the  use  of  z and  z instead  of  .r  and  y,  a general  principle 
that  is  often  quite  useful  in  practice. 

Surprisingly,  every  linear  fractional  transformation  has  the  property  just  proved: 


Circles  and  Straight  Lines 

Every  linear  fractional  transformation  (1)  maps  the  totality  of  circles  and  straight 
lines  in  the  z-plane  onto  the  totality  of  circles  and  straight  lines  in  the  w-plane. 


This  is  trivial  for  a translation  or  rotation,  fairly  obvious  for  a uniform  expansion  or 
contraction,  and  true  for  w = I /z,  as  just  proved.  Hence  it  also  holds  for  composites  of 
these  special  mappings.  Now  comes  the  key  idea  of  the  proof:  represent  (1)  in  terms  of 
these  special  mappings.  When  c = 0,  this  is  easy.  When  c 0,  the  representation  is 

rr  1 , a , r ad  - be 

w = K ; — j H — where  K = . 

cz  + d c c 

This  can  be  verified  by  substituting  K.  taking  the  common  denominator  and  simplifying; 
this  yields  (1).  We  can  now  set 

H’l  = CZ,  W2  = Wi  + d,  W3  = , VV'4  = K\V%, 

W 2 

and  see  from  the  previous  formula  that  then  w = uq  + a/c.  This  tells  us  that  (1)  is  indeed 
a composite  of  those  special  mappings  and  completes  the  proof. 


Extended  Complex  Plane 

The  extended  complex  plane  (the  complex  plane  together  with  the  point  °°  in  Sec.  16.2) 
can  now  be  motivated  even  more  naturally  by  linear  fractional  transformations  as  follows. 

To  each  j for  which  cz  + d ¥=  0 there  corresponds  a unique  w in  (1).  Now  let  c ¥=  0. 
Then  for  z = ~d/c  we  have  cz  + d = 0,  so  that  no  w corresponds  to  this  z.  This  suggests 
that  we  let  w = 00  be  the  image  of  z = —d/c. 
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Also,  the  inverse  mapping  of  (1)  is  obtained  by  solving  (1)  for  z;  this  gives  again  a 
linear  fractional  transformation 


(4) 


dw  — b 
—cw  + a 


When  c f 0,  then  cw  — a = 0 for  w = a/c,  and  we  let  a/c  be  the  image  of  z = 00 . With 
these  settings,  the  linear  fractional  transformation  (1)  is  now  a one-to-one  mapping  of  the 
extended  z-plane  onto  the  extended  vv-plane.  We  also  say  that  every  linear  fractional 
transformation  maps  “the  extended  complex  plane  in  a one-to-one  manner  onto  itself.” 
Our  discussion  suggests  the  following. 

General  Remark.  If  z = 00 , then  the  right  side  of  (1)  becomes  the  meaningless  expression 
(a  ■ 00  + b)/(c  -oo  + d).  We  assign  to  it  the  value  w = a/c  if  c A 0 and  w = 00  if  c = 0. 

Fixed  Points 

Fixed  points  of  a mapping  w = f(z ) are  points  that  are  mapped  onto  themselves,  are  “kept 
fixed”  under  the  mapping.  Thus  they  are  obtained  from 

W = /(z)  = z. 

The  identity  mapping  w = z has  every  point  as  a fixed  point.  The  mapping  w = z,  has 
infinitely  many  fixed  points,  w = 1/z  has  two,  a rotation  has  one,  and  a translation  none 
in  the  finite  plane.  (Find  them  in  each  case.)  For  (1),  the  fixed-point  condition  w = z is 

az  + b 9 

(5)  z = thus  cz  - (a  — d)z  ~ b = 0. 

cz  + d 

For  c =£  0 this  is  a quadratic  equation  in  z whose  coefficients  all  vanish  if  and  only  if  the 
mapping  is  the  identity  mapping  w = z (in  this  case,  a = d A 0,  b = c = 0).  Hence  we  have 


THEOREM  2 


Fixed  Points 

A linear  fractional  transformation,  not  the  identity,  has  at  most  two  fixed  points.  If 
a linear  fractional  transformation  is  known  to  have  three  or  more  fixed  points,  it  must 
be  the  identity  mapping  w = z. 


To  make  our  present  general  discussion  of  linear  fractional  transformations  even  more 
useful  from  a practical  point  of  view,  we  extend  it  by  further  facts  and  typical  examples, 
in  the  problem  set  as  well  as  in  the  next  section. 


P^RQBLEW-y ET==17^? 


1.  Verify  the  calculations  in  the  proof  of  Theorem  1, 
including  those  for  the  case  c = 0. 

2.  Composition  of  LFTs.  Show  that  substituting  a linear 
fractional  transformation  (LFT)  into  an  LFT  gives 
an  LFT. 


3.  Matrices.  If  you  are  familiar  with  2X2  matrices, 
prove  that  the  coefficient  matrices  of  (1)  and  (4)  are 
inverses  of  each  other,  provided  that  ad  — be  = 1,  and 
that  the  composition  of  LFTs  corresponds  to  the 
multiplication  of  the  coefficient  matrices. 
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4.  Fig.  387.  Find  the  image  of  x = k = const  under 
w = 1/z.  Hint.  Use  formulas  similar  to  those  in 
Example  1. 

5.  Inverse.  Derive  (4)  from  (1)  and  conversely. 

6.  Fixed  points.  Find  the  fixed  points  mentioned  in  the 
text  before  formula  (5). 


7-10  INVERSE 

Find  the  inverse  z = z(w).  Check  by  solving  z(w)  for  w. 
i 


7.  w = 


2z  ~ 1 


8.  w — 


z ~ i 
z + i 


9. 

10. 


w = 


w = 


z - i 
3iz  + 4 


-giz  - 1 


FIXED  POINTS 

fixed  points. 

11.  w = (a  + ib)z2 

12.  w = z — 3/ 

13.  w = 16z5 

14.  w = az  + b 

iz  + 4 


11-16 

Find  the 


aiz  — 1 

16.  w = , a A 1 

z + ai 


17-20 


FIXED  POINTS 


Find  all  LFTs  with  fixed  point(s). 

17.  z = 0 18.  z=±l 


19.  z = — i 


20.  Without  any  fixed  points 


17.3  Special  Linear  Fractional  Transformations 

We  continue  our  study  of  linear  fractional  transformations.  We  shall  identify  linear  fractional 
transformations 


(1) 


az  + b 

w = 

cz  + d 


0 ad  — be  0) 


that  map  certain  standard  domains  onto  others.  Theorem  1 (below)  will  give  us  a tool  for 
constructing  desired  linear  fractional  transformations. 

A mapping  ( 1 ) is  determined  by  a,  b,  c,  d,  actually  by  the  ratios  of  three  of  these  constants 
to  the  fourth  because  we  can  drop  or  introduce  a common  factor.  This  makes  it  plausible 
that  three  conditions  determine  a unique  mapping  (1): 


THEOREM  1 


Three  Points  and  Their  Images  Given 

Three  given  distinct  points  Z\,  Z2>  Z3  can  always  be  mapped  onto  three  prescribed 
distinct  points  w 1,  W2,  W3  by  one,  and  only  one,  linear  fractional  transformation 
w = f(z).  This  mapping  is  given  implicitly  by  the  equation 

W - Wi  W2  - w3  Z — Zl  Z2  ~ Z3 

(2)  = . 

w - w3  w2  - vtq  z - z3  z2  ~ Zi 

{If  one  of  these  points  is  the  point  the  quotient  of  the  two  differences  containing 
this  point  must  be  replaced  by  1.) 


PROOF  Equation  (2)  is  of  the  form  F{w)  = G(z)  with  linear  fractional  F and  G.  Hence 
w = F~1{G{z))  = f(z),  where  F~1  is  the  inverse  of  F and  is  linear  fractional  (see  (4)  in 
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Sec.  17.2)  and  so  is  the  composite  F~1(G(z ))  (by  Prob.  2 in  Sec.  17.2),  that  is,  w = f(z) 
is  linear  fractional.  Now  if  in  (2)  we  set  w = w ls  w2,  W3  on  the  left  and  z = Zi,  Z2,  73  on 
the  right,  we  see  that 

F(w\)  = 0,  F(w2)  = 1,  F(w3)  = 00 

G(zi)  = 0,  G(zz)  = 1.  G(z3)  = «. 

From  the  first  column,  F(w\)  = G(zi),  thus  wi  = F~x(Giz\))  = f(z  1).  Similarly,  w2  = f(z2), 
W3  = f(z.:>,)-  This  proves  the  existence  of  the  desired  linear  fractional  transformation. 

To  prove  uniqueness,  let  w = g(z)  be  a linear  fractional  transformation,  which  also 
maps  Zj  onto  Wj,  j = 1,  2,  3.  Thus  Wj  = g(zj).  Hence  g-1(wj)  = Zj,  where  Wj  = f(zj). 
Together,  g~1(  f(Zj))  = Zj,  a mapping  with  the  three  fixed  points  z\,  z2,  73.  By  Theorem  2 
in  Sec.  17.2,  this  is  the  identity  mapping,  g~1(f(z))  = 7 for  all  z.  Thus  /Tz)  = g(z)  for  all 
Z,  the  uniqueness. 

The  last  statement  of  Theorem  1 follows  from  the  General  Remark  in  Sec.  17.2. 

Mapping  of  Standard  Domains  by  Theorem  1 

Using  Theorem  1,  we  can  now  find  linear  fractional  transformations  of  some  practically 
useful  domains  (here  called  “standard  domains”)  according  to  the  following  principle. 

Principle.  Prescribe  three  boundary  points  z 1,  z2,  73  of  the  domain  D in  the  z-plane. 
Choose  their  images  w 1,  vi’2,  w'3  on  the  boundary  of  the  image  D*  of  I)  in  the  w-plane. 
Obtain  the  mapping  from  (2).  Make  sure  that  D is  mapped  onto  D*,  not  onto  its 
complement.  In  the  latter  case,  interchange  two  w-points.  (Why  does  this  help?) 


Fig.  388.  Linear  fractional  transformation  in  Example  1 


Mapping  of  a Half-Plane  onto  a Disk  (Fig.  388) 

Find  the  linear  fractional  transformation  (1)  that  maps  z\  — — 1,Z2  = 0,^3  = 1 onto  w>\  = — 1,  vt>2  = ~i, 
W3  = 1,  respectively. 

Solution.  From  (2)  we  obtain 

w - (-1)  ~i  - 1 = z ~ (-1)  0 ~ 1 

vp  1 ~i  - (-1)  z - 1 0 - (-1)  ’ 
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EXAMPLE  2 


EXAMPLE  3 


EXAMPLE  4 


thus 


z - i 

w = . 

—iz  + 1 

Let  us  show  that  we  can  determine  the  specific  properties  of  such  a mapping  without  much  calculation.  For 
Z = x we  have  w = (x  — i)/(—ix  + 1),  thus  \w\  = 1,  so  that  the  x-axis  maps  onto  the  unit  circle.  Since  z = i 
gives  w = 0,  the  upper  half-plane  maps  onto  the  interior  of  that  circle  and  the  lower  half-plane  onto  the  exterior. 
Z = 0,  i,  00  go  onto  w = ~i,  0,  i,  so  that  the  positive  imaginary  axis  maps  onto  the  segment  S:  u = 0,  — 1 ^ v = 1. 
The  vertical  lines  x = const  map  onto  circles  (by  Theorem  1,  Sec.  17.2)  through  w = i (the  image  of  z — °°)  and 
perpendicular  to  \w\  = 1 (by  conformality;  see  Fig.  388).  Similarly,  the  horizontal  lines  y = const  map  onto 
circles  through  w = i and  perpendicular  to  S (by  conformality).  Figure  388  gives  these  circles  for  y i?  0,  and  for 
y < 0 they  lie  outside  the  unit  disk  shown. 

Occurrence  of  °° 

Determine  the  linear  fractional  transformation  that  maps  zi  — 0,  Z2  ~ 1>Z3  = 00  onto  Wi  = — 1,  W2  = ~i, 
W3  = 1,  respectively. 

Solution.  From  (2)  we  obtain  the  desired  mapping 

z - i 

w = . 

z + i 

This  is  sometimes  called  the  Cayley  transformation .2  In  this  case,  (2)  gave  at  first  the  quotient  (1  — °°)/(z  — °°), 
which  we  had  to  replace  by  1 . 

Mapping  of  a Disk  onto  a Half-Plane 

Find  the  linear  fractional  transformation  that  maps  z\  — — 1,  Z2  = U Z 3 — 1 onto  wi  = 0,  W2  — i , W3  = °°, 
respectively,  such  that  the  unit  disk  is  mapped  onto  the  right  half-plane.  (Sketch  disk  and  half-plane.) 

Solution.  From  (2)  we  obtain,  after  replacing  (/  — °o)/(w  — 00)  by  1, 

Z+  1 

W = . 

Z - 1 

Mapping  half-planes  onto  half-planes  is  another  task  of  practical  interest.  For  instance, 
we  may  wish  to  map  the  upper  half-plane  y§0  onto  the  upper  half-plane  v § 0.  Then 
the  x-axis  must  be  mapped  onto  the  n-axis. 

Mapping  of  a Half-Plane  onto  a Half-Plane 

Find  the  linear  fractional  transformation  that  maps  ci  = —2,  z2  = 0,  z3  = 2 onto  w1  = h’2  4,  vv';<  |, 

respectively. 

Solution.  You  may  verify  that  (2)  gives  the  mapping  function 

z + 1 

W = 

2z  + 4 


What  is  the  image  of  the  x-axis?  Of  the  y-axis? 


Mappings  of  disks  onto  disks  is  a third  class  of  practical  problems.  We  may  readily 
verify  that  the  unit  disk  in  the  z-plane  is  mapped  onto  the  unit  disk  in  the  w-plane  by  the 
following  function,  which  maps  z0  onto  the  center  w = 0. 


2ARTHUR  CAYLEY  (1821-1895),  English  mathematician  and  professor  at  Cambridge,  is  known  for  his 
important  work  in  algebra,  matrix  theory,  and  differential  equations. 
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EXAMPLE  5 


EXAMPLE  6 


(3)  w = c = z0,  kol  < 1- 

v ' cz  ~ 1 

zo  as  in  (3), 


|l  — cz\  = \cz  — l|. 

Hence 

I w|  = \z  - Zo\/\cz  — 1 1 = 1 

from  (3),  so  that  |z|  = 1 maps  onto  w = 1,  as  claimed,  with  zo  going  onto  0,  as  the 
numerator  in  (3)  shows. 

Formula  (3)  is  illustrated  by  the  following  example.  Another  interesting  case  will  be 
given  in  Prob.  17  of  Sec.  18.2. 

Mapping  of  the  Unit  Disk  onto  the  Unit  Disk 

Taking  Zo  = I in  (3),  we  obtain  (verify!) 

2 z-  1 

w = (Fig.  389). 

z ~ 2 


To  see  this,  take  |z|  = 1,  obtaining,  with  c = 


zol  = lz  - C| 

= Izl  I Z - c| 

= \zz  - cz I = 


Mapping  of  an  Angular  Region  onto  the  Unit  Disk 

Certain  mapping  problems  can  be  solved  by  combining  linear  fractional  transformations  with  others.  For  instance, 
to  map  the  angular  region  D:  —Tr/b^  arg  z = tt/6  (Fig.  390)  onto  the  unit  disk  |w|  = 1,  we  may  map  D by 
Z = z6  onto  the  right  Z-half-plane  and  then  the  latter  onto  the  disk  |w|  ^ 1 by 

Z - 1 z3  ~ l 

w — i , combined  w = i . 

Z + 1 z3+  1 
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(z-plane)  (Z-plane)  (u;-plane) 

Fig.  390.  Mapping  in  Example  6 

This  is  the  end  of  our  discussion  of  linear  fractional  transformations.  In  the  next  section 
we  turn  to  conformal  mappings  by  other  analytic  functions  (sine,  cosine,  etc.). 


FRQBHEM 


1.  CAS  EXPERIMENT.  Linear  Fractional  Transfor- 
mations (LFTs).  (a)  Graph  typical  regions  (squares, 
disks,  etc.)  and  their  images  under  the  LFTs  in 
Examples  1-5  of  the  text. 

(b)  Make  an  experimental  study  of  the  continuous 
dependence  of  LFTs  on  their  coefficients.  For  instance, 
change  the  LFT  in  Example  4 continuously  and  graph 
the  changing  image  of  a fixed  region  (applying  animation 
if  available). 

2.  Inverse.  Find  the  inverse  of  the  mapping  in  Example  1. 
Show  that  under  that  inverse  the  lines  x = const  are 
the  images  of  circles  in  the  w-plane  with  centers  on  the 
line  o = l. 

3.  Inverse.  If  w = f(z)  is  any  transformation  that  has  an 
inverse,  prove  the  (trivial!)  fact  that /and  its  inverse 
have  the  same  fixed  points. 

4.  Obtain  the  mapping  in  Example  1 of  this  section  from 
Prob.  18  in  Problem  Set  17.2. 

5.  Derive  the  mapping  in  Example  2 from  (2). 

6.  Derive  the  mapping  in  Example  4 from  (2).  Find  its 
inverse  and  the  fixed  points. 

7.  Verify  the  formula  for  disks. 


8-16 


LFTs  FROM  THREE  POINTS  AND  IMAGES 


Find  the  LFT  that  maps  the  given  three  points  onto  the  three 
given  points  in  the  respective  order. 

8.  0,  1,  2 onto  1,  g,  | 

9.  1 , t,  — 1 onto  i,  — 1 , — i 

10.  0,  — i,  i onto  —1,  0,  a) 

11.  —1,0,  1 onto  — i,  — l,i 

12.  0,  2 1,  —2 i onto  —1,  0,  00 

13.  0,  1,  oo  onto  oo,  1,  0 

14.  —1,  0,  1 onto  1,  1 + i,  1 + 2 i 

15.  1,  t,  2 onto  0,  — i — 1,  — g 

16.  — |,  0,  1 onto  0,  |,  1 

17.  Find  an  LFT  that  maps  |z|  S 1 onto  |w|  S 1 so  that 
z — ijl  is  mapped  onto  w = 0.  Sketch  the  images  of 
the  lines  x = const  and  y = const. 


18.  Find  all  LFTs  w(z)  that  map  the  x-axis  onto  the  n-axis. 

19.  Find  an  analytic  function  w = f(z ) that  maps  the  region 
0 £ arg  z fi  77/4  onto  the  unit  disk  |w|  S 1. 

20.  Find  an  analytic  function  that  maps  the  second  quadrant 
of  the  z-plane  onto  the  interior  of  the  unit  circle  in  the 
w-plane. 


17.4  Conformal  Mapping  by  Other  Functions 

We  shall  now  cover  mappings  by  trigonometric  and  hyperbolic  analytic  functions.  So  far 
we  have  covered  the  mappings  by  zn  and  ez  (Sec.  17.1)  as  well  as  linear  fractional 
transformations  (Secs.  17.2  and  17.3). 


Sine  Function.  Figure  391  shows  the  mapping  by 
( 1 ) w = u + iv  = sin  z = sin x coshy  + i cos  x sinh  y 


(Sec.  13.6). 
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Fig.  391. 


(w-plane) 


Mapping  w = u + iv  = sinz 


Hence 

(2)  u — sin  x cosh  y,  v = cos  x sinh  y. 

Since  sin  z is  periodic  with  period  277,  the  mapping  is  certainly  not  one-to-one  if  we 
consider  it  in  the  full  z-plane.  We  restrict  z to  the  vertical  strip  S\  —§77  Si  x = §77  in 
Fig.  391.  Since f'(z)  = cos  7 = 0 at  7 = ±g77.  the  mapping  is  not  conformal  at  these  two 
critical  points.  We  claim  that  the  rectangular  net  of  straight  lines  x = const  and  y = const 
in  Fig.  391  is  mapped  onto  a net  in  the  w-plane  consisting  of  hyperbolas  (the  images  of 
the  vertical  lines  x = const)  and  ellipses  (the  images  of  the  horizontal  lines  y = const) 
intersecting  the  hyperbolas  at  right  angles  (conformality!).  Corresponding  calculations  are 
simple.  From  (2)  and  the  relations  sin  x + cos  x = I and  cosh  y — sinh  y = 1 we 
obtain 


U U 2 2 

— — — = cosh  y — sinh  y = 1 (Hyperbolas) 

sin2  x cos2  x 


U U 2 2 

-I = sin  x + cos  x = 1 (Ellipses). 

cosh2  y sinh2  y 

Exceptions  are  the  vertical  lines  x = — \ TTx  = \ 77,  which  are  “folded”  onto  u g — 1 and 
u § 1 (u  = 0),  respectively. 

Figure  392  illustrates  this  further.  The  upper  and  lower  sides  of  the  rectangle  are  mapped 
onto  semi-ellipses  and  the  vertical  sides  onto  —cosh  l^u§-l  and  I i « § COsh  1 
(v  = 0),  respectively.  An  application  to  a potential  problem  will  be  given  in  Prob.  3 of 
Sec.  18.2. 


Fig.  392.  Mapping  by  w = sin  z 
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Cosine  Function.  The  mapping  w = cos  z could  be  discussed  independently,  but  since 

(3)  w = cos  z = sin  (z  + ^77), 

we  see  at  once  that  this  is  the  same  mapping  as  sin  z preceded  by  a translation  to  the  right 
through  §77  units. 

Hyperbolic  Sine.  Since 

(4)  w = sinh  z = — i sin  (iz), 

the  mapping  is  a counterclockwise  rotation  Z = iz  through  §77  (i.e.,  90°),  followed  by  the 
sine  mapping  Z*  = sinZ,  followed  by  a clockwise  90°-rotation  w = —iZ*. 

Hyperbolic  Cosine.  This  function 

(5)  w = cosh  z = cos  (iz) 

defines  a mapping  that  is  a rotation  Z = iz  followed  by  the  mapping  w = cos  Z. 

Figure  393  shows  the  mapping  of  a semi-infinite  strip  onto  a half-plane  by  w = cosh  z. 
Since  cosh  0=1,  the  point  z = 0 is  mapped  onto  w = 1.  For  real  z = r § 0,  cosh  z is 
real  and  increases  with  increasing  x in  a monotone  fashion,  starting  from  1.  Hence  the 
positive  .r-axis  is  mapped  onto  the  portion  u §=  1 of  the  w-axis. 

For  pure  imaginary  z = iy  we  have  cosh  iy  = cos  y.  Hence  the  left  boundary  of  the  strip 
is  mapped  onto  the  segment  1 g m g -1  of  the  u-axis,  the  point  z = 77/  corresponding  to 

w = cosh  iir  = COS  77  = — 1. 

On  the  upper  boundary  of  the  strip,  y = 77,  and  since  sin  77  = 0 and  cos  77  = —1,  it 
follows  that  this  part  of  the  boundary  is  mapped  onto  the  portion  u 3=  — 1 of  the  w-axis. 
Hence  the  boundary  of  the  strip  is  mapped  onto  the  M-axis.  It  is  not  difficult  to  see  that 
the  interior  of  the  strip  is  mapped  onto  the  upper  half  of  the  w-plane,  and  the  mapping  is 
one-to-one. 

This  mapping  in  Fig.  393  has  applications  in  potential  theory,  as  we  shall  see  in  Prob.  12 
of  Sec.  18.3. 


y 

E 

n O- 


AO 


V 

B* 

A* 

0 

0 

-1  0 1 

Fig.  393  Mapping  by  w = cosh  z 


Tangent  Function.  Figure  394  shows  the  mapping  of  a vertical  infinite  strip  onto  the 
unit  circle  by  w = tan  z,  accomplished  in  three  steps  as  suggested  by  the  representation 
(Sec.  13.6) 

sin  z ( eiz  - e~iz)/i  (e2iz  - 1 )/i 

w = tan  z = = — = = — = = 


cos  z 


£iz  + e-iz 


e2iz  + 1 
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Hence  if  we  set  Z = e2lz  and  use  1 /i  = —i,  we  have 

(6)  w = tan  z = ~iW,  W = + , Z = e2*z. 


We  now  see  that  w = tan  z is  a linear  fractional  transformation  preceded  by  an  exponential 
mapping  (see  Sec.  17.1)  and  followed  by  a clockwise  rotation  through  an  angle  277(90°). 

The  strip  is  S : — 1 77  < jc  < 5 77,  and  we  show  that  it  is  mapped  onto  the  unit  disk  in 
the  w-plane.  Since  Z = e2lz  = e~2y  + 2lx > we  see  from  (10)  in  Sec.  13.5  that  |z|  = e~2y, 
ArgZ  = 2x.  Hence  the  vertical  lines  x = —77/4,0,77/4  are  mapped  onto  the  rays 
Arg  Z = —77/2,  0,  77/2,  respectively.  Hence  S is  mapped  onto  the  right  Z-half-plane.  Also 
|z|  = e~Zy  < I il/y  > 0 and  |z|  > 1 if  y < 0.  Hence  the  upper  half  of  S is  mapped  inside 
the  unit  circle  \Z\  = 1 and  the  lower  half  of  S outside  \Z\  = 1,  as  shown  in  Fig.  394. 

Now  comes  the  linear  fractional  transformation  in  (6),  which  we  denote  by  g{Z ): 


(7) 


W = g(Z) 


z - 1 
z + r 


For  real  Z this  is  real.  Hence  the  real  Z-axis  is  mapped  onto  the  real  VV-axis.  Furthermore, 
the  imaginary  Z-axis  is  mapped  onto  the  unit  circle  |VF|  = 1 because  for  pure  imaginary 
Z = iY  we  get  from  (7) 


\W\  = I g(iY)  | 


iY  — 1 
iY  + 1 


= 1. 


The  right  Z-half-plane  is  mapped  inside  this  unit  circle  |w|  = 1,  not  outside,  because 
Z = 1 has  its  image  g(l)  = 0 inside  that  circle.  Finally,  the  unit  circle  |z|  = 1 is  mapped 
onto  the  imaginary  W-axis,  because  this  circle  is  Z = e1'1',  so  that  (7)  gives  a pure  imaginary 
expression,  namely, 


* = e*  - 1 = e^'z  - e-i4>'2  = i sin  (c/>/2) 

8KC  ei4>  + 1 ei4,/2  + e~iHZ  cos  (<^/2)  ' 

From  the  IV-planc  we  get  to  the  w-plane  simply  by  a clockwise  rotation  through  77/2;  see  (6). 

Together  we  have  shown  that  w = tan  z maps  S:  —77/4  < Re 7 < 77/4  onto  the  unit 
disk  |w|  < 1,  with  the  four  quarters  of  S mapped  as  indicated  in  Fig.  394.  This  mapping 
is  conformal  and  one-to-one. 


(z-plane)  (Z-plane) 


(W-plane) 


Fig.  394.  Mapping  by  w = tan  z 


(w-plane) 
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FITQ~BH^=SET  ~T7V3 


CONFORMAL  MAPPING  w = ez 

1.  Find  the  image  of  x = c = const,  — 77  < y S 77,  under 

w = ez. 

2.  Find  the  image  of  y = k = const,  -co<jSoo,  under 
w = ez. 


3-7 


Find  and  sketch  the  image  of  the  given  region 
under  w = ez. 

3.  — | S jc  S g,  -77  S y S 77 

4.  0 < jc  < 1 , g < y < 1 

5.  —oo  < x < oo,  0 fi  y S 27 r 

6.  0 < jc  < °°,  0 < y < 7t/2 

7.  0 < x < 1 , 0 < y < tt 

8.  CAS  EXPERIMENT.  Conformal  Mapping.  If  your 
CAS  can  do  conformal  mapping,  use  it  to  solve  Prob.  7. 
Then  increase  y beyond  tt,  say,  to  5077  or  10077.  State 
what  you  expected.  See  what  you  get  as  the  image. 
Explain. 


CONFORMAL  MAPPING  w = sin  z 

9.  Find  the  points  at  which  w = sin  z is  not  conformal. 

10.  Sketch  or  graph  the  images  of  the  lines  x = 0,  ±tt/6, 
±tt/3,  ±7t/2  under  the  mapping  w = sinz. 


11-14 


Find  and  sketch  or  graph  the  image  of  the  given 


region  under  w = sin  z. 


11.  0 < jc  < 7t/2,  0 < y < 2 

12.  — 7t/4  < x < 7t/4,  0 < y < 1 

13.  0 < jc  < 277,  1 < y < 3 

14.  0 < jc  < 77/6,  -00  < y < 00 

15.  Describe  the  mapping  w = cosh  z in  terms  of  the  map- 
ping w = sin  z and  rotations  and  translations. 

16.  Find  all  points  at  which  the  mapping  w = cosh  277z  is 
not  conformal. 


17.  Find  an  analytic  function  that  maps  the  region  R 
bounded  by  the  positive  x-  and  y-semi-axes  and  the 
hyperbola  jcy  = 77  in  the  first  quadrant  onto  the  upper 
half-plane.  Hint.  First  map  R onto  a horizontal  strip. 


CONFORMAL  MAPPING  w = cos  z 

18.  Find  the  images  of  the  lines  y = k = const  under  the 
mapping  w = cos  z. 

19.  Find  the  images  of  the  lines  jc  = c = const  under  the 
mapping  w = cos  z. 


20-23  Find  and  sketch  or  graph  the  image  of  the  given 
region  under  the  mapping  w = cos  z. 

20.  0 < jc  < 277,  2 < y < 1 

21.  0 < jc  < 77/2,  0 < y < 2 directly  and  from  Prob.  1 1 

22.  -1  < jc  < 1,  0 S y S 1 

23.  77  < jc  < 277,  y < 0 

24.  Find  and  sketch  the  image  of  the  region  2 S |z|  S3, 
77/4  S 6 S 77/2  under  the  mapping  w = Ln  z. 

z — 1 

25.  Show  that  w = Ln maps  the  upper  half-plane 

z + 1 

onto  the  horizontal  strip  0 S Im  w S 77  as  shown  in 
the  figure. 


T I 

1 l 

hT 

U.  f° 

CD 

M 

C* 

D*( =0)  E*=A* 

B*(oo) 

— 6 — 

0 

(w- plane) 

Problem  25 


17.5  Riemann  Surfaces.  Optional 

One  of  the  simplest  but  most  ingeneous  ideas  in  complex  analysis  is  that  of  Riemann 
surfaces.  They  allow  multivalued  relations,  such  as  w = Vz  or  w = ln z,  to  become 
single-valued  and  therefore  functions  in  the  usual  sense.  This  works  because  the  Riemann 
surfaces  consist  of  several  sheets  that  are  connected  at  certain  points  (called  branch  points). 
Thus  w = Vz  will  need  two  sheets,  being  single-valued  on  each  sheet.  How  many  sheets 
do  you  think  w = lnz  needs?  Can  you  guess,  by  recalling  Sec.  13.7?  (The  answer  will 
be  given  at  the  end  of  this  section).  Let  us  start  our  systematic  discussion. 

The  mapping  given  by 


w = u + iv 


z 


2 


(1) 


(Sec.  17.1) 


SEC.  17.5  Riemann  Surfaces.  Optional 
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is  conformal,  except  at  z = 0,  where  w = 2z.  = 0,  At  z = 0,  angles  are  doubled  under 
the  mapping.  Thus  the  right  z-half-plane  (including  the  positive  y-axis)  is  mapped  onto 
the  full  w-plane,  cut  along  the  negative  half  of  the  M-axis;  this  mapping  is  one-to-one. 
Similarly  for  the  left  z-half-plane  (including  the  negative  v-axis).  Hence  the  image  of  the 
full  z-plane  under  w = z “covers  the  w-plane  twice”  in  the  sense  that  every  w i=-  0 is  the 
image  of  two  z-points;  if  zi  is  one,  the  other  is  — zi-  For  example,  z = i and  — i are  both 
mapped  onto  w = — 1. 

Now  comes  the  crucial  idea.  We  place  those  two  copies  of  the  cut  w-plane  upon  each 
other  so  that  the  upper  sheet  is  the  image  of  the  right  half  z-plane  R and  the  lower  sheet 
is  the  image  of  the  left  half  z-plane  L.  We  join  the  two  sheets  crosswise  along  the  cuts 
(along  the  negative  w-axis)  so  that  if  z moves  from  R to  L.  its  image  can  move  from  the 
upper  to  the  lower  sheet.  The  two  origins  are  fastened  together  because  w = 0 is  the  image 
of  just  one  z-point,  z = 0.  The  surface  obtained  is  called  a Riemann  surface  (Fig.  395a). 
w = 0 is  called  a “winding  point"  or  branch  point,  w = z maps  the  full  z-plane  onto 
this  surface  in  a one-to-one  manner. 

By  interchanging  the  roles  of  the  variables  z and  w it  follows  that  the  double-valued 
relation 

(2)  w = Vz  (Sec.  13.2) 

becomes  single-valued  on  the  Riemann  surface  in  Fig.  395a,  that  is,  a function  in  the  usual 
sense.  We  can  let  the  upper  sheet  correspond  to  the  principal  value  of  Vz.  Its  image  is 
the  right  w-half-plane.  The  other  sheet  is  then  mapped  onto  the  left  w-half-plane. 


(a)  Riemann  surface  of  Vz~  (b)  Riemann  surface  of  V" 

Fig.  395.  Riemann  surfaces 


Similarly,  the  triple-valued  relation  w = Vz  becomes  single-valued  on  the  three-sheeted 
Riemann  surface  in  Fig.  395b,  which  also  has  a branch  point  at  z = 0. 

The  infinitely  many-valued  natural  logarithm  (Sec.  13.7) 

w = In  z = Ln  z + 2mri  (n  = 0,  ±1,  ±2,  • • • ) 

becomes  single-valued  on  a Riemann  surface  consisting  of  infinitely  many  sheets, 
w = Ln  z corresponds  to  one  of  them.  This  sheet  is  cut  along  the  negative  x-axis  and  the 
upper  edge  of  the  slit  is  joined  to  the  lower  edge  of  the  next  sheet,  which  corresponds  to 
the  argument  77  < 6 2=  377,  that  is,  to 

w = Ln  z + 277/. 

The  principal  value  Ln  z maps  its  sheet  onto  the  horizontal  strip  -77  < 0 § 77.  The 
function  w = Ln  z + 277/  maps  its  sheet  onto  the  neighboring  strip  77  < v Si  377,  and  so 
on.  The  mapping  of  the  points  z # 0 of  the  Riemann  surface  onto  the  points  of  the  w-plane 
is  one-to-one.  See  also  Example  5 in  Sec.  17.1. 
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CHAP.  17  Conformal  Mapping 


1.  If  z moves  from  z — \ twice  around  the  circle  |z|  = \ , 
what  does  w = Vz  do? 

2.  Show  that  the  Riemann  surface  of  w = 

V(z  — l)(z  — 2)  has  branch  points  at  1 and  2 sheets, 
which  we  may  cut  and  join  crosswise  from  1 to  2. 
7/z'rzZ.  Introduce  polar  coordinates  z — 1 = rj/*1  and 
z — 2 = r2eie 2,  so  that  w = e*(fll+fl2)/2. 

3.  Make  a sketch,  similar  to  Fig.  395,  of  the  Riemann 
surface  of  w = tfz  + 1. 


4-10 


RIEMANN  SURFACES 


Find  the  branch  points  and  the  number  of  sheets  of  the 
Riemann  surface. 


4.  Vz'z  — 2 + i 

5.  z2  + ^4z  + z 

6.  In  (6z  — 2 z) 

<1 

1 

M 

O 

8.  eVz,  V? 

9.  Vz3  + z 

10.  V(4  - z2)(l  - z2) 

S T I O N S AND  PROBLEMS 


1.  What  is  a conformal  mapping?  Why  does  it  occur  in 
complex  analysis? 

2.  At  what  points  are  w = z5  — z and  w = cos  (7rz2)  not 
conformal? 

3.  What  happens  to  angles  at  z0  under  a mapping  w = /(z) 

if/W  = 0,/"(zo)  = 0,  f"\z 0)  + 0? 

4.  What  is  a linear  fractional  transformation?  What  can 
you  do  with  it?  List  special  cases. 

5.  What  is  the  extended  complex  plane?  Ways  of  intro- 
ducing it? 

6.  What  is  a fixed  point  of  a mapping?  Its  role  in  this 
chapter?  Give  examples. 

7.  How  would  you  find  the  image  of  x = Re  z = 1 under 

w = iz,  z2,  ez,  1/z? 

8.  Can  you  remember  mapping  properties  of  w = In  z? 

9.  What  mapping  gave  the  loukowski  airfoil?  Explain 
details. 

10.  What  is  a Riemann  surface?  Its  motivation?  Its  simplest 
example. 


MAPPING  w = z2 

Find  and  sketch  the  image  of  the  given  region  or  curve 
under  w = z2. 
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11.  1 < |z|  < 2,  | arg  z | < 7t/8 

12.  I/V7F  < |z|  < V7 r,  0 < argz  < tr/2 

13.  -4  < xy  < 4 14.  0 < y < 2 


15.  x = -1,  1 16.  y = -2,  2 
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MAPPING  w = 1/z 


Find  and  sketch  the  image  of  the  given  region  or  curve 
under  w = 1/z. 


17.  |z|  < 1 

18.  |z|  <1,  0 < arg  z < 7t/2 

19.  2 < |z|  < 3,  y > 0 20.  0 arg  z £ 7T/4 


21.  (x-§)2  + y2  = |,  y > 0 

22.  z = 1 + iy  (— 00  < y < 00) 
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LINEAR  FRACTIONAL 
TRANSFORMATIONS  (LFTs) 


Find  the  LFT  that  maps 

23.  —1,  0,  1 onto  4 + 3z,  Si/2,  4 — 3z,  respectively 

24.  0,  2,  4 onto  °°,  g,  5,  respectively 

25.  1,  z,  — i onto  i,  —1,  1,  respectively 

26.  0,  1,  2 onto  2 i,  1 + 2 z,  2 + 2z,  respectively 

27.  0,  1,  00  onto  00,  1,0,  respectively 

28.  — 1,  — z,  i onto  1 — i,  2,  0,  respectively 
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FIXED  POINTS 


Find  the  fixed  points  of  the  mapping 

29.  w = (2  + z)z  30.  w = z4  + z — 64 

31.  w = (3z  + 2)/(z  - 1)  32.  (2 iz  ~ l)/(z  + 2 z) 

33.  w = z5  + 10z3  + lOz 

34.  w = (iz  + 5)/(5z  + z) 


35-40 


GIVEN  REGIONS 


Find  an  analytic  function  w = /(z)  that  maps 

35.  The  infinite  strip  0 < y < 7t/4  onto  the  upper  half- 
plane v > 0. 


36.  The  quarter-disk  |z|  < 1,  jc  > 0,  y > 0 onto  the  exterior 
of  the  unit  circle  |w|  = 1. 


37.  The  sector  0 < arg  z < 7t/2  onto  the  region  u < 1. 

38.  The  interior  of  the  unit  circle  |z|  = 1 onto  the  exterior 
of  the  circle  |w  + 2|  =2. 


39.  The  region  x > 0,y  > 0,  xy  < c onto  the  strip  0 < 

v < 1. 

40.  The  semi-disk  |z|  < 2,  y > 0 onto  the  exterior  of  the 
circle  \w  — tt\  = 7T. 


Summary  of  Chapter  17 
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SUMMARY  OF  CHAPTER  17 

Conformal  Mapping 


A complex  function  w = f(z)  gives  a mapping  of  its  domain  of  definition  in  the 
complex  '-plane  onto  its  range  of  values  in  the  complex  w-plane.  If/(z)  is  analytic, 
this  mapping  is  conformal,  that  is,  angle-preserving:  the  images  of  any  two 
intersecting  curves  make  the  same  angle  of  intersection,  in  both  magnitude  and  sense, 
as  the  curves  themselves  (Sec.  17.1).  Exceptions  are  the  points  at  which  / (z)  = 0 
(“critical  points,”  e.g.  z = 0 for  w = z2). 

For  mapping  properties  of  ez,  cos  z,  sin  z etc.  see  Secs.  17.1  and  17.4. 

Linear  fractional  transformations,  also  called  Mobius  transformations 

az  + b 

(1)  w = (Secs.  17.2,  17.3) 

cz  + d 

(i ad  — be  A 0)  map  the  extended  complex  plane  (Sec.  17.2)  onto  itself.  They  solve 
the  problems  of  mapping  half-planes  onto  half-planes  or  disks,  and  disks  onto  disks 
or  half-planes.  Prescribing  the  images  of  three  points  determines  (1)  uniquely. 

Riemann  surfaces  (Sec.  17.5)  consist  of  several  sheets  connected  at  certain  points 
called  branch  points.  On  them,  multivalued  relations  become  single-valued,  that  is, 
functions  in  the  usual  sense.  Examples.  For  w = Vz  we  need  two  sheets  (with  branch 
point  0)  since  this  relation  is  doubly- valued.  For  w = In  z we  need  infinitely  many 
sheets  since  this  relation  is  infinitely  many-valued  (see  Sec.  13.7). 


CHAPTER 


Complex  Analysis 
and  Potential  Theory 


In  Chapter  17  we  developed  the  geometric  approach  of  conformal  mapping.  This  meant 
that,  for  a complex  analytic  function  w = f(z)  defined  in  a domain  I)  of  the  z-plane,  we 
associated  with  each  point  in  D a corresponding  point  in  the  w-plane.  This  gave  us  a 
conformal  mapping  (angle-preserving),  except  at  critical  points  where  / (z)  = 0. 

Now,  in  this  chapter,  we  shall  apply  conformal  mappings  to  potential  problems.  This 
will  lead  to  boundary  value  problems  and  many  engineering  applications  in  electrostatics, 
heat  flow,  and  fluid  flow.  More  details  are  as  follows. 

Recall  that  Laplace’s  equation  V2<I>  = 0 is  one  of  the  most  important  PDEs  in 
engineering  mathematics  because  it  occurs  in  gravitation  (Secs.  9.7,  12.11),  electrostatics 
(Sec.  9.7),  steady-state  heat  conduction  (Sec.  12.5),  incompressible  fluid  flow,  and  other 
areas.  The  theory  of  this  equation  is  called  potential  theory  (although  “potential”  is  also 
used  in  a more  general  sense  in  connection  with  gradients  (see  Sec.  9.7)).  Because  we 
want  to  treat  this  equation  with  complex  analytic  methods,  we  restrict  our  discussion  to 
the  “two-dimensional  case.”  Then  (I>  depends  only  on  two  Cartesian  coordinates  x and  y, 
and  Laplace’s  equation  becomes 


V2<1>  = + <f>yy  = 0. 

An  important  idea  then  is  that  its  solutions  <J>  are  closely  related  to  complex  analytic 
functions  (1>  + AT  as  shown  in  Sec.  13.4.  ( Remark : We  use  the  notation  (I>  + AT  to  free 
u and  v,  which  will  be  needed  in  conformal  mapping  u + iv .)  This  important  relation  is 
the  main  reason  for  using  complex  analysis  in  problems  of  physics  and  engineering. 

We  shall  examine  this  connection  between  Laplace’s  equation  and  complex  analytic 
functions  and  illustrate  it  by  modeling  applications  from  electrostatics  (Secs.  18.1, 
18.2),  heat  conduction  (Sec.  18.3),  and  hydrodynamics  (Sec.  18.4).  This  in  turn  will 
lead  to  boundary  value  problems  in  two-dimensional  potential  theory.  As  a result, 
some  of  the  functions  of  Chap.  17  will  be  used  to  transform  complicated  regions  into 
simpler  ones. 

Section  18.5  will  derive  the  important  Poisson  formula  for  potentials  in  a circular  disk. 
Section  18.6  will  deal  with  harmonic  functions,  which,  as  you  recall,  are  solutions  of 
Laplace’s  equation  and  have  continuous  second  partial  derivatives.  In  that  section  we  will 
show  how  results  on  analytic  functions  can  be  used  to  characterize  properties  of  harmonic 
functions. 

Prerequisite:  Chaps.  13,  14,  17. 

References  and  Answers  to  Problems:  App.  1 Part  D,  App.  2. 
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SEC.  18.1  Electrostatic  Fields 
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18.1  Electrostatic  Fields 


The  electrical  force  of  attraction  or  repulsion  between  charged  particles  is  governed  by 
Coulomb’s  law  (see  Sec.  9.7).  This  force  is  the  gradient  of  a function  (I> , called  the 
electrostatic  potential.  At  any  points  free  of  charges,  is  a solution  of  Laplace’s  equation 

V2d>  = 0. 


Fig.  396.  Potential 
in  Example  1 


The  surfaces  (T>  = const  are  called  equipotential  surfaces.  At  each  point  P at  which 
the  gradient  of  (T>  is  not  the  zero  vector,  it  is  perpendicular  to  the  surface  (b  = const 
through  P;  that  is,  the  electrical  force  has  the  direction  perpendicular  to  the  equipotential 
surface.  (See  also  Secs.  9.7  and  12.11.) 

The  problems  we  shall  discuss  in  this  entire  chapter  are  two-dimensional  (for  the 
reason  just  given  in  the  chapter  opening),  that  is,  they  model  physical  systems  that  lie 
in  three-dimensional  space  (of  course!),  but  are  such  that  the  potential  (T>  is  independent 
of  one  of  the  space  coordinates,  so  that  <b  depends  only  on  two  coordinates,  which  we 
call  x and  y.  Then  Laplace’s  equation  becomes 


(1) 


d2$  d2d> 
dxz  dyz 


= 0. 


Equipotential  surfaces  now  appear  as  equipotential  lines  (curves)  in  the  xy-plane. 
Let  us  illustrate  these  ideas  by  a few  simple  examples. 


EXAMPLE  1 Potential  Between  Parallel  Plates 

Find  the  potential  of  the  field  between  two  parallel  conducting  plates  extending  to  infinity  (Fig.  396),  which 
are  kept  at  potentials  and  <l>2,  respectively. 

Solution.  From  the  shape  of  the  plates  it  follows  that  depends  only  on  x,  and  Laplace’s  equation  becomes 
<t>"  =0.  By  integrating  twice  we  obtain  = ax  + b,  where  the  constants  a and  b are  determined  by  the  given 
boundary  values  of  on  the  plates.  For  example,  if  the  plates  correspond  to  x = — 1 and  x = 1 , the  solution  is 

= l(CI>2  - 4>l)x  + 2(<I>2  + $1). 

The  equipotential  surfaces  are  parallel  planes. 


EXAMPLE  2 Potential  Between  Coaxial  Cylinders 

Find  the  potential  between  two  coaxial  conducting  cylinders  extending  to  infinity  on  both  ends  (Fig.  397) 
and  kept  at  potentials  and  <I>2>  respectively. 

Solution.  Here  depends  only  on  r=  Vx2  + y2,  for  reasons  of  symmetry,  and  Laplace’s  equation 
rUfr  + rur  + uee  = 0 [(5),  Sec.  12.10]  with  uee  = 0 and  u = becomes  r<&"  + = 0.  By  separating 

variables  and  integrating  we  obtain 

c£"  l a 

— , = , In  cFr  = —In  r + tf,  <f>—  — , <£>=  a \nr  + b 

r r 


and  a and  b are  determined  by  the  given  values  of  <f>  on  the  cylinders.  Although  no  infinitely  extended  conductors 
exist,  the  field  in  our  idealized  conductor  will  approximate  the  field  in  a long  finite  conductor  in  that  part  which 
is  far  away  from  the  two  ends  of  the  cylinders. 
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CHAP.  18  Complex  Analysis  and  Potential  Theory 


EXAMPLE  3 


Potential  in  an  Angular  Region 


Fig.  398.  Potential 
in  Example  3 


Find  the  potential  d>  between  the  conducting  plates  in  Fig.  398,  which  are  kept  at  potentials  d>i  (the  lower  plate) 
and  d>2,  and  make  an  angle  a,  where  0 < a ^ 77.  (In  the  figure  we  have  a = 120°  = 2*77/3.) 

Solution.  6 = Arg  z {z  = x + iy  0)  is  constant  on  rays  6 = const.  It  is  harmonic  since  it  is  the  imaginary 
part  of  an  analytic  function,  Ln  z (Sec.  13.7).  Hence  the  solution  is 

<F(x,  y)  = a + b Arg  z 

with  a and  b determined  from  the  two  boundary  conditions  (given  values  on  the  plates) 
a + b(—\a)  = a + b(^a)  = d>2- 

Thus  a = (d>2  + d>i)/2,  b = (d>2  — d>i )/a.  The  answer  is 

1 1 y 

d>(x,  y)  = 2 (d>2  + d>i)  + — (d>2  — ^i)  0,  6 = arctan  — . 


Complex  Potential 

Let  (T>  (x,  y)  be  harmonic  in  some  domain  D and  '•V(x,  y ) a harmonic  conjugate  of  (h  in  D. 
(Note  the  change  of  notation  from  u and  v of  Sec.  13.4  to  (b  and  ML  From  the  next  section 
on,  we  had  to  free  u and  v for  use  in  conformal  mapping.  Then 

(2)  F(z)  = ‘Ffe  y)  + i'V  (x,  y) 


is  an  analytic  function  of  z = x + iy.  This  function  F is  called  the  complex  potential 
corresponding  to  the  real  potential  (1> . Recall  from  Sec.  13.4  that  for  given  (f> , a conjugate 
'l'  is  uniquely  determined  except  for  an  additive  real  constant.  Hence  we  may  say  the 
complex  potential,  without  causing  misunderstandings. 

The  use  of  F has  two  advantages,  a technical  one  and  a physical  one.  Technically,  F 
is  easier  to  handle  than  real  or  imaginary  parts,  in  connection  with  methods  of  complex 
analysis.  Physically,  "T  has  a meaning.  By  conformality,  the  curves  4'  = const  intersect 
the  equipotential  lines  <J>  = const  in  the  xy-plane  at  right  angles  [except  where  F (z)  = 0]. 
Hence  they  have  the  direction  of  the  electrical  force  and,  therefore,  are  called  lines 
of  force.  They  are  the  paths  of  moving  charged  particles  (electrons  in  an  electron 
microscope,  etc.). 


SEC.  18.1  Electrostatic  Fields 
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EXAMPLE  4 


EXAMPLE  5 


EXAMPLE  6 


EXAMPLE  7 


Complex  Potential 

In  Example  1,  a conjugate  is  M?  = ay.  It  follows  that  the  complex  potential  is 

F(z ) = az  + b = ax + b + iay, 

and  the  lines  of  force  are  horizontal  straight  lines  y = const  parallel  to  the  x-axis. 

Complex  Potential 

In  Example  2 we  have  = a In  r + b — a In  \z\  + b.  A conjugate  is  = a Arg  z-  Hence  the  complex 
potential  is 

F(z ) = a Ln  z + b 

and  the  lines  of  force  are  straight  lines  through  the  origin.  Ffe)  may  also  be  interpreted  as  the  complex  potential 
of  a source  line  (a  wire  perpendicular  to  the  xy-plane)  whose  trace  in  the  xy-plane  is  the  origin. 

Complex  Potential 

In  Example  3 we  get  F(z ) by  noting  that  i Ln  z = i ln  \z\  ~ Arg  z,  multiplying  this  by  — b,  and  adding  a\ 

F(z)  = a — ib  Ln  z — a + b Arg  z ~ ib  ln  \z\. 

We  see  from  this  that  the  lines  of  force  are  concentric  circles  \z\  = const.  Can  you  sketch  them? 


Superposition 

More  complicated  potentials  can  often  be  obtained  by  superposition. 

Potential  of  a Pair  of  Source  Lines  (a  Pair  of  Charged  Wires) 

Determine  the  potential  of  a pair  of  oppositely  charged  source  lines  of  the  same  strength  at  the  points  z — c and 
Z — — c on  the  real  axis. 

Solution.  From  Examples  2 and  5 it  follows  that  the  potential  of  each  of  the  source  lines  is 

= K ln  \z  — c\  and  = — K ln  | z + c|, 

respectively.  Here  the  real  constant  K measures  the  strength  (amount  of  charge).  These  are  the  real  parts  of  the 
complex  potentials 

Fx(z)  = K Ln  (z  ~ c)  and  F2(z)  = -A^Ln  (z  + c ). 

Hence  the  complex  potential  of  the  combination  of  the  two  source  lines  is 

(3)  F(z)  = Fife)  + F2(z)  = K [Ln  fe  - c)  - Ln  fe  + c)] . 

The  equipotential  lines  are  the  curves 


- Re  Ffe)  = K ln 


z + c 


= const. 


thus 


z + c 


These  are  circles,  as  you  may  show  by  direct  calculation.  The  lines  of  force  are 

'L  = Im  F(z)  = AT[Arg  fe  — c)  — Arg  fe  + c)]  = const. 
We  write  this  briefly  (Fig.  399) 


= const. 


M?  = K(d  i — 62)  — const. 
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CHAP.  18  Complex  Analysis  and  Potential  Theory 


Now  6 1 — 62  is  the  angle  between  the  line  segments  from  z to  c and  — c (Fig.  399).  Hence  the  lines  of  force 
are  the  curves  along  each  of  which  the  line  segment  S':  — c ^ x c appears  under  a constant  angle.  These  curves 
are  the  totality  of  circular  arcs  over  S,  as  is  (or  should  be)  known  from  elementary  geometry.  Hence  the  lines 
of  force  are  circles.  Figure  400  shows  some  of  them  together  with  some  equipotential  lines. 

In  addition  to  the  interpretation  as  the  potential  of  two  source  lines,  this  potential  could  also  be  thought  of  as 
the  potential  between  two  circular  cylinders  whose  axes  are  parallel  but  do  not  coincide,  or  as  the  potential 
between  two  equal  cylinders  that  lie  outside  each  other,  or  as  the  potential  between  a cylinder  and  a plane  wall. 
Explain  this  using  Fig.  400. 


The  idea  of  the  complex  potential  as  just  explained  is  the  key  to  a close  relation  of  potential 
theory  to  complex  analysis  and  will  recur  in  heat  flow  and  fluid  flow. 


Fig.  399.  Arguments  in  Example  7 


Fig.  400.  Equipotential  lines  and  lines 
of  force  (dashed)  in  Example  7 


PROBLEM  SET  181 


1-4 


COAXIAL  CYLINDERS 


Find  and  sketch  the  potential  between  two  coaxial  cylinders 
of  radii  r\  and  ;-2  having  potential  U\  and  l/2 , respectively. 


1.  ri  = 2.5  mm,  r 2 — 4.0  cm,  U\  = 0 V, 

U2  = 220  V 

2.  n = 1 cm,  r2  = 2 cm,  (/1  = 400  V,  U2  = 0 V 

3.  r1  = 10  cm,  r2  — 1 m,  C/j  = 10  kV, 

U2  = -10  kV 


4.  If  r1  = 2 cm,  r2  = 6 cm  and  £/i  = 300  V,  U2  = 100  V, 
respectively,  is  the  potential  at  r = 4 cm  equal  to 
200  V?  Less?  More?  Answer  without  calculation.  Then 
calculate  and  explain. 


5-7 


PARALLEL  PLATES 


Find  and  sketch  the  potential  between  the  parallel  plates 
having  potentials  U\  and  U2 • Find  the  complex  potential. 


5.  Plates  at  xi  = —5cm,  x2  = 5cm,  potentials  U\  = 
250  V,  t/2  = 500  V,  respectively. 


6.  Plates  at  y = x and  y = x + k,  potentials  U\  = 0 V, 
t/2  = 220  V,  respectively. 


7.  Plates  at  xi  = 12  cm,  x2  = 24  cm,  potentials  U\  = 
20  kV,  U2  = 8 kV,  respectively. 


8.  CAS  EXPERIMENT.  Complex  Potentials.  Graph 
the  equipotential  lines  and  lines  of  force  in  (a)-(d)  (four 


graphs.  Re  F( z)  and  Im  F(z ) on  the  same  axes).  Then 
explore  further  complex  potentials  of  your  choice  with 
the  purpose  of  discovering  configurations  that  might 
be  of  practical  interest. 

(a)  F(z)  = z2  (b)  F(z)  = iz2 

(c)  F(z)  = 1/ z (d)  F(z)  = i/z 

9.  Argument.  Show  that  <t>  = 6/tt  = (1/tt)  arctan  (y/x) 
is  harmonic  in  the  upper  half-plane  and  satisfies  the 
boundary  condition  ct>(x,  0)  = 1 if  x < 0 and  0 if 
x > 0,  and  the  corresponding  complex  potential  is 
F(z)  = — (t/7T)  Ln  z. 

10.  Conformal  mapping.  Map  the  upper  z-half-plane 

onto  | w | S 1 so  that  0,  00,  —1  are  mapped  onto  1,  i,  — i, 
respectively.  What  are  the  boundary  conditions  on 
| w|  = 1 resulting  from  the  potential  in  Prob.  9?  What 
is  the  potential  at  w = 0? 

11.  Text  Example  7.  Verify,  by  calculation,  that  the  equipo- 
tential lines  are  circles. 


12-15 


OTHER  CONFIGURATIONS 


12.  Find  and  sketch  the  potential  between  the  axes 
(potential  500  V)  and  the  hyperbola  .xy  = 4 (potential 
100  V). 
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13.  Arccos.  Show  that  F(z ) = arccos  z (defined  in  Problem 
Set  13.7)  gives  the  potential  of  a slit  in  Fig.  401. 

y 


14.  Arccos.  Show  that  F(z)  in  Prob.  13  gives  the  potentials 
in  Fig.  402. 


y i y 


Fig.  402.  Other  apertures 


15.  Sector.  Find  the  real  and  complex  potentials  in  the 
sector  — 7t/6  S 8 S 7t/6  between  the  boundary  8 = 
±7t/6,  kept  at  0 V,  and  the  curve  x 3 — 3xyz  = 1,  kept 
at  220  V. 


18.2  Use  of  Conformal  Mapping.  Modeling 

We  have  just  explored  the  close  relation  between  potential  theory  and  complex  analysis. 
This  relationship  is  so  close  because  complex  potentials  can  be  modeled  in  complex 
analysis.  In  this  section  we  shall  explore  the  close  relation  that  results  from  the  use  of 
conformal  mapping  in  modeling  and  solving  boundary  value  problems  for  the  Laplace 
equation.  The  process  consists  of  finding  a solution  of  the  equation  in  some  domain, 
assuming  given  values  on  the  boundary  ( Dirichlet  problem , see  also  Sec.  12.6).  The  key 
idea  is  then  to  use  conformal  mapping  to  map  a given  domain  onto  one  for  which  the 
solution  is  known  or  can  be  found  more  easily.  This  solution  thus  obtained  is  then  mapped 
back  to  the  given  domain.  The  reason  this  approach  works  is  due  to  Theorem  1,  which 
asserts  that  harmonic  functions  remain  harmonic  under  conformal  mapping: 


THEOREM  1 


Harmonic  Functions  Under  Conformal  Mapping 

Let  <E>*  be  harmonic  in  a domain  D*  in  the  w-plane.  Suppose  that  w = u + iv  = f(z ) 
is  analytic  in  a domain  D in  the  z-plane  and  maps  D conformally  onto  D* . Then 
the  function 

(1)  <D(x,y)  = ®*(u(x,y),v(x,y)) 

is  harmonic  in  D. 


PROOF  The  composite  of  analytic  functions  is  analytic,  as  follows  from  the  chain  rule.  Hence,  taking 
a harmonic  conjugate  '•V*(u,  v)  of  <!>*,  as  defined  in  Sec.  13.4,  and  forming  the  analytic 
function  F*(w ) = <£>*(u,  v)  + i'V'fic  v ) we  conclude  that  F(z)  = F*(f(z))  is  analytic  in  D. 
Hence  its  real  part  <T>  (x,  y)  = Re  F(z)  is  harmonic  in  D.  This  completes  the  proof. 

We  mention  without  proof  that  if  D*  is  simply  connected  (Sec.  14.2),  then  a harmonic 
conjugate  of  <J>*  exists.  Another  proof  of  Theorem  1 without  the  use  of  a harmonic 
conjugate  is  given  in  App.  4. 
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EXAMPLE  1 


Potential  Between  Noncoaxial  Cylinders 

Model  the  electrostatic  potential  between  the  cylinders  C\.  \z\  — 1 and  C2'.  \z  ~ ||  = § in  Fig.  403.  Then  give 
the  solution  for  the  case  that  C 1 is  grounded,  U\  = 0 V,  and  C2  has  the  potential  U2  — 110  V. 

Solution.  We  map  the  unit  disk  \z\  = 1 onto  the  unit  disk  |w|  = 1 in  such  a way  that  C2  is  mapped  onto 
some  cylinder  Cj:  |w|  = r$.  By  (3),  Sec.  17.3,  a linear  fractional  transformation  mapping  the  unit  disk  onto  the 
unit  disk  is 


where  we  have  chosen  b = zo  real  without  restriction,  zo  is  of  no  immediate  help  here  because  centers  of  circles 
do  not  map  onto  centers  of  the  images,  in  general.  However,  we  now  have  two  free  constants  b and  r$  and  shall 
succeed  by  imposing  two  reasonable  conditions,  namely,  that  0 and  f (Fig.  403)  should  be  mapped  onto  r$  and 
t~Q  (Fig.  404),  respectively.  This  gives  by  (2) 


ro  = 


0-1 


and  with  this. 


~ro  ~ 


5 -b 
4b/5  - 1 


4 

5 


ro 


4r0/5  - 1 ’ 


a quadratic  equation  in  ro  with  solutions  ro  = 2 (no  good  because  r©  < 1)  and  r0  = |.  Hence  our  mapping 
function  (2)  with  b = \ becomes  that  in  Example  5 of  Sec.  17.3, 


(3) 


w = f(z)  = 


2 z ~ 1 
z - 2 


From  Example  5 in  Sec.  18.1,  writing  w for  z we  have  as  the  complex  potential  in  the  w-plane  the  function 
F*(w)  = a Ln  w + k and  from  this  the  real  potential 

<F*  (m,  v ) = Re  F*  (w)  — a ln  \w\  + k. 

This  is  our  model.  We  now  determines  and  k from  the  boundary  conditions.  If  | w|  = 1,  then  4>*  — a ln  1 + k = 0, 
hence  k = 0.  If  \w\  = r©  = then  = a ln  (^)  = 110,  hence  a = 110/ln  (|)  = —158.7.  Substitution  of  (3) 
now  gives  the  desired  solution  in  the  given  domain  in  the  4-plane 

* 1 
m = F*(f(z))  = a Ln 


The  real  potential  is 


<F(x,  y ) = Re  F(z)  = a ln 


2z  - 1 

z — 2 


a = -158.7. 


Can  we  “see”  this  result?  Well,  4>(x,  y)  = const  if  and  only  if  |(2 z ~ 1 )/(z  — 2)|  = const,  that  is,  |w|  = const 
by  (2)  with  b = \ . These  circles  are  images  of  circles  in  the  z-plane  because  the  inverse  of  a linear  fractional 
transformation  is  linear  fractional  (see  (4),  Sec.  17.2),  and  any  such  mapping  maps  circles  onto  circles  (or  straight 
lines),  by  Theorem  1 in  Sec.  17.2.  Similarly  for  the  rays  arg  w = const.  Hence  the  equipotential  lines 
<F(x,  y)  = const  are  circles,  and  the  lines  of  force  are  circular  arcs  (dashed  in  Fig.  404).  These  two  families  of 
curves  intersect  orthogonally,  that  is,  at  right  angles,  as  shown  in  Fig.  404. 
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EXAMPLE  2 


Potential  Between  Two  Semicircular  Plates 

Model  the  potential  between  two  semicircular  plates  Pi  and  P 2 in  Fig.  405  having  potentials  —3000  V and 
3000  V,  respectively.  Use  Example  3 in  Sec.  18.1  and  conformal  mapping. 

Solution.  Step  1.  We  map  the  unit  disk  in  Fig.  405  onto  the  right  half  of  the  w-plane  (Fig.  406)  by  using 
the  linear  fractional  transformation  in  Example  3,  Sec.  17.3: 

1 + z 

w = f(z)  = . 

1 - z 


The  boundary  \z\  — 1 is  mapped  onto  the  boundary  u = 0 (the  u-axis),  with  z — — 1, 1,  1 going  onto  w = 0,  i,  °°, 
respectively,  and  z — ~i  onto  w = —i.  Hence  the  upper  semicircle  of  |z|  — 1 is  mapped  onto  the  upper  half, 
and  the  lower  semicircle  onto  the  lower  half  of  the  u-axis,  so  that  the  boundary  conditions  in  the  w-plane  are 
as  indicated  in  Fig.  406. 

Step  2.  We  determine  the  potential  <F*(m,  v ) in  the  right  half-plane  of  the  w-plane.  Example  3 in  Sec.  18.1  with 
a = 77,  Ui  = —3000,  and  U2  = 3000  [with  <J>*(w,  v ) instead  of  <E>(x,  y)]  yields 

_ 6000  v 

<P ' (u,  v ) = (p , cp  = arctan  — . 

77  U 

On  the  positive  half  of  the  imaginary  axis  (cp  = 77/2),  this  equals  3000  and  on  the  negative  half  —3000,  as  it 
should  be.  <f>*  is  the  real  part  of  the  complex  potential 

* 6000  i 

F*(w)  = Ln  w. 

v ' 77 


Step  3.  We  substitute  the  mapping  function  into  F*  to  get  the  complex  potential  F(z ) in  Fig.  405  in  the  form 


F(z)  = F*(f(z. ))  = - 


6000  / 
77 


Ln 


1 + z 

1 - Z ■ 


The  real  part  of  this  is  the  potential  we  wanted  to  determine: 


^ x 6000  1 + z 

<f>  (x,  y)  = Re  F(z)  — ^ Im  Ln  ^ _ 


6000 


1 + z 
1 - z‘ 


As  in  Example  1 we  conclude  that  the  equipotential  lines  <F(x,  y)  = const  are  circular  arcs  because  they 
correspond  to  Arg  [(1  + z)/{  1 — z)]  — const,  hence  to  Arg  w = const.  Also,  Arg  w = const  are  rays  from  0 
to  00,  the  images  of  z — — 1 and  z = 1,  respectively.  Hence  the  equipotential  lines  all  have  —1  and  1 (the 
points  where  the  boundary  potential  jumps)  as  their  endpoints  (Fig.  405).  The  lines  of  force  are  circular  arcs, 
too,  and  since  they  must  be  orthogonal  to  the  equipotential  lines,  their  centers  can  be  obtained  as  intersections 
of  tangents  to  the  unit  circle  with  the  x-axis,  (Explain!) 


Further  examples  can  easily  be  constructed.  Just  take  any  mapping  w = f(z)  in  Chap.  17, 
a domain  D in  the  z-plane,  its  image  D*  in  the  w-plane,  and  a potential  O*  in  D*.  Then  (1) 
gives  a potential  in  D.  Make  up  some  examples  of  your  own,  involving,  for  instance, 
linear  fractional  transformations. 
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Basic  Comment  on  Modeling 

We  formulated  the  examples  in  this  section  as  models  on  the  electrostatic  potential.  It  is 
quite  important  to  realize  that  this  is  accidental.  We  could  equally  well  have  phrased 
everything  in  terms  of  (time-independent)  heat  flow;  then  instead  of  voltages  we  would 
have  had  temperatures,  the  equipotential  lines  would  have  become  isotherms  (=  lines  of 
constant  temperature),  and  the  lines  of  the  electrical  force  would  have  become  lines  along 
which  heat  flows  from  higher  to  lower  temperatures  (more  on  this  in  the  next  section). 
Or  we  could  have  talked  about  fluid  flow;  then  the  electrostatic  lines  of  force  would  have 
become  streamlines  (more  on  this  in  Sec.  18.4).  What  we  again  see  here  is  the  unifying 
power  of  mathematics:  different  phenomena  and  systems  from  different  areas  in  physics 
having  the  same  types  of  model  can  be  treated  by  the  same  mathematical  methods.  What 
differs  from  area  to  area  is  just  the  kinds  of  problems  that  are  of  practical  interest. 


1.  Derivation  of  (3)  from  (2).  Verify  the  steps. 

2.  Second  proof.  Give  the  details  of  the  steps  given  on 
p.  A93  of  the  book.  What  is  the  point  of  that  proof? 

APPLICATION  OF  THEOREM  1 

3.  Find  the  potential  <I>  in  the  region  R in  the  first  quadrant 
of  the  z-plane  bounded  by  the  axes  (having  potential 
U\)  and  the  hyperbola  _y  = 1/jc  (having  potential  I/2) 
by  mapping  R onto  a suitable  infinite  strip.  Show  that 
<I>  is  harmonic.  What  are  its  boundary  values? 

4.  Let  <b*  = 4 uv,  w = /(z ) = ez,  and  D:  x < 0, 
0 < y < 7 t.  Find  <b.  What  are  its  boundary  values? 

5.  CAS  PROJECT.  Graphing  Potential  Fields. 
Graph  equipotential  lines  (a)  in  Example  1 of  the  text, 

(b)  if  the  complex  potential  is  F(z ) = z2,  iz 2,  ez. 

(c)  Graph  the  equipotential  surfaces  for  F(z)  = Ln  z as 
cylinders  in  space. 

6.  Apply  Theorem  1 to  <b*(«,  v)  — u2  — v2,  w = 
f(z)  — ez,  and  any  domain  D,  showing  that  the  resulting 
potential  <I>  is  harmonic. 

7.  Rectangle,  sin  z.  Let  ft  0 S r S j7T,  0 S y S 1;  D* 
the  image  of  D under  w = sin  z;  and  <!>*  = u2  — v2. 
What  is  the  corresponding  potential  <t>  in  D1  What  are 
its  boundary  values?  Sketch  D and  D*. 

8.  Conjugate  potential.  What  happens  in  Prob.  7 if  you 
replace  the  potential  by  its  conjugate  harmonic? 

9.  Translation.  What  happens  in  Prob.  7 if  we  replace 
sin  z by  cos  z = sin  (z  + §7 r)? 

10.  Noncoaxial  Cylinders.  Find  the  potential  between 
the  cylinders  Ci:  |z|  = 1 (potential  U\  = 0)  and 
C2:  |z  — c|  = c (potential  U2  = 220  V),  where 
0 < c < \ . Sketch  or  graph  equipotential  lines  and 
their  orthogonal  trajectories  for  c = 3 . Can  you  guess 
how  the  graph  changes  if  you  increase  c (<  g)? 


11.  On  Example  2.  Verify  the  calculations. 

12.  Show  that  in  Example  2 the  y-axis  is  mapped  onto  the 
unit  circle  in  the  w-plane. 

13.  At  z = ± 1 in  Fig.  405  the  tangents  to  the  equipotential 
lines  as  shown  make  equal  angles.  Why? 

14.  Figure  405  gives  the  impression  that  the  potential  on 
the  v-axis  changes  more  rapidly  near  0 than  near  ±i. 
Can  you  verify  this? 

15.  Angular  region.  By  applying  a suitable  conformal 
mapping,  obtain  from  Fig.  406  the  potential  <t>  in  the 
sector  —37 t < Arg  z < 377  such  that  <t>  = — 3 kV  if 
Arg  z = — 377  and  <b  = 3 kV  if  Arg  z = 477. 

16.  Solve  Prob.  15  if  the  sector  is  — §77  < Arg  z < §77. 

17.  Another  extension  of  Example  2.  Find  the  linear 
fractional  transformation  z = g(Z)  that  maps  |Z|  S 1 
onto  |z|  S 1 with  Z = Z/2  being  mapped  onto  z = 0. 
Show  that  Zi  = 0.6  + 0.8/  is  mapped  onto  z = — 1 
and  Z2  = —0.6  + 0.8/  onto  z = 1,  so  that  the 
equipotential  lines  of  Example  2 look  in  |Z|  S 1 as 
shown  in  Fig.  407. 


Fig.  407  Problem  17 


SEC.  18.3  Heat  Problems 
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18.  The  equipotential  lines  in  Prob.  17  are  circles.  Why? 

19.  Jump  on  the  boundary.  Find  the  complex  and  real 
potentials  in  the  upper  half-plane  with  boundary  values 
5 kV  if  x < 2 and  0 if  x > 2 on  the  x-axis. 


20.  Jumps.  Do  the  same  task  as  in  Prob.  19  if  the  boundary 
values  on  the  x-axis  are  Vo  when  —a<x<a  and  0 
elsewhere. 


18.  Heat  Problems 


Heat  conduction  in  a body  of  homogeneous  material  is  modeled  by  the  heat  equation 

Tt  = c2V2T 

where  the  function  T is  temperature,  Tt  = dT/dt,  t is  time,  and  c2  is  a positive  constant 
(specific  to  the  material  of  the  body;  see  Sec.  12.6). 

Now  if  a heat  flow  problem  is  steady,  that  is,  independent  of  time,  we  have  Tt  = 0.  If 
it  is  also  two-dimensional,  then  the  heat  equation  reduces  to 

(1)  V2r  = Txx  + Tyy  = 0, 

which  is  the  two-dimensional  Laplace  equation.  Thus  we  have  shown  that  we  can  model 
a two-dimensional  steady  heat  flow  problem  by  Laplace’s  equation. 

Furthermore  we  can  treat  this  heat  flow  problem  by  methods  of  complex  analysis,  since 
T (or  T{x,  v))  is  the  real  part  of  the  complex  heat  potential 

F(z)  = T(x,  y ) + i'V(x,  y ). 

We  call  T(x,  y)  the  heat  potential.  The  curves  7’(x,  y)  = const  are  called  isotherms,  which 
means  lines  of  constant  temperature.  The  curves  'P  (x,  y)  = const  are  called  heat  flow 
lines  because  heat  flows  along  them  from  higher  temperatures  to  lower  temperatures. 

It  follows  that  all  the  examples  considered  so  far  (Secs.  18.1,  18.2)  can  now  be 
reinterpreted  as  problems  on  heat  flow.  The  electrostatic  equipotential  lines  <J>(x,  y)  = const 
now  become  isotherms  T{x,y)  = const,  and  the  lines  of  electrical  force  become  lines  of 
heat  flow,  as  in  the  following  two  problems. 

EXAM  Temperature  Between  Parallel  Plates 

Find  the  temperature  between  two  parallel  plates  x = 0 and  x = d in  Fig.  408  having  temperatures  0 and  100°C, 
respectively. 

Solution.  As  in  Example  1 of  Sec.  18.1  we  conclude  that  T(x,  y)  = ax  + b.  From  the  boundary  conditions, 
b = 0 and  a = 100 /d.  The  answer  is 


T(x,y)  =—x[°C]. 
a 

The  corresponding  complex  potential  is  F{z)  — (100 /d)z.  Heat  flows  horizontally  in  the  negative  x-direction 
along  the  lines  y = const. 

EXAMPLE  2 Temperature  Distribution  Between  a Wire  and  a Cylinder 

Find  the  temperature  field  around  a long  thin  wire  of  radius  = 1 mm  that  is  electrically  heated  to  7i  = 500°F 
and  is  surrounded  by  a circular  cylinder  of  radius  — 100  mm,  which  is  kept  at  temperature  = 60°F  by 
cooling  it  with  air.  See  Fig.  409.  (The  wire  is  at  the  origin  of  the  coordinate  system.) 
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EXAMPLE  3 


Solution.  T depends  only  on  r,  for  reasons  of  symmetry.  Hence,  as  in  Sec.  18.1  (Example  2), 

T(x,  y)  = a In  r + b. 


The  boundary  conditions  are 

71  = 500  = a In  1 + b,  T2  = 60  = a In  100  + b. 

Hence  b = 500  (since  In  1 = 0)  and  a = (60  — £>)/ln  100  = —95.54.  The  answer  is 

T( x,y)  = 500  - 95.54  In  r[°F], 

The  isotherms  are  concentric  circles.  Heat  flows  from  the  wire  radially  outward  to  the  cylinder.  Sketch  T as  a 
function  of  r.  Does  it  look  physically  reasonable? 


Fig.  410.  Example  3 


Mathematically  the  calculations  remain  the  same  in  the  transition  to  another  field  of 
application.  Physically,  new  problems  may  arise,  with  boundary  conditions  that  would 
make  no  sense  physically  or  would  be  of  no  practical  interest.  This  is  illustrated  by  the 
next  two  examples. 

A Mixed  Boundary  Value  Problem 

Find  the  temperature  distribution  in  the  region  in  Fig.  410  (cross  section  of  a solid  quarter-cylinder),  whose 
vertical  portion  of  the  boundary  is  at  20°C,  the  horizontal  portion  at  50°C,  and  the  circular  portion  is  insulated. 

Solution.  The  insulated  portion  of  the  boundary  must  be  a heat  flow  line,  since,  by  the  insulation,  heat  is 
prevented  from  crossing  such  a curve,  hence  heat  must  flow  along  the  curve.  Thus  the  isotherms  must  meet 
such  a curve  at  right  angles.  Since  T is  constant  along  an  isotherm,  this  means  that 

dT 

(2)  — = 0 along  an  insulated  portion  of  the  boundary. 

dn 

Here  dT/dn  is  the  normal  derivative  of  T,  that  is,  the  directional  derivative  (Sec.  9.7)  in  the  direction  normal 
(perpendicular)  to  the  insulated  boundary.  Such  a problem  in  which  T is  prescribed  on  one  portion  of  the  boundary 
and  dT/dn  on  the  other  portion  is  called  a mixed  boundary  value  problem. 

In  our  case,  the  normal  direction  to  the  insulated  circular  boundary  curve  is  the  radial  direction  toward  the 
origin.  Hence  (2)  becomes  dT/dr  = 0,  meaning  that  along  this  curve  the  solution  must  not  depend  on  r.  Now 
Argz  — 6 satisfies  (1),  as  well  as  this  condition,  and  is  constant  (0  and  tt/2)  on  the  straight  portions  of  the 
boundary.  Hence  the  solution  is  of  the  form 


T(x,  y)  — ad  + b. 

The  boundary  conditions  yield  a • tt/2  + b = 20  and  a • 0 + b = 50.  This  gives 

60  y 

T(x,  y)  = 50 6,  6 = arctan  — . 

v TT  X 
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The  isotherms  are  portions  of  rays  6 = const.  Heat  flows  from  the  x-axis  along  circles  r = const  (dashed  in 
Fig.  410)  to  the  y-axis. 


EXAMPLE  4 Another  Mixed  Boundary  Value  Problem  in  Heat  Conduction 

Find  the  temperature  field  in  the  upper  half-plane  when  the  x-axis  is  kept  at  T — 0°C  for  x < — 1,  is  insulated 
for  —1  < x < 1,  and  is  kept  at  T = 20°C  for  x > 1 (Fig.  411). 

Solution.  We  map  the  half-plane  in  Fig.  411  onto  the  vertical  strip  in  Fig.  412,  find  the  temperature  T*  ( u , v) 
there,  and  map  it  back  to  get  the  temperature  T (. x , y)  in  the  half-plane. 

The  idea  of  using  that  strip  is  suggested  by  Fig.  391  in  Sec.  17.4  with  the  roles  of  z — x + iy  and  w = u + iv 
interchanged.  The  figure  shows  that  z — sin  w maps  our  present  strip  onto  our  half-plane  in  Fig.  411.  Hence  the 
inverse  function 

w = f(z ) = arcsin  z 

maps  that  half-plane  onto  the  strip  in  the  w-plane.  This  is  the  mapping  function  that  we  need  according  to 
Theorem  1 in  Sec.  18.2. 

The  insulated  segment  — 1 < x < 1 on  the  x-axis  maps  onto  the  segment  — 77/2  < u < 77/2  on  the  w-axis. 
The  rest  of  the  x-axis  maps  onto  the  two  vertical  boundary  portions  u = —77 / 2 and  7t/2,  v > 0,  of  the  strip. 
This  gives  the  transformed  boundary  conditions  in  Fig.  412  for  T*(m,  v ),  where  on  the  insulated  horizontal 
boundary,  dT*/dn  = dT*/dv  = 0 because  v is  a coordinate  normal  to  that  segment. 

Similarly  to  Example  1 we  obtain 


T*(m,  v)  = 10  H u 

v ' 77 

which  satisfies  all  the  boundary  conditions.  This  is  the  real  part  of  the  complex  potential  F*(w)  = 10  + (20/ 77)  w. 
Hence  the  complex  potential  in  the  z-plane  is 

* 20 
F(z ) = F*(/(z))  = 10  + — arcsin  z 

and  T(x,  y)  = Re  F(z)  is  the  solution.  The  isotherms  are  u = const  in  the  strip  and  the  hyperbolas  in  the  z-plane, 
perpendicular  to  which  heat  flows  along  the  dashed  ellipses  from  the  20°-portion  to  the  cooler  0°-portion  of  the 
boundary,  a physically  very  reasonable  result. 

Sections  18.3  and  18.5  show  some  of  the  usefulness  of  conformal  mappings  and  complex 
potentials.  Furthermore,  complex  potential  models  fluid  flow  in  Sec.  18.4. 


gRQ^BXE-W=S^T-18^ 


1.  Parallel  plates.  Find  the  temperature  between  the 
plates  y — 0 and  y = d kept  at  20  and  100°C,  respec- 
tively. (i)  Proceed  directly,  (ii)  Use  Example  1 and  a 
suitable  mapping. 


2.  Infinite  plate.  Find  the  temperature  and  the  complex 
potential  in  an  infinite  plate  with  edges  y = x — 4 and 
y = x + 4keptat—  20  and  40°C,  respectively  (Fig.  413). 
In  what  case  will  this  be  an  approximate  model? 
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Fig.  413.  Problem  2:  Infinite  plate 


3.  CAS  PROJECT.  Isotherms.  Graph  isotherms  and 
lines  of  heat  flow  in  Examples  2-4.  Can  you  see  from 
the  graphs  where  the  heat  flow  is  very  rapid? 


TEMPERATURE  T (x,  y)  IN  PLATES 

Find  the  temperature  distribution  T (x,  y)  and  the  complex 
potential  F(z)  in  the  given  thin  metal  plate  whose  faces 
are  insulated  and  whose  edges  are  kept  at  the  indicated 
temperatures  or  are  insulated  as  shown. 


4-18 


T*  = T 


T*  = T 


9. 


T = 0 


10. 


b 

— o-  — 

rji rji  rji  rji  rji  rji 


11. 


T = 0 T = 100°C  T = 0 


Hint.  Apply  w — cosh  z to  Prob.  1 1 . 

14. 


17.  First  quadrant  of  the  z-plane  with  y-axis  kept  at  100°C, 
the  segment  0 < x < 1 of  the  x-axis  insulated  and  the 
x-axis  for  x > 1 kept  at  200°C.  Hint.  Use  Example  4. 

18.  Figure  410,  T( 0,  y)  = -30°C,  T(x,  0)  = 100°C 

19.  Interpretation.  Formulate  Prob.  1 1 in  terms  of  electro- 
statics. 

20.  Interpretation.  Interpret  Prob.  17  in  Sec.  18.2  as  a heat 
problem,  with  boundary  temperatures,  say,  10°C  on  the 
upper  part  and  200°C  on  the  lower. 


T=  0 


T*  = T, 
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8/  Fluid  Flow 

Laplace’s  equation  also  plays  a basic  role  in  hydrodynamics,  in  steady  nonviscous  fluid 
flow  under  physical  conditions  discussed  later  in  this  section.  For  methods  of  complex 
analysis  to  be  applicable,  our  problems  will  be  two-dimensional,  so  that  the  velocity  vector 
V by  which  the  motion  of  the  fluid  can  be  given  depends  only  on  two  space  variables  x 
and  y,  and  the  motion  is  the  same  in  all  planes  parallel  to  the  xy-plane. 

Then  we  can  use  for  the  velocity  vector  V a complex  function 

(1)  V=V1  + iV2 

giving  the  magnitude  | V\  and  direction  Arg  V of  the  velocity  at  each  point  z = x + iy. 
Here  V±  and  V2  are  the  components  of  the  velocity  in  the  x and  y directions.  V is  tangential 
to  the  path  of  the  moving  particles,  called  a streamline  of  the  motion  (Fig.  414). 

We  show  that  under  suitable  assumptions  (explained  in  detail  following  the  examples), 
for  a given  flow  there  exists  an  analytic  function 

(2)  F(z ) = dKx,  y)  + i'l'ix,  y), 

called  the  complex  potential  of  the  flow,  such  that  the  streamlines  are  given  by 
’'I'  (x,  y)  = const,  and  the  velocity  vector  or,  briefly,  the  velocity  is  given  by 


(3)  V = Vi  + iV2  = F’{z) 


Fig.  414.  Velocity 


where  the  bar  denotes  the  complex  conjugate.  "'I'  is  called  the  stream  function.  The 
function  <£>  is  called  the  velocity  potential.  The  curves  <T>(x,  y)  = const  are  called 
equipotential  lines.  The  velocity  vector  V is  the  gradient  of  <£>;  by  definition,  this 
means  that 


(4) 


V,  = 


dd> 

dx 


Vo  = 


dd> 

dy 


Indeed,  for  F = <f>  + ixV,  Eq.  (4)  in  Sec.  13.4  is  F'  = + ixVx  with  ^ x = — by 

the  second  Cauchy-Riemann  equation.  Together  we  obtain  (3): 


F\z)  = <bx-  = <f>x  + id>y  = Vl  + iv2  = V. 
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EXAMPLE  1 


EXAMPLE  2 


Furthermore,  since  F(z ) is  analytic,  <F  and  \P'  satisfy  Laplace’s  equation 


(5) 


V2<F  = 


d2d> 

dx2 


+ 


d2d> 

dy2 


= 0, 


= 


d2Ap 


+ 


dxz 


52\p 

dy2 


= 0. 


Whereas  in  electrostatics  the  boundaries  (conducting  plates)  are  equipotential  lines,  in 
fluid  flow  the  boundaries  across  which  fluid  cannot  flow  must  be  streamlines.  Hence  in 
fluid  flow  the  stream  function  is  of  particular  importance. 

Before  discussing  the  conditions  for  the  validity  of  the  statements  involving  (2)-(5),  let 
us  consider  two  flows  of  practical  interest,  so  that  we  first  see  what  is  going  on  from  a 
practical  point  of  view.  Further  flows  are  included  in  the  problem  set. 

Flow  Around  a Corner 

The  complex  potential  F(z)  = :2  = xz  — y2  + 2ixy  models  a flow  with 

Equipotential  lines  <f>  = x2  — v2  = const  (Hyperbolas) 

Streamlines  Mf  = 2xy  = const  (Hyperbolas). 

From  (3)  we  obtain  the  velocity  vector 

V = 2z  = 2(x  — iy ),  that  is,  V]  = 2x,  V2  = ~2y. 

The  speed  (magnitude  of  the  velocity)  is 

M = Vvf  + v\  = 2Vx2  + yz. 

The  flow  may  be  interpreted  as  the  flow  in  a channel  bounded  by  the  positive  coordinates  axes  and  a hyperbola, 
say,  xy  = 1 (Fig.  415).  We  note  that  the  speed  along  a streamline  S has  a minimum  at  the  point  P where  the 
cross  section  of  the  channel  is  large. 


Fig.  415.  Flow  around  a corner  (Example  1) 

Flow  Around  a Cylinder 

Consider  the  complex  potential 

F(z)  = <f>(x,  y)  + y)  = z + T . 

Using  the  polar  form  z = re‘e,  we  obtain 

F(z)  = + — g_lft  es  (r  + cos  8 + i (r  — — ^ sin  6. 

Hence  the  streamlines  are 


V(x,y)  = 


r-‘ 


sin  6 = const. 
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THEOREM  1 


PROOF 


In  particular,  ^ (jc,  jy)  = 0 gives  r — l/r  = 0 or  sin  6 = 0.  Hence  this  streamline  consists  of  the  unit  circle 
(r  = l/r  gives  r = 1)  and  the  x-axis  (6  = 0 and  6 = 77).  For  large  \z\  the  term  1 /z  in  F(z ) is  small  in  absolute 
value,  so  that  for  these  z the  flow  is  nearly  uniform  and  parallel  to  the  x-axis.  Hence  we  can  interpret  this  as  a 
flow  around  a long  circular  cylinder  of  unit  radius  that  is  perpendicular  to  the  z-plane  and  intersects  it  in  the 
unit  circle  \z\  = 1 and  whose  axis  corresponds  to  z — 0. 

The  flow  has  two  stagnation  points  (that  is,  points  at  which  the  velocity  V is  zero),  at  z — — 1 • This  follows 
from  (3)  and 


hence  z2  — 1 = 0. 


(See  Fig.  416.)  ■ 


Fig.  416.  Flow  around  a cylinder  (Example  2) 


Assumptions  and  Theory  Underlying  (2)— (5) 


Complex  Potential  of  a Flow 

If  the  domain  of  flow  is  simply  connected  and  the  flow  is  irrotational  and 
incompressible,  then  the  statements  involving  (2)-(5)  hold.  In  particular,  then  the 
flow  has  a complex  potential  F(z),  which  is  an  analytic  function.  (Explanation  of 
terms  below.) 


We  prove  this  theorem,  along  with  a discussion  of  basic  concepts  related  to  fluid  flow. 

(a)  First  Assumption:  Irrotational.  Let  C be  any  smooth  curve  in  the  z-plane  given 
by  z(s)  = x(s)  + iy(s),  where  s is  the  arc  length  of  C.  Let  the  real  variable  VJ  be  the 
component  of  the  velocity  V tangent  to  C (Lig.  417).  Then  the  value  of  the  real  line 
integral 


(6) 


Jc 


Vt  ds 


Fig.  417.  Tangential  component  of  the 
velocity  with  respect  to  a curve  C 
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taken  along  C in  the  sense  of  increasing  s is  called  the  circulation  of  the  fluid  along  C, 
a name  that  will  be  motivated  as  we  proceed  in  this  proof.  Dividing  the  circulation  by  the 
length  of  C,  we  obtain  the  mean  velocity 1 of  the  flow  along  the  curve  C.  Now 

Vt  = |v|  cos  a (Fig.  417). 

Hence  Vt  is  the  dot  product  (Sec.  9.2)  of  V and  the  tangent  vector  dz/ds  of  C (Sec.  17.1); 
thus  in  (6), 


Vtds 


Vi£  + 

ds 


ds  = Vi  dx  + V2  dy. 


The  circulation  (6)  along  C now  becomes 


(7) 


Vt  ds  = ( Vi  dx  + \ 2 dy). 

c c 


As  the  next  idea,  let  Cbe  a closed  curve  satisfying  the  assumption  as  in  Green’s  theorem 
(Sec.  10.4),  and  let  C be  the  boundary  of  a simply  connected  domain  D.  Suppose  further 
that  V has  continuous  partial  derivatives  in  a domain  containing  D and  C.  Then  we  can 
use  Green’s  theorem  to  represent  the  circulation  around  C by  a double  integral, 


(8) 


(>  (Vi  dx  + V2  dy) 
c 


D 


dx  dy. 


The  integrand  of  this  double  integral  is  called  the  vorticity  of  the  flow.  The  vorticity 
divided  by  2 is  called  the  rotation 


(9) 


m(x,  y) 


1 _ 3VA 

2 \ dx  dy  J 


We  assume  the  flow  to  be  irrotational,  that  is,  w(x,  y)  = 0 throughout  the  flow;  thus, 


(10) 


dVz  5Vi 

dx  dy 


To  understand  the  physical  meaning  of  vorticity  and  rotation,  take  for  C in  (8)  a circle. 
Let  r be  the  radius  of  C.  Then  the  circulation  divided  by  the  length  27 tr  of  C is  the  mean 


1 Definitions : 


b — a 


f(x)  dx  = mean  value  of  / on  the  interval  a S x S b. 


f(s)  ds  = mean  value  of/ on  C (L  = length  of  C), 

'c 


1 

A 


D 


fix,  y)  dx  dy  = mean  value  of / on  D (A  = area  of  D). 
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velocity  of  the  fluid  along  C.  Hence  by  dividing  this  by  r we  obtain  the  mean  angular 
velocity  co0  of  the  fluid  about  the  center  of  the  circle: 


co0  — 


1 

277"r2 


J 

D 


co(x,  y)  dx  dy. 
J . 

D 


If  we  now  let  r — » 0,  the  limit  of  a>o  is  the  value  of  to  at  the  center  of  C.  Hence  <w(x,  y)  is 
the  limiting  angular  velocity  of  a circular  element  of  the  fluid  as  the  circle  shrinks  to  the 
point  (x,  y).  Roughly  speaking,  if  a spherical  element  of  the  fluid  were  suddenly  solidified 
and  the  surrounding  fluid  simultaneously  annihilated,  the  element  would  rotate  with  the 
angular  velocity  to. 

(b)  Second  Assumption:  Incompressible.  Our  second  assumption  is  that  the  fluid  is 
incompressible.  (Fluids  include  liquids,  which  are  incompressible,  and  gases,  such  as  air, 
which  are  compressible.)  Then 


(ID 


dVj  dVz 

1 = 0 

dx  dy 


in  every  region  that  is  free  of  sources  or  sinks,  that  is,  points  at  which  fluid  is  produced 
or  disappears,  respectively.  The  expression  in  (11)  is  called  the  divergence  of  V and  is 
denoted  by  div  V.  (See  also  (7)  in  Sec.  9.8.) 

(c)  Complex  Velocity  Potential.  If  the  domain  I)  of  the  flow  is  simply  connected 
(Sec.  14.2)  and  the  flow  is  irrotational,  then  (10)  implies  that  the  line  integral  (7)  is 
independent  of  path  in  D (by  Theorem  3 in  Sec.  10.2,  where  F\  = V),  F2  = V2,  F3  = 0, 
and  z is  the  third  coordinate  in  space  and  has  nothing  to  do  with  our  present  z).  Hence  if 
we  integrate  from  a fixed  point  (a,  b)  in  D to  a variable  point  (x,  y)  in  D,  the  integral 
becomes  a function  of  the  point  (x,  y),  say,  <J>(x,  y): 


(12) 


Hx,  y) 


r (x,y) 

(V)  dx  + Vz  dy). 

( a,b ) 


We  claim  that  the  flow  has  a velocity  potential  <1>,  which  is  given  by  (12).  To  prove 
this,  all  we  have  to  do  is  to  show  that  (4)  holds.  Now  since  the  integral  (7)  is 
independent  of  path,  Vi  dx  + Vzdy  is  exact  (Sec.  10.2),  namely,  the  differential  of  <J>, 
that  is. 


V\  dx  + Vzdy  = dx  H dy. 

dx  dy 

From  this  we  see  that  V±  = d<l>/dx  and  V2  = d<\>/dy,  which  gives  (4). 

That  <J>  is  harmonic  follows  at  once  by  substituting  (4)  into  (11),  which  gives  the  first 
Laplace  equation  in  (5). 

We  finally  take  a harmonic  conjugate  'F  of  <1>.  Then  the  other  equation  in  (5)  holds. 
Also,  since  the  second  partial  derivatives  of  (f>  and  "'I'  are  continuous,  we  see  that  the 
complex  function 


F(z)  = cF(x,  y)  + PV  (x,  y) 
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is  analytic  in  D.  Since  the  curves  ''I,(x,  y)  = const  are  perpendicular  to  the  equipotential 
curves  <J>(.r,  y)  = const  (except  where  F (z)  = 0),  we  conclude  that  T'  (x,  y)  = const  are 
the  streamlines.  Hence  "'I'  is  the  stream  function  and  F(z)  is  the  complex  potential  of  the 
flow.  This  completes  the  proof  of  Theorem  1 as  well  as  our  discussion  of  the  important 
role  of  complex  analysis  in  compressible  fluid  flow.  ■ 


FRQB  ^ENFSETl:Sv4 


1.  Differentiability.  Under  what  condition  on  the  velocity 
vector  V in  (1)  will  F(z)  in  (2)  be  analytic? 

2.  Corner  flow.  Along  what  curves  will  the  speed  in 
Example  1 be  constant?  Is  this  obvious  from  Fig.  415? 

3.  Cylinder.  Guess  from  physics  and  from  Fig.  4 1 6 where 
on  the  y-axis  the  speed  is  maximum.  Then  calculate. 

4.  Cylinder.  Calculate  the  speed  along  the  cylinder  wall 
in  Fig.  416,  also  confirming  the  answer  to  Prob.  3. 

5.  Irrotational  flow.  Show  that  the  flow  in  Example  2 is 
irrotational. 

6.  Extension  of  Example  1.  Sketch  or  graph  and 
interpret  the  flow  in  Example  1 on  the  whole  upper 
half-plane. 

7.  Parallel  flow.  Sketch  and  interpret  the  flow  with 
complex  potential  F(z)  = z. 

8.  Parallel  flow.  What  is  the  complex  potential  of  an 
upward  parallel  flow  of  speed  K > 0 in  the  direction 
of  y = x?  Sketch  the  flow. 

9.  Corner.  What  F(z)  would  be  suitable  in  Example  1 if 
the  angle  of  the  comer  were  7r/4  instead  of  7t/2? 

10.  Corner.  Show  that  F(z)  = iz2  also  models  a flow 
around  a corner.  Sketch  streamlines  and  equipotential 
lines.  Find  V. 

11.  What  flow  do  you  obtain  from  F(z)  = — iKz,  K positive 
real? 

12.  Conformal  mapping.  Obtain  the  flow  in  Example  1 
from  that  in  Prob.  1 1 by  a suitable  conformal  mapping. 

13.  60°-  Sector.  What  F(z)  would  be  suitable  in  Example  1 
if  the  angle  at  the  comer  were  7t/3? 

14.  Sketch  or  graph  streamlines  and  equipotential  lines 
of  F(z ) = iz3-  Find  V.  Find  all  points  at  which  V is 
horizontal. 

15.  Change  F(z)  in  Example  2 slightly  to  obtain  a flow 
around  a cylinder  of  radius  r0  that  gives  the  flow  in 
Example  2 if  r0  —*  1 . 

16.  Cylinder.  What  happens  in  Example  2 if  you  replace 
Z by  z2?  Sketch  and  interpret  the  resulting  flow  in  the 
first  quadrant. 

17.  Elliptic  cylinder.  Show  that  F(z)  = arccos  z gives 
confocal  ellipses  as  streamlines,  with  foci  at  z — ±1, 


and  that  the  flow  circulates  around  an  elliptic  cylinder 
or  a plate  (the  segment  from  —1  to  1 in  Fig.  418). 


Fig.  418.  Flow  around  a plate  in  Prob.  17. 

18.  Aperture.  Show  that  F(z)  = arccosh  z gives  confocal 
hyperbolas  as  streamlines,  with  foci  at  z — ± 1,  and  the 
flow  may  be  interpreted  as  a flow  through  an  aperture 
(Fig.  419). 


Fig.  419.  Flow  through  an  aperture  in  Prob.  18. 

19.  Potential  F(z ) = 1/z.  Show  that  the  streamlines  of 
F(z)  — 1/z  and  circles  through  the  origin  with  centers 
on  the  y-axis. 

20.  TEAM  PROJECT.  Role  of  the  Natural  Logarithm 
in  Modeling  Flows,  (a)  Basic  flows:  Source  and  sink. 

Show  that  F(z)  = {c/2tt)  In  z with  constant  positive 
real  c gives  a flow  directed  radially  outward  (Fig.  420), 
so  that  F models  a point  source  at  z — 0 (that  is,  a 
source  line  x = 0,  y = 0 in  space)  at  which  fluid  is 
produced,  c is  called  the  strength  or  discharge  of  the 
source.  If  c is  negative  real,  show  that  the  flow  is 
directed  radially  inward,  so  that  F models  a sink  at 
z = 0,  a point  at  which  fluid  disappears.  Note  that 
z = 0 is  the  singular  point  of  F(z). 
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(b)  Basic  flows:  Vortex.  Show  that  F(z)  = -(Ki/2tt) 
In  z with  positive  real  K gives  a flow  circulating  coun- 
terclockwise around  z — 0 (Fig.  421).  z = 0 is  called  a 
vortex.  Note  that  each  time  we  travel  around  the  vortex, 
the  potential  increases  by  K. 

(c)  Addition  of  flows.  Show  that  addition  of  the 
velocity  vectors  of  two  flows  gives  a flow  whose 
complex  potential  is  obtained  by  adding  the  complex 
potentials  of  those  flows. 

y | 


i 


Fig.  421.  Vortex  flow 

(d)  Source  and  sink  combined.  Find  the  complex 
potentials  of  a flow  with  a source  of  strength  1 at  z — ~a 
and  of  a flow  with  a sink  of  strength  1 at  z — a.  Add 
both  and  sketch  or  graph  the  streamlines.  Show  that  for 
small  |fl|  these  lines  look  similar  to  those  in  Prob.  19. 

(e)  Flow  with  circulation  around  a cylinder.  Add 
the  potential  in  (b)  to  that  in  Example  2.  Show  that  this 
gives  a flow  for  which  the  cylinder  wall  |z|  = 1 is  a 
streamline.  Find  the  speed  and  show  that  the  stagnation 
points  are 


iK  I -K2 
z = — ± A ? + 1; 

477  V 16772 

if  K = 0 they  are  at  ± 1 ; as  K increases  they  move  up 
on  the  unit  circle  until  they  unite  at  z = i (K  = 477,  see 
Fig.  422),  and  if  K > 477  they  lie  on  the  imaginary  axis 
(one  lies  in  the  field  of  flow  and  the  other  one  lies  inside 
the  cylinder  and  has  no  physical  meaning). 


K=  0 


Fig.  422.  Flow  around  a cylinder  without  circulation 
(K  = 0)  and  with  circulation 


18.5  Poissons  Integral  Formula  for  Potentials 

So  far  in  this  chapter  we  have  seen  powerful  methods  based  on  conformal  mappings  and 
complex  potentials.  They  were  used  for  modeling  and  solving  two-dimensional  potential 
problems  and  demonstrated  the  importance  of  complex  analysis. 

Now  we  introduce  a further  method  that  results  from  complex  integration.  It  will  yield 
the  very  important  Poisson  integral  formula  (5)  for  potentials  in  a standard  domain 
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(a  circular  disk).  In  addition,  from  (5),  we  will  derive  a useful  series  (7)  for  these  potentials. 
This  allows  us  to  solve  problems  for  disks  and  then  map  solutions  conformally  onto  other 
domains. 


Derivation  of  Poisson’s  Integral  Formula 

Poisson’s  formula  will  follow  from  Cauchy’s  integral  formula  (Sec.  14.3) 


(1) 


F(z) 


1 

2777 


C) 


F(z*) 

z*  - z 


dz*. 


Here  C is  the  circle  z*  = Rela  (counterclockwise,  0 g a § 277),  and  we  assume  that  F(z*) 
is  analytic  in  a domain  containing  C and  its  full  interior.  Since  dz * = iRela  da  = iz*  da, 
we  obtain  from  (1) 


(2) 


F(z) 


1 

277 


r 2tt 


F(z*) 

^0 


(. 7 * = Reia,  Z = reie). 


Now  comes  a little  trick.  If  instead  of  z inside  C we  take  a Z outside  C,  the  integrals  (1) 
and  (2)  are  zero  by  Cauchy’s  integral  theorem  (Sec.  14.2).  We  choose  Z = z*z*/z  = R2/z, 
which  is  outside  C because  |z|  = R2/ \ z = R2/  r > R.  From  (2)  we  thus  have 


- 277 


0 277 


F(z *)  ■ 


da  = 

:*  - Z 277  J 


2tt 


F(Z*)  ' 


da 


and  by  straightforward  simplification  of  the  last  expression  on  the  right, 

,217 


°=^j 


F(z*)  — 


da. 


o 


We  subtract  this  from  (2)  and  use  the  following  formula  that  you  can  verify  by  direct 
calculation  {z.z*  cancels): 


(3) 


Z* Z_^  _ 7*7*  ~ ZZ 

Z*  - Z Z - Z*  _ (z*  - z)(z*  - Z)  ' 


We  then  have 

(4) 


r2n 


F(Z)  = ^ 


F(z *) 


z^z^ 


ZZ 


(Z*  - z)(z*  - z) 


- da. 


From  the  polar  representations  of  z and  z*  we  see  that  the  quotient  in  the  integrand  is  real 
and  equal  to 
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We  now  write  F(z)  = cE*(r,  6)  + 6)  and  take  the  real  part  on  both  sides  of  (4).  Then 

we  obtain  Poisson’s  integral  formula2 


(5) 


r 27 r 


6)  = — 
ITT 


$(R,  a) 


Rz  — r2 


Rz  — 2Rrcos  (6  — a)  + rz 


da. 


This  formula  represents  the  harmonic  function  (T>  in  the  disk  |z|  Si  R in  terms  of  its  values 
<S>(R,  a)  on  the  boundary  (the  circle)  z = R. 

Formula  (5)  is  still  valid  if  the  boundary  function  <t>(R,  a)  is  merely  piecewise 
continuous  (as  is  practically  often  the  case;  see  Figs.  405  and  406  in  Sec.  18.2  for  an 
example).  Then  (5)  gives  a function  harmonic  in  the  open  disk,  and  on  the  circle  |z|  = R 
equal  to  the  given  boundary  function,  except  at  points  where  the  latter  is  discontinuous. 
A proof  can  be  found  in  Ref.  [Dl]  in  App.  1. 


Series  for  Potentials  in  Disks 

From  (5)  we  may  obtain  an  important  series  development  of  (T>  in  terms  of  simple  harmonic 
functions.  We  remember  that  the  quotient  in  the  integrand  of  (5)  was  derived  from  (3). 
We  claim  that  the  right  side  of  (3)  is  the  real  part  of 

z*  + z _ (z*  + z)(z*  - z)  _ Z*Z*  — ZZ  — z*z  + zz* 

Z*  - Z (z*  - z)(z*  - z)  |z*  - z|2 


Indeed,  the  last  denominator  is  real  and  so  is  z*z*  — zz  in  the  numerator,  whereas 
— z*z  + zz*  = 2 i Im  (zz*)  in  the  numerator  is  pure  imaginary.  This  verifies  our  claim. 
Now  by  the  use  of  the  geometric  series  we  obtain  (develop  the  denominator) 


(6) 


z*  + z _ 1 + (z/z*) 
Z*  - z “ 1 - (z/z*) 


n= 0 


71=1 


Since  z = re1  and  z*  = Relcl,  we  have 


\(  z Yl 

Re 

(z*J  . 

= Re 

' inf)  —ina 

— z.  e e 

Rn 


cos  ( nO  — not). 


On  the  right,  cos  ( n6  — na)  = cos  nO  cos  not  + sin  nO  sin  na.  Hence  from  (6)  we  obtain 


(6*) 


Re 


z*  + z 
Z*  - z 


1 + 2 2 Re 

n=  1 


1 + 2^ 

n=  1 


(cos  nO  cos  na  + sin  nO  sin  na). 


2SIMEON  DENIS  POISSON  (1781-1840),  French  mathematician  and  physicist,  professor  in  Paris  from  1809. 
His  work  includes  potential  theory,  partial  differential  equations  (Poisson  equation.  Sec.  12.1),  and  probability 
(Sec.  24.7). 
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EXAMPLE  1 


This  expression  is  equal  to  the  quotient  in  (5),  as  we  have  mentioned  before,  and 
by  inserting  it  into  (5)  and  integrating  term  by  term  with  respect  to  a from  0 to  277 
we  obtain 


(7) 


$(r,  6)  = a0  + 2 ( ^ 

n= 1 


ian  cos  nO  + bn  sin  nO) 


where  the  coefficients  are  [the  2 in  (6*)  cancels  the  2 in  1/(277)  in  (5)] 


(8) 


277" 

T>(R,  a ) da. 


® n 


J_ 

77 


277" 

5>( R , a)  cos  na  da, 
■'o 


J_ 

77 


2"7T 

(b(A\  a)  sin  na  da, 
■’o 


n = 1,  2,  • • • , 


the  Fourier  coefficients  of  <!>(/?,  a);  see  Sec.  11.1.  Now,  for  r = R,  the  series  (7)  becomes 
the  Fourier  series  of  <!>(/?,  a).  Flence  the  representation  (7)  will  be  valid  whenever  the 
given  a)  on  the  boundary  can  be  represented  by  a Fourier  series. 


Dirichlet  Problem  for  the  Unit  Disk 

Find  the  electrostatic  potential  6)  in  the  unit  disk  r < 1 having  the  boundary  values 

f— <2/7 T if  —7 T < a < 0 

0>(l,a)  = < 

l ol/tt  if  0 < a < 7T 

Solution.  Since  <F(1,  a)  is  even,  bn  = 0,  and  from  (8)  we  obtain  aQ  = \ and 


1 


cos  na  da 


cos  na  da 


= —p — p (cos  nrr  — 1). 
n 7T 


Hence,  an  = — 4/(n2772)  if  n is  odd,  = 0 if  n = 2,  4,  • • • , and  the  potential  is 


^ 1 4 

<F(r,  6)  = - - 


r cos  6 H — ir  cos  36  H — ~ cos  56 
32  52 


2 7 T~ 

Figure  424  shows  the  unit  disk  and  some  of  the  equipotential  lines  (curves  = const). 


(Fig.  423). 
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1.  Give  the  details  of  the  derivation  of  the  series  (7)  from 
the  Poisson  formula  (5). 

2.  Verify  (3). 

3.  Show  that  each  term  of  (7)  is  a harmonic  function  in 
the  disk  r < R. 

4.  Why  does  the  series  in  Example  1 reduce  to  a cosine 
series? 


HARMONIC  FUNCTIONS  IN  A DISK 

Using  (7),  find  the  potential  4>(r,  0 ) in  the  unit  disk  r < 1 
having  the  given  boundary  values  d>(l,  0).  Using  the  sum 
of  the  first  few  terms  of  the  series,  compute  some  values 
of  d>  and  sketch  a figure  of  the  equipotential  lines. 

5.  <I>(1,  0)  = | sin  30 

6.  d>(l,0)  = 5 - cos  20 

7.  0(1,  0)  = a cos2  4(9 

8.  0(1,  6>)  = 4sin30 

9.  0(1,0)  = 8 sin4  8 

10.  0(1,0)  = 16  cos3  20 

11.  0(1,  0)  = 0/ 77  if  —7 T < 0 < 7T 

12.  0(1,  0)  = k if  0 < 0 < 77  and  0 otherwise 

13.  0(1,  0)  = 0 if  — §77  < 0 < §77  and  0 otherwise 

14.  0(1,  0)  = \0\/ir  if  —77  < 0 < 77 

15.  0(1,  0)  = 1 if  — §77  < 0 < 2 77  and  0 otherwise 


(0  + 77  if  —77  < 0 < 0 

16.  0(1,  0)  = < 

l 0 — 77  if  0 < 0 < 77 

17.  0(1,  0)  = 02/772  if  -77  < 0 < 77 

("0  if  —77  < 0 < 0 

18.  0(1,  0)  = 

l 0 if  0 < 0 < 77 

19.  CAS  EXPERIMENT.  Series  (7).  Write  a program  for 
series  developments  (7).  Experiment  on  accuracy  by 
computing  values  from  partial  sums  and  comparing  them 
with  values  that  you  obtain  from  your  CAS  graph.  Do 
this  (a)  for  Example  1 and  Fig.  424,  (b)  for  O in  Prob.  1 1 
(which  is  discontinuous  on  the  boundary!),  (c)  for  a <!> 
of  your  choice  with  continuous  boundary  values,  and 
(d)  for  <I>  with  discontinuous  boundary  values. 

20.  TEAM  PROJECT.  Potential  in  a Disk,  (a)  Mean 
value  property.  Show  that  the  value  of  a harmonic 
function  at  the  center  of  a circle  C equals  the  mean 
of  the  value  of  d>  on  C (see  Sec.  18.4,  footnote  1,  for 
definitions  of  mean  values). 

(b)  Separation  of  variables.  Show  that  the  terms  of 
(7)  appear  as  solutions  in  separating  the  Laplace 
equation  in  polar  coordinates. 

(c)  Harmonic  conjugate.  Find  a series  for  a harmonic 
conjugate  'P  of  d>  from  (7).  Hint.  Use  the  Cauchy- 
Riemann  equations. 

(d)  Power  series.  Find  a series  for  F(z)  = '!>  + LI'. 


8.6  General  Properties  of  Harmonic  Functions. 
Uniqueness  Theorem  for  the  Dirichlet  Problem 


Recall  from  Sec.  10.8  that  harmonic  functions  are  solutions  to  Laplace’s  equation  and 
their  second-order  partial  derivatives  are  continuous.  In  this  section  we  explore  how 
general  properties  of  harmonic  functions  often  can  be  obtained  from  properties  of  analytic 
functions.  This  can  frequently  be  done  in  a simple  fashion.  Specifically,  important  mean 
value  properties  of  harmonic  functions  follow  readily  from  those  of  analytic  functions. 
The  details  are  as  follows. 


THEOREM  1 


Mean  Value  Property  of  Analytic  Functions 

Let  F(z)  be  analytic  in  a simply  connected  domain  D.  Then  the  value  ofF(z)  at  a point 
z 0 in  D is  equal  to  the  mean  value  of  F{z)  on  any  circle  in  D with  center  at  z0- 
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PROOF 


THEOREM  2 


PROOF 


THEOREM  3 


In  Cauchy’s  integral  formula  (Sec.  14.3) 


(1) 


F(zo)  = 4ri  t 


Jc 


F(z) 

: - Jo 


dz 


we  choose  for  C the  circle  z = zo  + re‘a  in  D.  Then  z — Zo  = re"\  dz  = irela  da,  and 
(1)  becomes 


(2) 


F(z  0)  = 


1 

277 


2tt 


F(z0  + rela)  da. 


The  right  side  is  the  mean  value  of  F on  the  circle  (=  value  of  the  integral  divided  by  the 
length  277  of  the  interval  of  integration).  This  proves  the  theorem. 


For  harmonic  functions.  Theorem  1 implies 


Two  Mean  Value  Properties  of  Harmonic  Functions 

Let  $(jc,  y)  be  harmonic  in  a simply  connected  domain  D.  Then  the  value  of  y) 
at  a point  ( x0 , y0)  in  D is  equal  to  the  mean  value  of  <l>(x,  y)  on  any  circle  in  D with 
center  at  (x0,  y0).  This  value  is  also  equal  to  the  mean  value  of  d>(x,  y)  on  any 
circular  disk  in  D with  center  (x0,  yo)-  [See  footnote  1 in  Sec.  18.4.] 


The  first  part  of  the  theorem  follows  from  (2)  by  taking  the  real  parts  on  both  sides, 

r 277" 


$(*o,  To)  = Re  F(x o + iy0)  = 


1 

277  . 


4>(xo  + r cos  a,  yo  + r sin  a)  da. 


o 


The  second  part  of  the  theorem  follows  by  integrating  this  formula  over  r from  0 to  r0  (the 
radius  of  the  disk)  and  dividing  by  7q/2, 


(3) 


<Hx0,  yo)  = 


l 

„ 2 
777  o 


,2tt 


d>(xo  + 7 cos  a,  yo  + r sin  a)r  da  dr. 


o 


The  right  side  is  the  indicated  mean  value  (integral  divided  by  the  area  of  the  region  of 
integration). 


Returning  to  analytic  functions,  we  state  and  prove  another  famous  consequence  of  Cauchy’s 
integral  formula.  The  proof  is  indirect  and  shows  quite  a nice  idea  of  applying  the  ML- 
inequality.  (A  bounded  region  is  a region  that  lies  entirely  in  some  circle  about  the  origin.) 


Maximum  Modulus  Theorem  for  Analytic  Functions 

Let  F(z)  be  analytic  and  nonconstant  in  a domain  containing  a bounded  region  R 
and  its  boundary.  Then  the  absolute  value  |F(z)|  cannot  have  a maximum  at  an 
interior  point  of  R.  Consequently,  the  maximum  of  |F(z)|  is  taken  on  the  boundary 
of  R.  If  F(z)  A 0 in  R,  the  same  is  true  with  respect  to  the  minimum  of  |F(z)|. 
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PROOF 


THEOREM  4 


We  assume  that  |F(z)|  has  a maximum  at  an  interior  point  zo  of  R and  show  that  this 
leads  to  a contradiction.  Let  \F(zq)\  = M be  this  maximum.  Since  F(z)  is  not  constant, 
|F(z)|  is  not  constant,  as  follows  from  Example  3 in  Sec.  13.4.  Consequently,  we  can  find 
a circle  C of  radius  r with  center  at  zo  such  that  the  interior  of  C is  in  R and  \F(z)\  is 
smaller  than  M at  some  point  P of  C.  Since  F(z)  is  continuous,  it  will  be  smaller  than 
M on  an  arc  C\  of  C that  contains  P (see  Fig.  425),  say, 

|F(z)|  g M — k (k  > 0)  for  all  z on  Ci. 


Let  C i have  the  length  L\.  Then  the  complementary  arc  C2  of  C has  the  length  2irr  — L\. 
We  now  apply  the  ML-inequality  (Sec.  14.1)  to  (1)  and  note  that  z — zo  = r.  We  then 
obtain  (using  straightforward  calculation  in  the  second  line  of  the  formula) 


M = |F(z0)| 


F(z) 

Z - Zo 


dz 


L 1 


dz 


(277 -r  — Lf)  = M — 


kL^ 
27 Tr 


< M 


that  is,  M < M,  which  is  impossible.  Hence  our  assumption  is  false  and  the  first  statement 
is  proved. 

Next  we  prove  the  second  statement.  If  F(z)  ¥=  0 in  R,  then  I / F(z.)  is  analytic  in  R. 
From  the  statement  already  proved  it  follows  that  the  maximum  of  1 / 1 F(z)  lies  on  the 
boundary  of  R.  But  this  maximum  corresponds  to  the  minimum  of  F(z)  \ ■ This  completes 
the  proof. 


Fig.  425.  Proof  of  Theorem  3 


This  theorem  has  several  fundamental  consequences  for  harmonic  functions,  as  follows. 


Harmonic  Functions 

Let  <T»(jc,  y)  be  harmonic  in  a domain  containing  a simply  connected  bounded  region 
R and  its  boundary  curve  C.  Then: 

(I)  (Maximum  principle)  If  'Fix,  y)  is  not  constant,  it  has  neither  a maximum 
nor  a minimum  in  R.  Consequently,  the  maximum  and  the  minimum  are  taken  on 
the  boundary  of  R. 

(II)  If  'Fix,  y)  is  constant  on  C,  then  4>(x,  y)  is  a constant. 

(III)  If  h(x,  y)  is  harmonic  in  R and  on  C and  if  h(x,  y)  = 'iHx,  y)  on  C,  then 
h{x,  y)  = <F(x,  y)  everywhere  in  R. 
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PROOF  (I)  Let  "T  (x,  y ) be  a conjugate  harmonic  function  of  <F(x,  y)  in  R.  Then  the  complex 
function  F(z)  = ‘FU,  y)  + RV(x,  yj  is  analytic  in  R,  and  so  is  G (z)  = eFU).  Its  absolute 
value  is 


|G(z)|  = eReF(z)  = e<Ux’  y\ 

From  Theorem  3 it  follows  that  | G (z)  I cannot  have  a maximum  at  an  interior  point  of  R. 
Since  e'1’  is  a monotone  increasing  function  of  the  real  variable  (I> , the  statement  about 
the  maximum  of  $ follows.  From  this,  the  statement  about  the  minimum  follows  by 
replacing  $ by  — <F. 

(II)  By  (I)  the  function  <F(x,  y)  takes  its  maximum  and  its  minimum  on  C.  Thus,  if 
<T>(jc,  y)  is  constant  on  C,  its  minimum  must  equal  its  maximum,  so  that  cF(jt,  y)  must  be 
a constant. 

(HI)  If  li  and  <F  are  harmonic  in  R and  on  C,  then  h — $ is  also  harmonic  in  R and 
on  C,  and  by  assumption,  h — <t>  = 0 everywhere  on  C.  By  (II)  we  thus  have  h — <F  = 0 
everywhere  in  R,  and  (III)  is  proved. 

The  last  statement  of  Theorem  4 is  very  important.  It  means  that  a harmonic  function  is 
uniquely  determined  in  R by  its  values  on  the  boundary  of  R.  Usually,  'Fix,  y)  is  required 
to  be  harmonic  in  R and  continuous  on  the  boundary  of  R,  that  is, 

lim  <\>(x,  y)  = <F(x0,  y0),  where  (x0,  y0)  is  on  the  boundary  and  (x,  y)  is  in  R. 

x— »Xo 
y^yo 

Under  these  assumptions  the  maximum  principle  (I)  is  still  applicable.  The  problem  of 
determining  <F(jc,  y)  when  the  boundary  values  are  given  is  called  the  Dirichlet  problem 
for  the  Laplace  equation  in  two  variables,  as  we  know.  From  (III)  we  thus  have,  as  a 
highlight  of  our  discussion. 


THEOREM  5 


Uniqueness  Theorem  for  the  Dirichlet  Problem 

If  for  a given  region  and  given  boundary  values  the  Dirichlet  problem  for  the  Laplace 
equation  in  two  variables  has  a solution,  the  solution  is  unique. 


P R O BLEM  S ET  TR6 


PROBLEMS  RELATED  TO  THEOREMS  1 AND  2 


1-4  Verify  Theorem  1 for  the  given  F(z),  z o,  and 
circle  of  radius  1 . 

3 


i.  (z  + ir 


_ 5 
ZO  ~ 2 


2.  2 z 


ZO  = -2 


3.  (3z  - 2)  , z o = 4 

4.  (z  ~ l)-2,  Zo  — 1 

5.  Integrate  |z|  around  the  unit  circle.  Does  the  result 
contradict  Theorem  1? 


6.  Derive  the  first  statement  in  Theorem  2 from  Poisson’s 
integral  formula. 
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Verify  (3)  in  Theorem  2 for  the  given  <I>(.r,  y), 


(x0,  yo),  and  circle  of  radius  1. 


7.  (x  - l)(y  - 1),  (2,  -2) 

8.  x2  - y2,  (3,  8) 

9.  x + y + xy,  (1,  1) 

10.  Verify  the  calculations  involving  the  inequalities  in  the 
proof  of  Theorem  3. 

11.  CAS  EXPERIMENT.  Graphing  Potentials.  Graph 
the  potentials  in  Probs.  7 and  9 and  for  two  other 
functions  of  your  choice  as  surfaces  over  a rectangle 
in  the  xy-plane.  Find  the  locations  of  the  maxima  and 
minima  by  inspecting  these  graphs. 
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12.  TEAM  PROJECT.  Maximum  Modulus  of  Analytic 
Functions,  (a)  Verify  Theorem  3 for  (i)  F(z)  = z2  and 
the  rectangle  1 SxS5,2S_vS4,  (ii)  F(z ) = sin  z 
and  the  unit  disk,  and  (iii)  F(z)  = ez  and  any  bounded 
domain. 

(b)  F(z)  = 1 + |z|  is  not  zero  in  the  disk  |z|  S 2 and 
has  a minimum  at  an  interior  point.  Does  this  contradict 
Theorem  3? 

(c)  F(x)  = sinx  ( x real)  has  a maximum  1 at  7t/2. 
Why  can  this  not  be  a maximum  of  |F(z)|  = | sin  z|  in 
a domain  containing  z = tt/21 

(d)  If  F(z)  is  analytic  and  not  constant  in  the  closed 
unit  disk/):  |z|  S 1 and  |F(z)|  = c = const  on  the  unit 
circle,  show  that  F(z)  must  have  a zero  in  D. 

MAXIMUM  MODULUS 

Find  the  location  and  size  of  the  maximum  of  | F(z)  I in  the 
unit  disk  |z|  S 1. 

13.  F(z)  = cos  z 


14.  F(z)  = expz2 

15.  F{z)  = sinh  2z 

16.  F(z)  — az  + b (a,  b complex,  a # 0) 

17.  F(z)  = 2z2  - 2 

18.  Verify  the  maximum  principle  for  <l>(x,  y)  = exsiny 
and  the  rectangle  a £ x £ b,  OSyS  27 t. 

19.  Harmonic  conjugate.  Do  (I>  and  a harmonic  conjugate 
■'T  in  a region  R have  their  maximum  at  the  same  point 
of  F? 

20.  Conformal  mapping.  Find  the  location  (u\,  v{)  of  the 
maximum  of  <F*  = eucosv  in  R*\  |w|  £ 1,  v § 0, 
where  w = u + iv.  Find  the  region  R that  is  mapped 
onto  R*  by  w — /(z)  = z2.  Find  the  potential  in  R 
resulting  from  €>*  and  the  location  (xi,y{)  of  the 
maximum.  Is  (nj,  iq)  the  image  of  (xl5  yi)?  If  so,  is 
this  just  by  chance? 


GH^PTERTS  R1V 1EW  QU  E S T I O N S AND  PROBLEMS 


1.  Why  can  potential  problems  be  modeled  and  solved  by 
methods  of  complex  analysis?  For  what  dimensions? 

2.  What  parts  of  complex  analysis  are  mainly  of  interest 
to  the  engineer  and  physicist? 

3.  What  is  a harmonic  function?  A harmonic  conjugate? 

4.  What  areas  of  physics  did  we  consider?  Could  you 
think  of  others? 

5.  Give  some  examples  of  potential  problems  considered 
in  this  chapter.  Make  a list  of  corresponding  functions. 

6.  What  does  the  complex  potential  give  physically? 

7.  Write  a short  essay  on  the  various  assumptions  made 
in  fluid  flow  in  this  chapter. 

8.  Explain  the  use  of  conformal  mapping  in  potential 
theory. 

9.  State  the  maximum  modulus  theorem  and  mean  value 
theorems  for  harmonic  functions. 

10.  State  Poisson’ s integral  formula.  Derive  it  from  Cauchy  ’ s 
formula. 

11.  Find  the  potential  and  the  complex  potential  between 
the  plates  v = x and  y = x + 10  kept  at  10  V and  1 10  V, 
respectively. 

12.  Find  the  potential  and  complex  potential  between  the 
coaxial  cylinders  of  axis  0 (hence  the  vertical  axis 
in  space)  and  radii  = 1 cm,  r2  = 10  cm,  kept  at 
potential  I/i  = 200  V and  t/2  = 2 kV,  respectively. 

13.  Do  the  task  in  Prob.  12  if  U\  = 220  V and  the  outer 
cylinder  is  grounded,  t/2  = 0. 


14.  If  plates  at  xi  = 1 and  x2  = 10  are  kept  at  potentials 
(?i  = 200  V,  I/2  = 2 kV,  is  the  potential  at  x = 5 
larger  or  smaller  than  the  potential  at  r = 5 in  Prob.  12? 
No  calculation.  Give  reason. 

15.  Make  a list  of  important  potential  functions,  with 
applications,  from  memory. 

16.  Find  the  equipotential  lines  of  F(z)  = i Ln  z. 

17.  Find  the  potential  in  the  first  quadrant  of  the  xy-plane  if 
the  x-axis  has  potential  2 kV  and  the  y-axis  is  grounded. 

18.  Find  the  potential  in  the  angular  region  between  the 
plates  Arg  z = tt/6  kept  at  800  V and  Arg  z = 7r/3 
kept  at  600  V. 

19.  Find  the  temperature  T in  the  upper  half-plane  if,  on 
the  x-axis,  T = 30°C  for  x > 1 and  — 30°C  for  x < 1. 

20.  Interpret  Prob.  18  as  an  electrostatic  problem.  What  are 
the  lines  of  electric  force? 

21.  Find  the  streamlines  and  the  velocity  for  the  complex 
potential  F(z)  = (1  + i)z . Describe  the  flow. 

22.  Describe  the  streamlines  for  F(z)  = gz2  + z. 

23.  Show  that  the  isotherms  of  F(z)  = — iz2  + z are 
hyperbolas. 

24.  State  the  theorem  on  the  behavior  of  harmonic 
functions  under  conformal  mapping.  Verify  it  for 
<L>  * = eu  sin  v and  w = u + iv  = z2. 

25.  Find  V in  Prob.  22  and  verify  that  it  gives  vectors 
tangent  to  the  streamlines. 
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Potential  theory  is  the  theory  of  solutions  of  Laplace’s  equation 


(1) 


V2<f>  = 0. 


Solutions  whose  second  partial  derivatives  are  continuous  are  called  harmonic 
functions.  Equation  (1)  is  the  most  important  PDE  in  physics,  where  it  is  of  interest 
in  two  and  three  dimensions.  It  appears  in  electrostatics  (Sec.  18.1),  steady-state 
heat  problems  (Sec.  18.3),  fluid  flow  (Sec.  18.4),  gravity,  etc.  Whereas  the  three- 
dimensional  case  requires  other  methods  (see  Chap.  12),  two-dimensional  potential 
theory  can  be  handled  by  complex  analysis,  since  the  real  and  imaginary  parts  of 
an  analytic  function  are  harmonic  (Sec.  13.4).  They  remain  harmonic  under 
conformal  mapping  (Sec.  18.2),  so  that  conformal  mapping  becomes  a powerful 
tool  in  solving  boundary  value  problems  for  (1),  as  is  illustrated  in  this  chapter. 
With  a real  potential  <f>  in  (1)  we  can  associate  a complex  potential 


Then  both  families  of  curves  <J>  = const  and  d'  = const  have  a physical  meaning. 
In  electrostatics,  they  are  equipotential  lines  and  lines  of  electrical  force  (Sec.  18.1). 
In  heat  problems,  they  are  isotherms  (curves  of  constant  temperature)  and  lines  of 
heat  flow  (Sec.  18.3).  In  fluid  flow,  they  are  equipotential  lines  of  the  velocity 
potential  and  streamlines  (Sec.  18.4). 

For  the  disk,  the  solution  of  the  Dirichlet  problem  is  given  by  the  Poisson  formula 
(Sec.  18.5)  or  by  a series  that  on  the  boundary  circle  becomes  the  Fourier  series  of 
the  given  boundary  values  (Sec.  18.5). 

Harmonic  functions,  like  analytic  functions,  have  a number  of  general  properties; 
particularly  important  are  the  mean  value  property  and  the  maximum  modulus 
property  (Sec.  18.6),  which  implies  the  uniqueness  of  the  solution  of  the  Dirichlet 
problem  (Theorem  5 in  Sec.  18.6). 


(2) 


F{z)  = (f>  + /d' 


(Sec.  18.1). 


Software 

CHAPTER 

CHAPTER 

CHAPTER 


PART  E 


Numeric 

Analysis 


(p.  788-789) 

19  Numerics  in  General 

20  Numeric  Linear  Algebra 

21  Numerics  for  ODEs  and  PDEs 

Numeric  analysis  or  briefly  numerics  continues  to  be  one  of  the  fastest  growing  areas 
of  engineering  mathematics.  This  is  a natural  trend  with  the  ever  greater  availability  of 
computing  power  and  global  Internet  use.  Indeed,  good  software  implementation  of 
numerical  methods  are  readily  available.  Take  a look  at  the  updated  list  of  Software 
starting  on  p.  788.  It  contains  software  for  purchase  (commercial  software)  and  software 
for  free  download  (public-domain  software).  For  convenience,  we  provide  Internet 
addresses  and  phone  numbers.  The  software  list  includes  computer  algebra  systems 
(CASs),  such  as  Maple  and  Mathematica,  along  with  the  Maple  Computer  Guide,  10th 
ed.,  and  Mathematica  Computer  Guide,  10th  ed.,  by  E.  Kreyszig  and  E.  J.  Norminton 
related  to  this  text  that  teach  you  stepwise  how  to  use  these  computer  algebra  systems  and 
with  complete  engineering  examples  drawn  from  the  text.  Furthermore,  there  is  scientific 
software,  such  as  IMSL,  LAPACK  (free  download),  and  scientific  calculators  with  graphic 
capabilities  such  as  TI-Nspire.  Note  that,  although  we  have  listed  frequently  used  quality 
software,  this  list  is  by  no  means  complete. 

In  your  career  as  an  engineer,  appplied  mathematician,  or  scientist  you  are  likely  to  use 
commercially  available  software  or  proprietary  software,  owned  by  the  company  you  work 
for,  that  uses  numeric  methods  to  solve  engineering  problems,  such  as  modeling  chemical  or 
biological  processes,  planning  ecologically  sound  heating  systems,  or  computing  trajectories 
of  spacecraft  or  satellites.  For  example,  one  of  the  collaborators  of  this  book  (Herbert  Kreyszig) 
used  proprietary  software  to  determine  the  value  of  bonds,  which  amounted  to  solving  higher 
degree  polynomial  equations,  using  numeric  methods  discussed  in  Sec.  19.2. 
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However,  the  availability  of  quality  software  does  not  alleviate  your  effort  and 
responsibility  to  first  understand  these  numerical  methods.  Your  effort  will  pay  off 
because,  with  your  mathematical  expertise  in  numerics,  you  will  be  able  to  plan  your 
solution  approach,  judiciously  select  and  use  the  appropriate  software,  judge  the  quality 
of  software,  and,  perhaps,  even  write  your  own  numerics  software. 

Numerics  extends  your  ability  to  solve  problems  that  are  either  difficult  or  impossible 
to  solve  analytically.  For  example,  certain  integrals  such  as  error  function  [see  App.  3, 
formula  (35)1  or  large  eigenvalue  problems  that  generate  high-degree  characteristic 
polynomials  cannot  be  solved  analytically.  Numerics  is  also  used  to  construct  approximating 
polynomials  through  data  points  that  were  obtained  from  some  experiments. 

Part  E is  designed  to  give  you  a solid  background  in  numerics.  We  present  many  numeric 
methods  as  algorithms,  which  give  these  methods  in  detailed  steps  suitable  for  software 
implementation  on  your  computer,  CAS,  or  programmable  calculator.  The  first  chapter, 
Chap.  19,  covers  three  main  areas.  These  are  general  numerics  (floating  point,  rounding  errors, 
etc.),  solving  equations  of  the  form  fix)  = 0 (using  Newton’s  method  and  other  methods), 
interpolation  along  with  methods  of  numeric  integration  that  make  use  of  it,  and  differentiation. 

Chapter  20  covers  the  essentials  of  numeric  linear  algebra.  The  chapter  breaks  into  two 
parts:  solving  linear  systems  of  equations  by  methods  of  Gauss,  Doolittle,  Cholesky,  etc. 
and  solving  eigenvalue  problems  numerically.  Chapter  21  again  has  two  themes:  solving 
ordinary  differential  equations  and  systems  of  ordinary  differential  equations  as  well  as 
solving  partial  differential  equations. 

Numerics  is  a very  active  area  of  research  as  new  methods  are  invented,  existing  methods 
improved  and  adapted,  and  old  methods — impractical  in  precomputer  times — are 
rediscovered.  A main  goal  in  these  activities  is  the  development  of  well-structured 
software.  And  in  large-scale  work — millions  of  equations  or  steps  of  iterations — even 
small  algorithmic  improvements  may  have  a large  significant  effect  on  computing  time, 
storage  demand,  accuracy,  and  stability. 

Remark  on  Software  Use.  Part  E is  designed  in  such  a way  as  to  allow  compelete  flexibility 
on  the  use  of  CASs,  software,  or  graphing  calculators.  The  computational  requirements 
range  from  very  little  use  to  heavy  use.  The  choice  of  computer  use  is  at  the  discretion 
of  the  professor.  The  material  and  problem  sets  (except  where  clearly  indicated  such  as 
in  CAS  Projects,  CAS  Problems,  or  CAS  Experiments,  which  can  be  omitted  without  loss 
of  continuity)  do  not  require  the  use  of  a CAS  or  software.  A scientific  calculator  perhaps 
with  graphing  capabilities  is  all  that  is  required. 

Software 

See  also  http://www.wiley.com/college/kreyszig/ 

The  following  list  will  help  you  if  you  wish  to  find  software.  Y ou  may  also  obtain  information 
on  known  and  new  software  from  websites  such  as  Dr.  Dobb’s  Portal,  from  articles  published 
by  the  American  Mathematical  Society  (see  also  its  website  at  www.ams.org),  the  Society 
for  Industrial  and  Applied  Mathematics  (SIAM,  at  www.siam.org),  the  Association  for 
Computing  Machinery  (ACM,  at  www.acm.org),  or  the  Institute  of  Electrical  and  Electronics 
Engineers  (IEEE,  at  www.ieee.org).  Consult  also  your  library,  computer  science  department, 
or  mathematics  department. 
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TI-Nspire.  Includes  TI-Nspire  CAS  and  programmable  graphic  calculators.  Texas  Instru- 
ments, Inc.,  Dallas,  TX.  Telephone:  1-800-842-2737  or  (972)  917-8324;  website  at 
www. education. ti.com. 

EISPACK.  See  LAPACK. 

GAMS  (Guide  to  Available  Mathematical  Software).  Website  at  http://gams.nist.gov. 
Online  cross-index  of  software  development  by  NIST. 

IMSL  (International  Mathematical  and  Statistical  Library).  Visual  Numerics,  Inc., 
Houston,  TX.  Telephone:  1-800-222-4675  or  (713)  784-3131;  website  at  www.vni.com. 
Mathematical  and  statistical  FORTRAN  routines  with  graphics. 

LAPACK.  FORTRAN  77  routines  for  linear  algebra.  This  software  package  supersedes 
UNPACK  and  EISPACK.  You  can  download  the  routines  from  www.netlib.org/lapack. 
The  LAPACK  User’s  Guide  is  available  at  www.netlib.org. 

LINPACK  see  LAPACK 

Maple.  Waterloo  Maple,  Inc.,  Waterloo,  ON,  Canada.  Telephone:  1-800-267-6583  or 
(519)  747-2373;  website  at  www.maplesoft.com. 

Maple  Computer  Guide.  For  Advanced  Engineering  Mathematics,  10th  edition.  By 
E.  Kreyszig  and  E.  J.  Norminton.  John  Wiley  and  Sons,  Inc.,  Hoboken,  NJ.  Telephone: 
1-800-225-5945  or  (201)  748-6000. 

Mathcad.  Parametric  Technology  Corp.  (PTC),  Needham,  MA.  Website  at  www.ptc.com. 

Mathematica.  Wolfram  Research,  Inc.,  Champaign,  IL.  Telephone:  1-800-965-3726  or 
(217)  398-0700;  website  at  www.wolfram.com. 

Mathematica  Computer  Guide.  For  Advanced  Engineering  Mathematics,  10th  edition. 
By  E.  Kreyszig  and  E.  J.  Norminton.  John  Wiley  and  Sons,  Inc.,  Hoboken,  NJ.  Telephone: 
1-800-225-5945  or  (201)  748-6000. 

Matlab.  The  MathWorks,  Inc.,  Natick,  MA.  Telephone:  (508)  647-7000;  website  at 
www.mathworks.com. 

NAG.  Numerical  Algorithms  Group,  Inc.,  Lisle,  IL.  Telephone:  (630)  971-2337;  website 
at  www.nag.com.  Numeric  routines  in  FORTRAN  77,  FORTRAN  90,  and  C. 

NETLIB.  Extensive  library  of  public-domain  software.  See  at  www.netlib.org. 

NIST.  National  Institute  of  Standards  and  Technology,  Gaithersburg,  MD.  Telephone: 
(301)  975-6478;  website  at  www.nist.gov.  For  Mathematical  and  Computational  Science 
Division  telephone:  (301)  975-3800.  See  also  http://math.nist.gov. 

Numerical  Recipes.  Cambridge  University  Press,  New  York,  NY.  Telephone:  1-800-221- 
4512  or  (212)  924-3900;  website  at  www.cambridge.org/us.  Book,  3rd  ed.  (in  C++)  see 
App.  1,  Ref-  [E25];  source  code  on  CD  ROM  in  C + + , which  also  contains  old  source  code 
(but  not  text)  for  (out  of  print)  2nd  ed.  C,  FORTRAN  77,  FORTRAN  90  as  well  as  source 
code  for  (out  of  print)  1st  ed.  To  order,  call  office  at  West  Nyack,  NY,  at  1-800-872-7423 
or  (845)  353-7500  or  online  at  www.nr.com. 


FURTHER  SOFTWARE  IN  STATISTICS.  See  Part  G. 


CHAPTER 


1 9 


Numerics  in  General 


Numeric  analysis  or  briefly  numerics  has  a distinct  flavor  that  is  different  from  basic 
calculus,  from  solving  ODEs  algebraically,  or  from  other  (nonnumeric)  areas.  Whereas  in 
calculus  and  in  ODEs  there  were  very  few  choices  on  how  to  solve  the  problem  and  your 
answer  was  an  algebraic  answer,  in  numerics  you  have  many  more  choices  and  your 
answers  are  given  as  tables  of  values  (numbers)  or  graphs.  You  have  to  make  judicous 
choices  as  to  what  numeric  method  or  algorithm  you  want  to  use,  how  accurate  you  need 
your  result  to  be,  with  what  value  (starting  value)  do  you  want  to  begin  your  computation, 
and  others.  This  chapter  is  designed  to  provide  a good  transition  from  the  algebraic  type 
of  mathematics  to  the  numeric  type  of  mathematics. 

We  begin  with  the  general  concepts  such  as  floating  point,  roundoff  errors,  and  general 
numeric  errors  and  their  propagation.  This  is  followed  in  Sec.  19.2  by  the  important  topic 
of  solving  equations  of  the  type/(x)  = 0 by  various  numeric  methods,  including  the  famous 
Newton  method.  Section  19.3  introduces  interpolation  methods.  These  are  methods  that 
construct  new  (unknown)  function  values  from  known  function  values.  The  knowledge 
gained  in  Sec.  19.3  is  applied  to  spline  interpolation  (Sec.  19.4)  and  is  useful  for  under- 
standing numeric  integration  and  differentiation  covered  in  the  last  section. 

Numerics  provides  an  invaluable  extension  to  the  knowledge  base  of  the  problem- 
solving engineer.  Many  problems  have  no  solution  formula  (think  of  a complicated  integral 
or  a polynomial  of  high  degree  or  the  interpolation  of  values  obtained  by  measurements). 
In  other  cases  a complicated  solution  formula  may  exist  but  may  be  practically  useless. 
It  is  for  these  kinds  of  problems  that  a numerical  method  may  generate  a good  answer. 
Thus,  it  is  very  important  that  the  applied  mathematician,  engineer,  physicist,  or  scientist 
becomes  familiar  with  the  essentials  of  numerics  and  its  ideas,  such  as  estimation  of  errors, 
order  of  convergence,  numerical  methods  expressed  in  algorithms,  and  is  also  informed 
about  the  important  numeric  methods. 

Prerequisite:  Elementary  calculus. 

References  and  Answers  to  Problems:  App.  1 Part  E,  App.  2. 

19.1  Introduction 

As  an  engineer  or  physicist  you  may  deal  with  problems  in  elasticity  and  need  to  solve 
an  equation  such  as  x cosh  x = 1 or  a more  difficult  problem  of  finding  the  roots  of  a 
higher  order  polynomial.  Or  you  encounter  an  integral  such  as 

i 

exp  (—x2)  dx 
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[see  App.  3,  formula  (35)]  that  you  cannot  solve  by  elementary  calculus.  Such  problems, 
which  are  difficult  or  impossible  to  solve  algebraically,  arise  frequently  in  applications. 
They  call  for  numeric  methods,  that  is,  systematic  methods  that  are  suitable  for  solving, 
numerically,  the  problems  on  computers  or  calculators.  Such  solutions  result  in  tables  of 
numbers,  graphical  representation  (figures),  or  both.  Typical  numeric  methods  are  iterative 
in  nature  and,  for  a well-choosen  problem  and  a good  starting  value,  will  frequently 
converge  to  a desired  answer.  The  evolution  from  a given  problem  that  you  observed  in 
an  experimental  lab  or  in  an  industrial  setting  (in  engineering,  physics,  biology,  chemistry, 
economics,  etc.)  to  an  approximation  suitable  for  numerics  to  a final  answer  usually 
requires  the  following  steps. 

1.  Modeling.  We  set  up  a mathematical  model  of  our  problem,  such  as  an  integral,  a 
system  of  equations,  or  a differential  equation. 

2.  Choosing  a numeric  method  and  parameters  (e.g.,  step  size),  perhaps  with  a 
preliminary  error  estimation. 

3.  Programming.  We  use  the  algorithm  to  write  a corresponding  program  in  a CAS, 
such  as  Maple,  Mathematica,  Matlab,  or  Mathcad,  or,  say,  in  Java,  C or  C+  , or 
FORTRAN,  selecting  suitable  routines  from  a software  system  as  needed. 

4.  Doing  the  computation. 

5.  Interpreting  the  results  in  physical  or  other  terms,  also  deciding  to  rerun  if  further 
results  are  needed. 

Steps  1 and  2 are  related.  A slight  change  of  the  model  may  often  admit  of  a more  efficient 
method.  To  choose  methods,  we  must  first  get  to  know  them.  Chapters  19-21  contain  efficient 
algorithms  for  the  most  important  classes  of  problems  occurring  frequently  in  practice. 

In  Step  3 the  program  consists  of  the  given  data  and  a sequence  of  instructions  to  be 
executed  by  the  computer  in  a certain  order  for  producing  the  answer  in  numeric  or  graphic 
form. 

To  create  a good  understanding  of  the  nature  of  numeric  work,  we  continue  in  this 
section  with  some  simple  general  remarks. 

Floating-Point  Form  of  Numbers 

We  know  that  in  decimal  notation,  every  real  number  is  represented  by  a finite  or  an 
infinite  sequence  of  decimal  digits.  Now  most  computers  have  two  ways  of  representing 
numbers,  called  fixed  point  and  floating  point.  In  a fixed-point  system  all  numbers  are 
given  with  a fixed  number  of  decimals  after  the  decimal  point;  for  example,  numbers 
given  with  3 decimals  are  62.358,  0.014,  1.000.  In  a text  we  would  write,  say,  3 decimals 
as  3D.  Fixed-point  representations  are  impractical  in  most  scientific  computations  because 
of  their  limited  range  (explain!)  and  will  not  concern  us. 

In  a floating-point  system  we  write,  for  instance, 

0.6247  • 103,  0.1735  • 10-13,  -0.2000  • 10-1 

or  sometimes  also 

6.247  • 102,  1.735  • 10-14,  -2.000  • 10-2 

We  see  that  in  this  system  the  number  of  significant  digits  is  kept  fixed,  whereas  the  decimal 
point  is  “floating.”  Here,  a significant  digit  of  a number  c is  any  given  digit  of  c,  except 
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possibly  for  zeros  to  the  left  of  the  first  nonzero  digit;  these  zeros  serve  only  to  fix  the 
position  of  the  decimal  point.  (Thus  any  other  zero  is  a significant  digit  of  c.)  For  instance, 

13600,  1.3600,  0.0013600 

all  have  5 significant  digits.  In  a text  we  indicate,  say,  5 significant  digits,  by  5S. 

The  use  of  exponents  permits  us  to  represent  very  large  and  very  small  numbers.  Indeed, 
theoretically  any  nonzero  number  a can  be  written  as 

(1)  a = ±m  • 10n,  0.1  Si  \m\  <1,  n integer. 

On  modem  computers,  which  use  binary  (base  2)  numbers,  m is  limited  to  k binary  digits  (e.g., 
k = 8)  and  n is  limited  (see  below),  giving  representations  (for  finitely  many  numbers  only!) 

(2)  a = ±m  • 2n,  m = O.t/itfe  " ' <7fc,  d\>  0. 

These  numbers  d are  called  k-digit  binary  machine  numbers.  Their  fractional  part  m 
(or  m)  is  called  the  mantissa.  This  is  not  identical  with  “mantissa”  as  used  for  logarithms. 
n is  called  the  exponent  of  a. 

It  is  important  to  realize  that  there  are  only  finitely  many  machine  numbers  and  that 
they  become  less  and  less  “dense”  with  increasing  a.  For  instance,  there  are  as  many 
numbers  between  2 and  4 as  there  are  between  1024  and  2048.  Why? 

The  smallest  positive  machine  number  eps  with  1 + eps  > 1 is  called  the  machine 
accuracy.  It  is  important  to  realize  that  there  are  no  numbers  in  the  intervals  [1,1+  eps], 
[2,  2 + 2 • eps],  • • • , [1024,  1024  + 1024  • eps],  ■ • • . This  means  that,  if  the  mathematical 
answer  to  a computation  would  be  1024  + 1024  • eps/2,  the  computer  result  will  be  either 
1024  or  1024  • eps  so  it  is  impossible  to  achieve  greater  accuracy. 

Underflow  and  Overflow.  The  range  of  exponents  that  a typical  computer  can  handle 
is  very  large.  The  IEEE  (Institute  of  Electrical  and  Electronic  Engineers)  floating-point 
standard  for  single  precision  is  from  2-126  to  2128  (1.175  X 10-38  to  3.403  X 1038)  and 
for  double  precision  it  is  from  2-1022  to  21024  (2.225  X 10-308  to  1.798  X 10308). 

As  a minor  technicality,  to  avoid  storing  a minus  in  the  exponent,  the  ranges  are  shifted 
from  [—126,  128]  by  adding  126  (for  double  precision  1022).  Note  that  shifted  exponents 
of  255  and  1047  are  used  for  some  special  cases  such  as  representing  infinity. 

If,  in  a computation  a number  outside  that  range  occurs,  this  is  called  underflow  when 
the  number  is  smaller  and  overflow  when  it  is  larger.  In  the  case  of  underflow,  the  result 
is  usually  set  to  zero  and  computation  continues.  Overflow  might  cause  the  computer  to 
halt.  Standard  codes  (by  IMSL,  NAG,  etc.)  are  written  to  avoid  overflow.  Error  messages 
on  overflow  may  then  indicate  programming  errors  (incorrect  input  data,  etc.).  From  here 
on,  we  will  be  discussing  the  decimal  results  that  we  obtain  from  our  computations. 

Roundoff 

An  error  is  caused  by  chopping  (=  discarding  all  digits  from  some  decimal  on)  or  rounding. 
This  error  is  called  roundoff  error,  regardless  of  whether  we  chop  or  round.  The  rule  for 
rounding  off  a number  to  k decimals  is  as  follows.  (The  rule  for  rounding  off  to  k significant 
digits  is  the  same,  with  “decimal”  replaced  by  “significant  digit.”) 

Roundoff  Rule.  To  round  a number  x to  k decimals,  and  5 • io-<,£+1)  to  x and  chop  the 
digits  after  the  (k  + 1 )st  digit. 
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EXAMPLE  1 


EXAMPLE  2 


Roundoff  Rule 

Round  the  number  1.23454621  to  (a)  2 decimals,  (b)  3 decimals,  (c)  4 decimals,  (d)  5 decimals,  and  (e)  6 decimals. 

Solution,  (a)  For  2 decimals  we  add  5 ■ 10_ffe  + 1>  = 5 • 10-3  = 0.005  to  the  given  number,  that  is, 
1.2345621  + 0.005  = 1.23  954621.  Then  we  chop  off  the  digits  “954621”  after  the  space  or  equivalently 
1.23954621  - 0.00954621  = 1.23. 

(b)  1.23454621  + 0.0005  = 1.235  04621,  so  that  for  3 decimals  we  get  1.234. 

(c)  1.23459621  after  chopping  give  us  1.2345  (4  decimals). 

(d)  1.23455121  yields  1.23455  (5  decimals). 

(e)  1.23454671  yields  1.234546  (6  decimals). 

Can  you  round  the  number  to  7 decimals? 

Chopping  is  not  recommended  because  the  corresponding  error  can  be  larger  than  that 
in  rounding.  (Nevertheless,  some  computers  use  it  because  it  is  simpler  and  faster.  On  the 
other  hand,  some  computers  and  calculators  improve  accuracy  of  results  by  doing 
intermediate  calculations  using  one  or  more  extra  digits,  called  guarding  digits .) 

Error  in  Rounding.  Let  a = fl  (a)  in  (2)  be  the  floating-point  computer  approximation  of 
a in  (1)  obtained  by  rounding,  where  fl  suggests  floating.  Then  the  roundoff  rule  gives  (by 
dropping  exponents)  | m — m\  Si  | • 10_fc.  Since  \m\  §£  0.1,  this  implies  (when  a # 0) 


(3) 


a — a 

m — m 

a 

m 

• 10 


i-fc 


The  right  side  u = \ • 1 01_fe  is  called  the  rounding  unit.  If  we  write  a = a(  1 + S),  we 
have  by  algebra  ( d — a)/ a = 8,  hence  |S|  S m by  (3).  This  shows  that  the  rounding  unit 
u is  an  error  bound  in  rounding. 

Rounding  errors  may  ruin  a computation  completely,  even  a small  computation.  In 
general,  these  errors  become  the  more  dangerous  the  more  arithmetic  operations  (perhaps 
several  millions!)  we  have  to  perform.  It  is  therefore  important  to  analyze  computational 
programs  for  expected  rounding  errors  and  to  find  an  arrangement  of  the  computations 
such  that  the  effect  of  rounding  errors  is  as  small  as  possible. 

As  mentioned,  the  arithmetic  in  a computer  is  not  exact  and  causes  further  errors; 
however,  these  will  not  be  relevant  to  our  discussion. 

Accuracy  in  Tables.  Although  available  software  has  rendered  various  tables  of  function 
values  superfluous,  some  tables  (of  higher  functions,  of  coefficients  of  integration 
formulas,  etc.)  will  still  remain  in  occasional  use.  If  a table  shows  k significant  digits,  it 
is  conventionally  assumed  that  any  value  a in  the  table  deviates  from  the  exact  value  a 
by  at  most  ±g  unit  of  the  Mi  digit. 


Loss  of  Significant  Digits 

This  means  that  a result  of  a calculation  has  fewer  correct  digits  than  the  numbers  from 
which  it  was  obtained.  This  happens  if  we  subtract  two  numbers  of  about  the  same  size, 
for  example,  0.1439  — 0.1426  (“subtractive  cancellation”).  It  may  occur  in  simple 
problems,  but  it  can  be  avoided  in  most  cases  by  simple  changes  of  the  algorithm — if  one 
is  aware  of  it!  Let  us  illustrate  this  with  the  following  basic  problem. 


Quadratic  Equation.  Loss  of  Significant  Digits 

Find  the  roots  of  the  equation 

X2  + 40x  + 2 = 0, 

using  4 significant  digits  (abbreviated  4S)  in  the  computation. 
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Solution.  A formula  for  the  roots  *1,  X2  of  a quadratic  equation  ax2  + bx  + c = 0 is 

(4)  xi  = — {~b  + \/b2  — 4 ac),  x2  = — (—b  — \/b2  — 4 ac). 

2 a 2 a 

Furthermore,  since  *i*2  — c/a,  another  formula  for  those  roots 

c 

(5)  x1  = — , *2  as  in  (4). 

ax  2 

We  see  that  this  avoids  cancellation  in  *i  for  positive  b. 

If  b < 0,  calculate  *i  from  (4)  and  then  x2  = c/{ax{). 

For*2  + 40*  + 2 = 0 we  obtain  from  (4)  * = —20  ± V398  = —20  ± 19.95,  hence  *2  = —20.00  — 19.95, 
involving  no  difficulty,  and  *i  = —20.00  + 19.95  = —0.05,  a poor  value  involving  loss  of  digits  by  subtractive 
cancellation. 

In  contrast,  (5)  gives  *i  = 2.000/(— 39.95)  = —0.05006,  the  absolute  value  of  the  error  being  less  than  one 
unit  of  the  last  digit,  as  a computation  with  more  digits  shows.  The  lOS-value  is  —0.05006265674. 

Errors  of  Numeric  Results 

Final  results  of  computations  of  unknown  quantities  generally  are  approximations;  that 
is,  they  are  not  exact  but  involve  errors.  Such  an  error  may  result  from  a combination 
of  the  following  effects.  Roundoff  errors  result  from  rounding,  as  discussed  above. 
Experimental  errors  are  errors  of  given  data  (probably  arising  from  measurements). 
Truncating  errors  result  from  truncating  (prematurely  breaking  off),  for  instance,  if  we 
replace  a Taylor  series  with  the  sum  of  its  first  few  terms.  These  errors  depend  on  the 
computational  method  used  and  must  be  dealt  with  individually  for  each  method. 
[“Truncating”  is  sometimes  used  as  a term  for  chopping  off  (see  before),  a terminology 
that  is  not  recommended.] 

Formulas  for  Errors.  If  a is  an  approximate  value  of  a quantity  whose  exact  value  is 
a,  we  call  the  difference 

(6)  e = a — a 
the  error  of  a.  Hence 


(6*)  a = a + e.  True  value  = Approximation  + Error. 


For  instance,  if  a = 10.5  is  an  approximation  of  a = 10.2,  its  error  is  e = —0.3.  The 
error  of  an  approximation  a = 1.60  of  a = 1.82  is  e = 0.22. 


In  the  literature  \a  — a\  (“absolute  error”)  or  a — a are  sometimes  also 
used  as  definitions  of  error. 

The  relative  error  er  of  a is  defined  by 


(7) 


e _ a — a _ Error 
a a True  value 


0 a =£  0). 


This  looks  useless  because  a is  unknown.  But  if  | e | is  much  less  than  | a \ , then  we  can 
use  a instead  of  a and  get 

(7') 

a 
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THEOREM  1 


PROOF 


This  still  looks  problematic  because  e is  unknown — if  it  were  known,  we  could  get 
a = 5 + e from  (6)  and  we  would  be  done.  But  what  one  often  can  obtain  in  practice  is 
an  error  bound  for  5,  that  is,  a number  (3  such  that 

|e|  =/3,  hence  jo  — a \ = (3. 


This  tells  us  how  far  away  from  our  computed  a the  unknown  a can  at  most  lie.  Similarly, 
for  the  relative  error,  an  error  bound  is  a number  f3r  such  that 


|er|  g (3r,  hence 


a — a 
a 


^ A- 


Error  Propagation 

This  is  an  important  matter.  It  refers  to  how  errors  at  the  beginning  and  in  later  steps 
(roundoff,  for  example)  propagate  into  the  computation  and  affect  accuracy,  sometimes 
very  drastically.  We  state  here  what  happens  to  error  bounds.  Namely,  bounds  for  the 
error  add  under  addition  and  subtraction,  whereas  bounds  for  the  relative  error  add  under 
multiplication  and  division.  You  do  well  to  keep  this  in  mind. 


Error  Propagation 

(a)  In  addition  and  subtraction,  a bound  for  the  error  of  the  results  is  given  by 
the  sum  of  the  error  bounds  for  the  terms. 

(b)  In  multiplication  and  division,  an  error  bound  for  the  relative  error  of  the 
results  is  given  ( approximately ) by  the  sum  of  the  bounds  for  the  relative  errors 
of  the  given  numbers. 


(a)  We  use  the  notations  x = x + ex,  y = y + ey,  \ex\  § Ac , |ej  g f3y.  Then  for  the 
error  e of  the  difference  we  obtain 

|e|  = \x  — y — (x  — v)| 

= \x  - x - (y  - y)l 

— lex  ~~  ej/l  — kxl  leyl  — Px  A/- 


The  proof  for  the  sum  is  similar  and  is  left  to  the  student. 

(b)  For  the  relative  error  er  of  xy  we  get  from  the  relative  errors  erx  and  e.nj  of  3c,  y 
and  bounds  (3rx,  [3ry 


xy  — xy 

xy  - (x  - ex)(y  - ey) 

^ xy 

xy 

xy 

xy 

exy  + eyX 

< 

+ 

€v 

xy 

X 

y 

l^rxl  l^ryl  — firx  firy 


This  proof  shows  what  “approximately”  means:  we  neglected  exey  as  small  in  absolute 
value  compared  to  |ej  and  ey  \ . The  proof  for  the  quotient  is  similar  but  slightly  more 
tricky  (see  Prob.  13). 
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Basic  Error  Principle 

Every  numeric  method  should  be  accompanied  by  an  error  estimate.  If  such  a formula  is 
lacking,  is  extremely  complicated,  or  is  impractical  because  it  involves  information  (for 
instance,  on  derivatives)  that  is  not  available,  the  following  may  help. 

Error  Estimation  by  Comparison.  Do  a calculation  twice  with  different  accuracy. 
Regard  the  difference  dz  ~ 2i  of  the  results  d\,  dz  as  a ( perhaps  crude)  estimate  of  the 
error  e ] of  the  inferior  result  a \ . Indeed,  a j 4-  £i  = dz  + £2  by  formula  (4*).  This  implies 
diz  — d 1 = e 1 — £2  ~ ^1  because  a 2 is  generally  more  accurate  than  d\,  so  that  | ^2 1 is 
small  compared  to  |ejJ. 

Algorithm.  Stability 

Numeric  methods  can  be  formulated  as  algorithms.  An  algorithm  is  a step-by-step 
procedure  that  states  a numeric  method  in  a form  (a  “pseudocode”)  understandable  to 
humans.  (See  Table  19.1  to  see  what  an  algorithm  looks  like.)  The  algorithm  is  then  used 
to  write  a program  in  a programming  language  that  the  computer  can  understand  so  that 
it  can  execute  the  numeric  method.  Important  algorithms  follow  in  the  next  sections.  For 
routine  tasks  your  CAS  or  some  other  software  system  may  contain  programs  that  you 
can  use  or  include  as  parts  of  larger  programs  of  your  own. 

Stability.  To  be  useful,  an  algorithm  should  be  stable;  that  is,  small  changes  in  the  initial 
data  should  cause  only  small  changes  in  the  final  results.  However,  if  small  changes  in  the 
initial  data  can  produce  large  changes  in  the  final  results,  we  call  the  algorithm  unstable. 

This  “ numeric  instability ,”  which  in  most  cases  can  be  avoided  by  choosing  a better 
algorithm,  must  be  distinguished  from  “ mathematical  instability ” of  a problem,  which  is 
called  “ ill-conditioning ,”  a concept  we  discuss  in  the  next  section. 

Some  algorithms  are  stable  only  for  certain  initial  data,  so  that  one  must  be  careful  in 
such  a case. 


PROBLEM  SET  19.1 


1.  Floating  point.  Write  84.175,  -528.685,0.000924138, 
and  —362005  in  floating-point  form,  rounded  to  5S 
(5  significant  digits). 

2.  Write  -76.437125,  60100,  and  -0.00001  in  floating- 
point form,  rounded  to  4S. 

3.  Small  differences  of  large  numbers  may  be  parti- 
cularly strongly  affected  by  rounding  errors.  Illustrate 
this  by  computing  0.81534/(35  • 724  — 35.596)  as 
given  with  5S,  then  rounding  stepwise  to  4S,  3S,  and  2S, 
where  “stepwise”  means  round  the  rounded  numbers,  not 
the  given  ones. 

4.  Order  of  terms,  in  adding  with  a fixed  number  of 
digits,  will  generally  affect  the  sum.  Give  an  example. 
Find  empirically  a rule  for  the  best  order. 


5.  Rounding  and  adding.  Let  a^,  ■ ■ ■ , an  be  numbers  with 
flj  correctly  rounded  to  Sj  digits.  In  calculating  the  sum 
Oi  + • • • + an,  retaining  S = min  Sj  significant  digits, 
is  it  essential  that  we  first  add  and  then  round  the  result 
or  that  we  first  round  each  number  to  S significant  digits 
and  then  add? 

6.  Nested  form.  Evaluate 

fix)  = x3 4  - 7.1 5x2  + 11.2*  + 2.8 
= ((x  - 7.5)x  + 11.2)x  + 2.8 

at  x = 3.94  using  3S  arithmetic  and  rounding,  in  both 
of  the  given  forms.  The  latter,  called  the  nested  form, 
is  usually  preferable  since  it  minimizes  the  number  of 
operations  and  thus  the  effect  of  rounding. 
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7.  Quadratic  equation.  Solve  x2  - 30x  + 1 = 0 by  (4) 
and  by  (5),  using  6S  in  the  computation.  Compare  and 
comment. 

8.  Solve  x 2 — 40.r  + 2 = 0,  using  4S-computation. 

9.  Do  the  computations  in  Prob.  7 with  4S  and  2S. 

10.  Instability.  For  small  \a\  the  equation  (x  — k)2  = a 
has  nearly  a double  root.  Why  do  these  roots  show 
instability? 

11.  Theorems  on  errors.  Prove  Theorem  1(a)  for  addition. 

12.  Overflow  and  underflow  can  sometimes  be  avoided 
by  simple  changes  in  a formula.  Explain  this  in  terms 
of  V*2  + y2  = xVl  + ( y/x )2  with  x2  £ y2  and  x so 
large  that  x2  would  cause  overflow.  Invent  examples 
of  your  own. 

13.  Division.  Prove  Theorem  1(b)  for  division. 

14.  Loss  of  digits.  Square  root.  Compute  Vr+4  — 2 
with  6S  arithmetic  for  x = 0.001  (a)  as  given  and 
(b)  from  x2/(Vx2  + 4 + 2)  (derive!). 

15.  Logarithm.  Compute  In  a — In  b with  6S  arithmetic 
for  a = 4.00000  and  b = 3.99900  (a)  as  given  and 
(b)  from  In  (a/b). 

16.  Cosine.  Compute  1 — cos  x with  6S  arithmetic  for 
x = 0.02  (a)  as  given  and  (b)  by  2 sin2  |x  (derive!). 

17.  Discuss  the  numeric  use  of  (12)  in  App.  A3.1  for 
cos  v — cos  u when  u ~ v. 

18.  Quotient  near  0/0.  (a)  Compute  (1  — cos  x)/sin  x with 
6S  arithmetic  for  x = 0.005.  (b)  Looking  at  Prob.  16, 
find  a much  better  formula. 

19.  Exponential  function.  Calculate  1/e  = 0.367879  (6S) 
from  the  partial  sums  of  5-10  terms  of  the  Maclaurin 
series  (a)  of  e~x  with  x = 1,  (b)  of  ex  with  x = I and 
then  taking  the  reciprocal.  Which  is  more  accurate? 

20.  Compute  e-10  with  6S  arithmetic  in  two  ways  (as  in 
Prob.  19). 

21.  Binary  conversion.  Show  that 

23  = 20  • 101  + 3 • 10°  = 16  + 4 + 2 + 1 


= 24  + 22 

+ 21  + 2°  = (1  0 1 

be  obtained  by  the  division  algorithm 

2 [23 

Remainder  1 = c0 

2LJT 

1 = Cl 

2|JL 

1 = C2 

2|_2_ 

0 = c3 

0 

1 = c4 

22.  Convert  (0.59375)io  to  (0. 1001 1)2  by  successive 
multiplication  by  2 and  dropping  (removing)  the  integer 
parts,  which  give  the  binary  digits  c\,  c%,  ■ ■ ■ : 

0 .59375  • 2 
Cl  = CO  .1875  • 2 
c2  = ! -375  • 2 
c3  = H .75  • 2 
c4  = |U  .5  • 2 

eg  = m .0 

23.  Show  that  0. 1 is  not  a binary  machine  number. 

24.  Prove  that  any  binary  machine  number  has  a finite 
decimal  representation.  Is  the  converse  true? 

25.  CAS  EXPERIMENT.  Approximations.  Obtain 

x = 0.1  = — 2_4m  from  Prob.  23.  Which  machine 

2 i 

m=  1 

number  (partial  sum)  Sn  will  first  have  the  value  0.1 
to  30  decimal  digits? 

26.  CAS  EXPERIMENT.  Integration  from  Calculus. 

Integrating  by  parts,  show  that  /„  = J*  exxn  dx  = 
e — nln_i,  I0  = e — 1.  (a)  Compute  In,  n = 0,  ■ ■ ■ , 
using  4S  arithmetic,  obtaining  I8  = —3.906.  Why  is 
this  nonsense?  Why  is  the  error  so  large? 

(b)  Experiment  in  (a)  with  the  number  of  digits  k > 4. 
As  you  increase  k,  will  the  first  negative  value  n = N 
occur  earlier  or  later?  Find  an  empirical  formula  for 
N = N(k). 

27.  Backward  Recursion.  In  Prob.  26.  Using  ex  < e 
(0  < x < 1),  conclude  that  |/„|  S e/(n  + 1)  —*  0 as 
n— >oo.  Solve  the  iteration  formula  for  In~\  = 
(e  — In)/n,  start  from  /15  » 0 and  compute  4S  values 
°f  7i4>  7i3>  ■■  ,h- 

28.  Harmonic  series.  1 + | + | + ■ ■ ■ diverges.  Is  the 
same  true  for  the  corresponding  series  of  computer 
numbers? 

29.  Approximations  of  7 r = 3.14159265358979  • • • are 

22/7  and  355/113.  Determine  the  corresponding  errors 
and  relative  errors  to  3 significant  digits. 

30.  Compute  tt  by  Machin’s  approximation  16  arctan 
(g)  — 4 arctan  (ggg)  to  10S  (which  are  correct).  [In 
1986,  D.  H.  Bailey  (NASA  Ames  Research  Center, 
Moffett  Field,  CA  94035)  computed  almost  30  million 
decimals  of  7T  on  a CRAY-2  in  less  than  30  hrs.  The 
race  for  more  and  more  decimals  is  continuing.  See  the 
Internet  under  pi.] 
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Solution  of  Equations  by  Iteration 

For  each  of  the  remaining  sections  of  this  chapter,  we  select  basic  kinds  of  problems  and 
discuss  numeric  methods  on  how  to  solve  them.  The  reader  will  learn  about  a variety  of 
important  problems  and  become  familiar  with  ways  of  thinking  in  numerical  analysis. 
Perhaps  the  easiest  conceptual  problem  is  to  find  solutions  of  a single  equation 

(1)  m = o, 

where /is  a given  function.  A solution  of  (1)  is  a number  x = s such  that  /(.v)  = 0.  Here, 
s suggests  “solution,”  but  we  shall  also  use  other  letters. 

It  is  interesting  to  note  that  the  task  of  solving  (1)  is  a question  made  for  numeric 
algorithms,  as  in  general  there  are  no  direct  formulas,  except  in  a few  simple  cases. 

Examples  of  single  equations  are  x + x = 1 , sin  x = 0.5x,  tan  x = x,  cosh  x = sec  x, 
cosh  x cos  x = — 1,  which  can  all  be  written  in  the  form  of  (1).  The  first  of  the  five  equations 
is  an  algebraic  equation  because  the  corresponding  / is  a polynomial.  In  this  case  the 
solutions  are  called  roots  of  the  equation  and  the  solution  process  is  called  finding  roots.  The 
other  equations  are  transcendental  equations  because  they  involve  transcendental  functions. 

There  are  a very  large  number  of  applications  in  engineering,  where  we  have  to  solve  a 
single  equation  (1).  You  have  seen  such  applications  when  solving  characteristic  equations 
in  Chaps.  2,  4,  and  8;  partial  fractions  in  Chap.  6;  residue  integration  in  Chap.  16,  finding 
eigenvalues  in  Chap.  12,  and  finding  zeros  of  Bessel  functions,  also  in  Chap.  12.  Moreover, 
methods  of  finding  roots  are  very  important  in  areas  outside  of  classical  engineering.  For 
example,  in  finance,  the  problem  of  determining  how  much  a bond  is  worth  amounts  to 
solving  an  algebraic  equation. 

To  solve  (1)  when  there  is  no  formula  for  the  exact  solution  available,  we  can  use  an 
approximation  method,  such  as  an  iteration  method.  This  is  a method  in  which  we  start  from 
an  initial  guess  Xq  (which  may  be  poor)  and  compute  step  by  step  (in  general  better  and  better) 
approximations  xi,  x2,  ■ ■ ■ of  an  unknown  solution  of  (1).  We  discuss  three  such  methods  that 
are  of  particular  practical  importance  and  mention  two  others  in  the  problem  set. 

It  is  very  important  that  the  reader  understand  these  methods  and  their  underlying  ideas. 
The  reader  will  then  be  able  to  select  judiciously  the  appropriate  software  from  among 
different  software  packages  that  employ  variations  of  such  methods  and  not  just  treat  the 
software  programs  as  “black  boxes.” 

In  general,  iteration  methods  are  easy  to  program  because  the  computational  operations 
are  the  same  in  each  step — just  the  data  change  from  step  to  step — and,  more  importantly, 
if  in  a concrete  case  a method  converges,  it  is  stable  in  general  (see  Sec.  19.1). 

Fixed-Point  Iteration  for  Solving  Equations  f(x)  = 0 

Note:  Our  present  use  of  the  word  “fixed  point”  has  absolutely  nothing  to  do  with  that  in 
the  last  section. 

By  some  algebraic  steps  we  transform  (1)  into  the  form 

(2)  x = g(x). 

Then  we  choose  an  x0  and  compute  Xj  = g(x0),  x2  = gCn),  and  in  general 


(3) 


xn+l  8(xn) 


(n  = 0,  !,-■•). 
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A solution  of  (2)  is  called  a fixed  point  of  g,  motivating  the  name  of  the  method.  This  is  a 
solution  of  (1),  since  from  x = g(x)  we  can  return  to  the  original  form  f(x)  = 0.  From  (1) 
we  may  get  several  different  forms  of  (2).  The  behavior  of  corresponding  iterative  sequences 
x0,  Xi,  ■ ■ ■ may  differ,  in  particular,  with  respect  to  their  speed  of  convergence.  Indeed,  some 
of  them  may  not  converge  at  all.  Let  us  illustrate  these  facts  with  a simple  example. 


An  Iteration  Process  (Fixed-Point  Iteration) 

Set  up  an  iteration  process  for  the  equation  f(x)  = x2  — 3x  + 1 = 0.  Since  we  know  the  solutions 
x = 1.5  ± VT25,  thus  2.618034  and  0.381966, 

we  can  watch  the  behavior  of  the  error  as  the  iteration  proceeds. 

Solution.  The  equation  may  be  written 

(4a)  x = gi(x)  = 3(x2  + 1),  thus  xn+i  = g(x2  + 1). 

If  we  choose  xq  = 1,  we  obtain  the  sequence  (Fig.  426a;  computed  with  6S  and  then  rounded) 
x0  = 1.000,  x i = 0.667,  x2  = 0.481,  x3  = 0.411,  x4  = 0.390,  • • • 


which  seems  to  approach  the  smaller  solution.  If  we  choose  xq  = 2,  the  situation  is  similar.  If  we  choose 
xq  — 3,  we  obtain  the  sequence  (Fig.  426a,  upper  part) 


x0  - 3.000,  xi  = 3.333,  x2  = 4.037,  x3  = 5.766,  x4  = 11.415,  ••  • 
which  diverges. 

Our  equation  may  also  be  written  (divide  by  x) 


(4b) 


1 

x = g2(x)  = 3--,  thus 


1 

xn+l  ~ ^ — 77~  , 
xn 


and  if  we  choose  x0  = 1,  we  obtain  the  sequence  (Fig.  426b) 

x0  = 1.000,  x i = 2.000,  x2  = 2.500,  x3  = 2.600,  x4  = 2.615,  ■ ■ ■ 

which  seems  to  approach  the  larger  solution.  Similarly,  if  we  choose  Xo  = 3,  we  obtain  the  sequence 
(Fig.  426b) 

x0  = 3.000,  x ! = 2.667,  x2  = 2.625,  x3  = 2.619,  x4  = 2.618, 


Fig.  426.  Example  1,  iterations  (4a)  and  (4b) 
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Our  figures  show  the  following.  In  the  lower  part  of  Fig.  426a  the  slope  of  gi(jc)  is  less  than  the  slope  of  y = x, 
which  is  1.  thus  |g{(jt)|  < 1,  and  we  seem  to  have  convergence.  In  the  upper  part,  gi(-r)  is  steeper  (gi(x)  > 1) 
and  we  have  divergence.  In  Fig.  426b  the  slope  of  g2(x)  is  less  near  the  intersection  point  (x  = 2.618,  fixed 
point  of  g2,  solution  of  fix)  = 0),  and  both  sequences  seem  to  converge.  From  all  this  we  conclude  that 
convergence  seems  to  depend  on  the  fact  that,  in  a neighborhood  of  a solution,  the  curve  of  g(x)  is  less  steep 
than  the  straight  line  y — x,  and  we  shall  now  see  that  this  condition  |g  (jc)|  < 1 (=  slope  ofy  = x)  is  sufficient 
for  convergence. 


An  iteration  process  defined  by  (3)  is  called  convergent  for  an  Xq  if  the  corresponding 
sequence  x0,  Xy,  • ■ • is  convergent. 

A sufficient  condition  for  convergence  is  given  in  the  following  theorem,  which  has 
various  practical  applications. 


Convergence  of  Fixed-Point  Iteration 

Let  x = s be  a solution  of  x = g(x)  and  suppose  that  g has  a continuous  derivative 
in  some  inter\>al  J containing  s.  Then,  if  Si  K < 1 in  J,  the  iteration  process 

defined  by  (3)  converges  for  any  xq  in  J.  The  limit  of  the  sequence  { xn } is  s. 


By  the  mean  value  theorem  of  differential  calculus  there  is  a t between  x and  s such  that 

g(x)  - g(s)  = g'(t)(x  - s)  (x  in  J). 

Since  g(s)  = s and  X\  = g(xo),  x2  = g(xi),  • ■ ■ , we  obtain  from  this  and  the  condition  on 
Ig'Wl  in  the  theorem 

\xn  ~ s\  = I gix-n-f)  ~ g0)|  = |g'(f)IUn-l  - s\  = K\xn- 1 - i|. 

Applying  this  inequality  n times,  for  n,n  — 1,  • • • , 1 gives 

\xn  - s\  ^ K\xn_1  - s\  ^ K2 \xn_2  “ s|  = • ’ • = Kn\x0  - s\. 

Since  K < 1,  we  have  fif”  — » 0;  hence  \xn  — v|^0asn^o°. 

We  mention  that  a function  g satisfying  the  condition  in  Theorem  1 is  called  a contraction 
because  |g(x)  — g(iOl  = K\x  — u|,  where  K < 1.  Furthermore,  K gives  information  on 
the  speed  of  convergence.  For  instance,  if  K = 0.5,  then  the  accuracy  increases  by  at  least 
2 digits  in  only  7 steps  because  0.57  < 0.01. 

An  Iteration  Process.  Illustration  of  Theorem  1 

Find  a solution  of  f(x)  = x3 * * *  + x — 1=0  by  iteration. 

Solution.  A sketch  shows  that  a solution  lies  near  x = 1.  (a)  We  may  write  the  equation  as  (xz  + I )x  = 1 or 

1 1 . , . 2|jc| 

x = giM  = • so  that  xn+1  = . Also  IgiMI  = — < 1 

1 + x2  1 + Xn  (1  + x2)2 

2 / 2 4 2 / 2 

for  any  x because  4x  /(I  + x ) = Ax  /(\  + 4x  that  by  Theorem  1 we  have  convergence  for 

any  .vq.  Choosing  xq  = 1,  we  obtain  (Fig.  427) 

xx  = 0.500,  X2  = 0.800,  x3  = 0.610,  x4  = 0.729,  *5  = 0.653,  x6  = 0.701, 


The  solution  exact  to  6D  is  s = 0.682328. 
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(b)  The  given  equation  may  also  be  written 

* = &(*)=  1 “ *3-  Then  \gk*)\  = 3*  2 

and  this  is  greater  than  1 near  the  solution,  so  that  we  cannot  apply  Theorem  1 and  assert  convergence.  Try 
xq  = 1,  jcq  = 0.5,  Jto  = 2 and  see  what  happens. 

The  example  shows  that  the  transformation  of  a given  f(x ) = 0 into  the  form  x = g(x)  with  g satisfying 
^ K < 1 may  need  some  experimentation. 


Fig.  427.  Iteration  in  Example  2 


Newton’s  Method  for  Solving  Equations  f(x)  = 0 

Newton’s  method,  also  known  as  Newton-Raphson’s  method,1  is  another  iteration 
method  for  solving  equations /(x)  = 0,  where /is  assumed  to  have  a continuous  derivative// 
The  method  is  commonly  used  because  of  its  simplicity  and  great  speed. 

The  underlying  idea  is  that  we  approximate  the  graph  of /by  suitable  tangents.  Using 
an  approximate  value  x0  obtained  from  the  graph  of / we  let  x\  be  the  point  of  intersection 
of  the  x-axis  and  the  tangent  to  the  curve  of  /at  x0  (see  Fig.  428).  Then 

, fix  o)  f{x0) 

tan  /3  = / (x0)  = , hence  *i  = x0  - . 

x0-xi  f(x0) 

In  the  second  step  we  compute  x2  = xi  — f(xi)/f'(xi),  in  the  third  step  x3  from  x2  again 
by  the  same  formula,  and  so  on.  We  thus  have  the  algorithm  shown  in  Table  19.1.  Formula 
(5)  in  this  algorithm  can  also  be  obtained  if  we  algebraically  solve  Taylor’s  formula 

(5*)  fixji+ 1)  f(xn)  + (xn+i  x^f  (xn)  0. 


1JOSEPH  RAPHSON  (1648-1715),  English  mathematician  who  published  a method  similar  to  Newton’s 

method.  For  historical  details,  see  Ref.  [GenRef2],  p.  203,  listed  in  App.  1. 
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19.1  Newton’s  Method  for  Solving  Equations  f(x)  — 0 


ALGORITHM  NEWTON  (/,/',  X0,  6,  N) 


This  algorithm  computes  a solution  of  f(x ) = 0 given  an  initial  approximation  x0  (starting 
value  of  the  iteration).  Here  the  function  f(x)  is  continuous  and  has  a continuous 
derivative  f'(x). 

INPUT:  f,f',  initial  approximation  x0,  tolerance  e > 0,  maximum  number  of 

iterations  N. 


OUTPUT:  Approximate  solution  xn  {n  = AO  or  message  of  failure. 


1 

2 

3 


For  n = 0,  1,  2,  • • • , IV  — 1 do: 

Compute  f'(xj. 

If  f'(xn ) = 0 then  OUTPUT  “Failure.”  Stop. 

[ Procedure  completed  unsuccessfully] 
Else  compute 


(5) 


/On) 

xn+ 1 — xn  ,.r,  7 • 

f On) 


4 


If  \xn+ 1 — xn\  = eUn+il  then  OUTPUT  xn+1.  Stop. 
[Procedure  completed  successfully] 


End 


5 OUTPUT  “Failure”.  Stop. 

[Procedure  completed  unsuccessfully  after  N iterations] 


End  NEWTON 


If  it  happens  that  f’(x.n)  = 0 for  some  n (see  line  2 of  the  algorithm),  then  try  another 
starting  value  x0.  Line  3 is  the  heart  of  Newton’s  method. 

The  inequality  in  line  4 is  a termination  criterion.  If  the  sequence  of  the  xn  converges 
and  the  criterion  holds,  we  have  reached  the  desired  accuracy  and  stop.  Note  that  this  is  just 
a form  of  the  relative  error  test.  It  ensures  that  the  result  has  the  desired  number  of  significant 
digits.  If  Un+i|  = 0,  the  condition  is  satisfied  if  and  only  if  xn+i  = xrl  = 0,  otherwise 
\xn+\  — xn\  must  be  sufficiently  small.  The  factor  \xn+\  is  needed  in  the  case  of  zeros 
of  very  small  (or  very  large)  absolute  value  because  of  the  high  density  (or  of  the  scarcity) 
of  machine  numbers  for  those  x. 

WARNING!  The  criterion  by  itself  does  not  imply  convergence.  Example.  The 
harmonic  series  diverges,  although  its  partial  sums  xn  = XjJ=1  1 Ik  satisfy  the  criterion 
because  lim  On+i  — xn)  = lim  (l/(n  + 1))  = 0. 
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EXAMPLE  3 


EXAMPLE  4 


EXAMPLE  5 


Line  5 gives  another  termination  criterion  and  is  needed  because  Newton’s  method  may 
diverge  or,  due  to  a poor  choice  of  xo,  may  not  reach  the  desired  accuracy  by  a reasonable 
number  of  iterations.  Then  we  may  try  another  xo-  If/(x)  = 0 has  more  than  one  solution, 
different  choices  of  xo  may  produce  different  solutions.  Also,  an  iterative  sequence  may 
sometimes  converge  to  a solution  different  from  the  expected  one. 


Square  Root 

Set  up  a Newton  iteration  for  computing  the  square  root  x of  a given  positive  number  c and  apply  it  to  c = 2. 
Solution.  We  have  x = Vc,  hence  /(x)  = x2  — c = 0 ,f'(x)  = 2x,  and  (5)  takes  the  form 


xn+l  xn 


2x 


'i  I xn  .. 


For  c = 2,  choosing  x©  = 1,  we  obtain 

= 1.500000,  *2  = 1.416667,  x3  = 1.414216,  x4  = 1.414214,  • • • . 

x4  is  exact  to  6D. 


Iteration  for  a Transcendental  Equation 

Find  the  positive  solution  of  2 sin  x = x. 

Solution.  Setting /(x)  = x — 2 sin  x,  we  have  f\x)  =1—2  cosx,  and  (5)  gives 

xn  — 2 sin  xn  2(sin  xn  — xn  cos  xn ) Nn 


xn+ 1 xn  i 

— 2 cos  xn 

1—2  cos  x 

n 

Nn 

D„ 

%n+ 1 

0 

2.00000 

3.48318 

1.83229 

1.90100 

1 

1.90100 

3.12470 

1.64847 

1.89552 

2 

1.89552 

3.10500 

1.63809 

1.89550 

3 

1.89550 

3.10493 

1.63806 

1.89549 

From  the  graph  of /we  conclude  that  the  solution  is  near  xq  = 2.  We  compute: 
x4  = 1.89549  is  exact  to  5D  since  the  solution  to  6D  is  1.895494. 


Newton’s  Method  Applied  to  an  Algebraic  Equation 

Apply  Newton’s  method  to  the  equation /(x)  = x3  + x — 1 = 0. 
Solution.  From  (5)  we  have 


xn  "F  xn  1 

xn+l  xn  0 

3 xl  + 1 


2x\  + 1 
3*n  + 1 


Starting  from  xq  — 1,  we  obtain 

xi  = 0.750000,  x2  = 0.686047,  x3  = 0.682340,  x4  = 0.682328,  • • • 

where  x4  has  the  error  —1  • 10-6.  A comparison  with  Example  2 shows  that  the  present  convergence  is  much 
more  rapid.  This  may  motivate  the  concept  of  the  order  of  an  iteration  process,  to  be  discussed  next. 
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Order  of  an  Iteration  Method. 

Speed  of  Convergence 

The  quality  of  an  iteration  method  may  be  characterized  by  the  speed  of  convergence,  as 
follows. 

Let  xn+i  = g(xn)  define  an  iteration  method,  and  let  xn  approximate  a solution  .v  of 
x = g(x).  Then  xn  = s — en,  where  en  is  the  error  of  xn.  Suppose  that  g is  differentiable 
a number  of  times,  so  that  the  Taylor  formula  gives 

Xn+1  = S(xn)  = g(s)  + g\s) (Xn  ~ s)  + \g"is)ixn  ~ sf  + ■■■ 

= g(s)  - g (s)en  + ^g"(s)£n  + ' ' • . 

The  exponent  of  en  in  the  first  nonvanishing  term  after  g(s)  is  called  the  order  of  the 
iteration  process  defined  by  g.  The  order  measures  the  speed  of  convergence. 

To  see  this,  subtract  g(s)  = s on  both  sides  of  (6).  Then  on  the  left  you  get  xn+i  — s = 
— en+i,  where  en+  \ is  the  error  of  xn+\.  And  on  the  right  the  remaining  expression  equals 
approximately  its  first  nonzero  term  because  \en\  is  small  in  the  case  of  convergence. 
Thus 

(a)  en+i  ~ +g,(i)en  in  the  case  of  first  order, 

C7)  i „ 2 

(b)  en+i  ~ — g g (s)en  in  the  case  of  second  order,  etc. 

Thus  if  en  = 1 0 ~ ?l  in  some  step,  then  for  second  order,  en+1  = c • (10-k)2  = c • 10~2fc, 
so  that  the  number  of  significant  digits  is  about  doubled  in  each  step. 


Convergence  of  Newton’s  Method 

In  Newton’s  method,  g(x)  = x — f(x)/f'(x).  By  differentiation. 


(8) 


f\xf  - f(x)f"(x) 
fix? 

f(x)f(x) 

f'ixf 


Since/(.s)  = 0.  this  shows  that  also  g’(s)  = 0.  Hence  Newton’s  method  is  at  least  of  second 
order.  If  we  differentiate  again  and  set  x = s,  we  find  that 


(8*) 


g\s) 


f(s) 

As) 


which  will  not  be  zero  in  general.  This  proves 


THEOREM  2 


Second-Order  Convergence  of  Newton’s  Method 

If  fix)  is  three  times  differentiable  and  f and  f"  are  not  zero  at  a solution  s of 
fix)  = 0,  then  for  Xq  sufficiently  close  to  s,  Newton’s  method  is  of  second  order. 


SEC.  19.2  Solution  of  Equations  by  Iteration 


805 


EXAMPLE  6 


EXAMPLE 


Comments.  For  Newton’s  method,  (7b)  becomes,  by  (8*), 


(9) 


em+ 1 


f"(s) 

2/(5) 


e 


2 

n- 


For  the  rapid  convergence  of  the  method  indicated  in  Theorem  2 it  is  important  that  s be 
a simple  zero  of/(x)  (thus/ (s)  =£  0)  and  that  x0  be  close  to  s,  because  in  Taylor’s  formula 
we  took  only  the  linear  term  [see  (5*)],  assuming  the  quadratic  term  to  be  negligibly  small. 
(With  a bad  xq  the  method  may  even  diverge!) 


Prior  Error  Estimate  of  the  Number  of  Newton  Iteration  Steps 

Use  xq  = 2 and  x±  = 1.901  in  Example  4 for  estimating  how  many  iteration  steps  we  need  to  produce  the 
solution  to  5D-accuracy.  This  is  an  a priori  estimate  or  prior  estimate  because  we  can  compute  it  after  only 
one  iteration,  prior  to  further  iterations. 

Solution.  We  ha ve/(x)  = x — 2 sinx  = 0.  Differentiation  gives 

f”(s)  f\x  i)  2 sin  x i 

« = « 0.57. 

2f\s ) 2/'(*i)  2(1  -2  cos*!) 


Hence  (9)  gives 


ki+1|  - 0.51  e2n  ~ 0.57(0.574_!)2  = 0.5734-i  0.57M4+1  £ 5 • 


10“6 


where  M = 2n  + 2n  1+  + 2 + 1=  2n+1  — 1.  We  show  below  that  e0  = —0.11.  Consequently,  our 

condition  becomes 


0.57m0.11m+1  £ 5 • 10“6. 


Hence  n = 2 is  the  smallest  possible  n,  according  to  this  crude  estimate,  in  good  agreement  with  Example  4. 

€q  ~ —0.11  is  obtained  from  €i  — €q  — (ei  — s)  — (e©  — s)  = —x±  + xo  ~ 0.10,  hence  e\  = e©  + 0.10  ~ 
— 0.57eo  or  0.57eo  + e©  + 0.10  « 0,  which  gives  6q  ~ —0.11. 


Difficulties  in  Newton’s  Method.  Difficulties  may  arise  if  \f'(x)\  is  very  small  near  a 
solution  s of  f(x)  = 0.  For  instance,  let  s be  a zero  of/(x)  of  second  or  higher  order.  Then 
Newton’s  method  converges  only  linearly,  as  is  shown  by  an  application  of  l’Hopital’s  rule 
to  (8).  Geometrically,  small  | / (x)  means  that  the  tangent  of  f(x)  near  s almost  coincides 
with  the  x-axis  (so  that  double  precision  may  be  needed  to  get  /(x)  and  / (x)  accurately 
enough).  Then  for  values  x = 5 far  away  from  s we  can  still  have  small  function  values 

R(l)  =/(?). 

In  this  case  we  call  the  equation  /(x)  = 0 ill-conditioned.  R (?)  is  called  the  residual  of 
/(x)  = 0 at  ?.  Thus  a small  residual  guarantees  a small  error  of  ? only  if  the  equation  is 
not  ill-conditioned. 

An  Ill-Conditioned  Equation 

fix ) = x5  + 10_4x'  = 0 is  ill-conditioned,  x = 0 is  a solution. /’(0)  = 10-4  is  small.  At  = 0. 1 the  residual 
f(0.1)  = 2 ■ I O'  is  small,  but  the  error  —0.1  is  larger  in  absolute  value  by  a factor  5000.  Invent  a more  drastic 
example  of  your  own. 

Secant  Method  for  Solving  f(x)  = 0 

Newton’s  method  is  very  powerful  but  has  the  disadvantage  that  the  derivative  f may 
sometimes  be  a far  more  difficult  expression  than  / itself  and  its  evaluation  therefore 
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EXAMPLE  8 


computationally  expensive.  This  situation  suggests  the  idea  of  replacing  the  derivative 
with  the  difference  quotient 


f'(xn) 


f(xn)  ~ f(Xn- 1) 


Xjl  Xn—  1 

Then  instead  of  (5)  we  have  the  formula  of  the  popular  secant  method 


(10) 


X71+I  Xn  f(.Xn) 


Xn  Xn— 1 

f(xn ) - fix n- 1) 


Geometrically,  we  intersect  the  x-axis  at  xn+1  with  the  secant  of  f(x)  passing  through 
Pn-i  and  Pn  in  Fig.  429.  We  need  two  starting  values  x0  and  x-y.  Evaluation  of  derivatives 
is  now  avoided.  It  can  be  shown  that  convergence  is  superlinear  (that  is,  more  rapid  than 
linear,  |en+1|  ~ const  • \en\  ’'62;  see  [E5]  in  App.  1),  almost  quadratic  like  Newton’s 
method.  The  algorithm  is  similar  to  that  of  Newton’s  method,  as  the  student  may  show. 

CAUTION!  It  is  not  good  to  write  (10)  as 


Xn+ 1 


Xn—lf(Xn ) Xnfixn—f) 

f(xn)  ~ fix n- 1) 


because  this  may  lead  to  loss  of  significant  digits  if  xn  and  x„_i  are  about  equal.  (Can 
you  see  this  from  the  formula?) 

Secant  Method 

Find  the  positive  solution  of f{x)  = x — 2 sin*  = 0 by  the  secant  method,  starting  from  jcq  — 2,  x\  = 1.9. 
Solution.  Here,  (10)  is 


xn+ 1 xn 


(xn  - 2 sin  xn)(xn  - xn_i) 
xn  - xn_1  + 2(sin  xn_i  - sinxj 


Nn 

Dn  ' 


Numeric  values  are: 


n 

xn—  1 

xn 

Nn 

Dn 

xn+ 1 

i 

2.000000 

1.900000 

-0.000740 

-0.174005 

-0.004253 

2 

1.900000 

1.895747 

-0.000002 

-0.006986 

-0.000252 

3 

1.895747 

1.895494 

0 

0 

jc3  = 1.895494  is  exact  to  6D.  See  Example  4. 
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Summary  of  Methods.  The  methods  for  computing  solutions  s of  f(x)  = 0 with  given 
continuous  (or  differentiable)  fix)  start  with  an  initial  approximation  x0  of  s and  generate 
a sequence  x1,x2i "•by  iteration.  Fixed-point  methods  solve  fix)  = 0 written  as 
x = g(x),  so  that  s is  a.  fixed  point  of  g,  that  is,  s = g(s).  For  g(x)  = x — f{x)/f  lx)  this  is 
Newton’s  method,  which,  for  good  x0  and  simple  zeros,  converges  quadratically  (and  for 
multiple  zeros  linearly).  From  Newton’s  method  the  secant  method  follows  by  replacing 
/ (x)  by  a difference  quotient.  The  bisection  method  and  the  method  of  false  position  in 
Problem  Set  19.2  always  converge,  but  often  slowly. 


PRQBL^EM^SE^Tig^ 


1-13 


FIXED-POINT  ITERATION 


Solve  by  fixed-point  iteration  and  answer  related 
questions  where  indicated.  Show  details. 


1.  Monotone  sequence.  Why  is  the  sequence  in  Example  1 
monotone?  Why  not  in  Example  2? 


2.  Do  the  iterations  (b)  in  Example  2.  Sketch  a figure 
similar  to  Fig.  427.  Explain  what  happens. 


3.  / = x — 0.5  cos  x = 0,  Xo  — 1.  Sketch  a figure. 

4.  / = x — cosec  x the  zero  near  x = 1. 

5.  Sketch  f(x)  = x3  — 5.00x2  + l.Olx  + 1.88,  showing 
roots  near  ±1  and  5.  Write  x = g(x)  = (5.00x2  — 
l.Olx  + 1.88)/x2.  Find  a root  by  starting  from  Xo  = 
5,  4,  1,  —1.  Explain  the  (perhaps  unexpected)  results. 


6.  Find  a form  x = g(x)  of/(x)  = 0 in  Prob.  5 that  yields 
convergence  to  the  root  near  x = 1 . 

7.  Find  the  smallest  positive  solution  of  sin  x = e~x. 

8.  Solve  x4  — x — 0.12  = 0 by  starting  from  xo  = 1. 

9.  Find  the  negative  solution  of  x4  — x — 0.12  = 0. 


10.  Elasticity.  Solve  x cosh  x = 1 . (Similar  equations 
appear  in  vibrations  of  beams;  see  Problem  Set  12.3.) 


11.  Drumhead.  Bessel  functions.  A partial  sum  of  the 
Maclaurin  series  of  Jq(x)  (Sec.  5.5)  is/(x)  = 1 — |x2  + 
gjx4  — 2S54X6-  Conclude  from  a sketch  that/(x)  = 0 
nearx  = 2.  Write /(x)  = Oasx  = g(x)  (by  dividing  fix) 
by  \x  and  taking  the  resulting  x-term  to  the  other  side). 
Find  the  zero.  (See  Sec.  12.10  for  the  importance  of  these 
zeros.) 

12.  CAS  EXPERIMENT.  Convergence.  Let/(x)  = x3  + 
2x2  — 3x  — 4 = 0.  Write  this  as  x = g(x),  for  g choos- 
ing (1)  (x3  -/)1/3,  (2)  (x2  - If)1'2,  (3)  x + \f, 
(4)  (x3  -/)/x2,  (5)  (2x2  -/)/( 2x),  and  (6)  x -///' 
and  in  each  case  Xo  = 1.5.  Find  out  about  convergence 
and  divergence  and  the  number  of  steps  to  reach  68- 
values  of  a root. 


13.  Existence  of  fixed  point.  Prove  that  if  g is  continuous 
in  a closed  interval  / and  its  range  lies  in  /,  then  the 
equation  x = g(x)  has  at  least  one  solution  in  I.  Illustrate 
that  it  may  have  more  than  one  solution  in  I. 


14-23 


NEWTON’S  METHOD 


Apply  Newton’s  method  (6S-accuracy).  First  sketch  the 
function(s)  to  see  what  is  going  on. 

14.  Cube  root.  Design  a Newton  iteration.  Compute 

^7,  x0  = 2. 

15.  /=  2x  — cosx,  xQ  = 1.  Compare  with  Prob.  3. 

16.  What  happens  in  Prob.  15  for  any  other  x0? 

17.  Dependence  on  x0.  Solve  Prob.  5 by  Newton’s  method 
with  x0  = 5,  4,  1,  —3.  Explain  the  result. 


18.  Legendre  polynomials.  Find  the  largest  root  of 
the  Legendre  polynomial  P^ix)  given  by  /six)  = 
g (63x5  — 70x3  + 15x)  (Sec.  5.3)  (to  be  needed  in 
Gauss  integration  in  Sec.  19.5)  (a)  by  Newton’s 
method,  (b)  from  a quadratic  equation. 


19.  Associated  Legendre  functions.  Find  the  smallest  posi- 
tive zero  of  P2  = (1  — x2)P'l  = ^ (— 7x4  + 8x2  — 1) 
(Sec.  5.3)  (a)  by  Newton's  method,  (b)  exactly,  by 
solving  a quadratic  equation. 


20.  x + lnx  = 2,  x0  = 2 

21.  / = x3  - 5x  + 3 = 0,  x0  = 2,  0,  -2 

22.  Heating,  cooling.  At  what  time  x (4S-accuracy  only)  will 
the  processes  governed  by ffix)  = 100(1  — e~°’2x)  and 
/2(x)  = 40e~oolx  reach  the  same  temperature?  Also 
find  the  latter. 


23.  Vibrating  beam.  Find  the  solution  of  cos  x cosh  x = 1 
near  x = §7 r.  (This  determines  a frequency  of  a 
vibrating  beam;  see  Problem  Set  12.3.) 

24.  Method  of  False  Position  (Regula  falsi).  Figure  430 
shows  the  idea.  We  assume  that  / is  continuous.  We 
compute  the  x-intercept  cp  of  the  line  through 
(a0,f(a0)),  (b0,f(b0)).  If  /(c0)  = 0,  we  are  done.  If 
/(<%)/(co)  < 0 (as  in  Fig.  430),  we  set  <q  = a0,  Zq  = c0 
and  repeat  to  get  ci,  etc.  If  f(a0)f(c0 ) > 0,  then 
/(c0)/(Z>o)  < 0 and  we  set  a1  = c0,  b i = b0,  etc. 

(a)  Algorithm.  Show  that 

_ a0f{b0 ) - b0f(a0) 

°°  f(b0 ) ~f(a0) 

and  write  an  algorithm  for  the  method. 
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(b)  Solve*4  = 2,  cos  x = Vx,  and  x + In  x = 2,  with 
a = l,b  = 2. 

25.  TEAM  PROJECT.  Bisection  Method.  This  simple  but 
slowly  convergent  method  for  finding  a solution  of 
/(x)  = 0 with  continuous /is  based  on  the  intermediate 
value  theorem,  which  states  that  if  a continuous  function 
/has  opposite  signs  at  some  x = a and  x = b (>  a),  that 
is,  either /(a)  < 0 ,f{b)  > 0 or  f(a)  > 0 ,f(b)  < 0,  then/ 


must  be  0 somewhere  on  [a,  b\.  The  solution  is  found 
by  repeated  bisection  of  the  interval  and  in  each  iteration 
picking  that  half  which  also  satisfies  that  sign  condition. 

(a)  Algorithm.  Write  an  algorithm  for  the  method. 

(b)  Comparison.  Solve*  = cos  x by  Newton’s  method 
and  by  bisection.  Compare. 

Solve  e~x  = In  x and  ex  + x4  + x = 2 by  bisection. 

SECANT  METHOD 

Solve,  using  x0  and  as  indicated: 

26.  e~x  — tanx  = 0,  x0  = 1,  *i  = 0.7 

27.  Prob.  21,  x0  = 1.0,  *j  = 2.0 

28.  x = cosx,  x0  = 0.5,  *i  = l 

29.  sinx  = cotx,  *o  =1,  *i  = 0.5 

30.  WRITING  PROJECT.  Solution  of  Equations. 
Compare  the  methods  in  this  section  and  problem  set, 
discussing  advantages  and  disadvantages  in  terms  of 
examples  of  your  own.  No  proofs,  just  motivations  and 
ideas. 


(c) 

26-29 


19.3  Interpolation 

We  are  given  the  values  of  a function /(x)  at  different  points  *o,  Xi,  ■ ■ ■ , xn.  We  want  to 
find  approximate  values  of  the  function  f(x)  for  “new”  x’s  that  lie  between  these  points 
for  which  the  function  values  are  given.  This  process  is  called  interpolation.  The  student 
should  pay  close  attention  to  this  section  as  interpolation  forms  the  underlying  foundation 
for  both  Secs.  19.4  and  19.5.  Indeed,  interpolation  allows  us  to  develop  formulas  for 
numeric  integration  and  differentiation  as  shown  in  Sec.  19.5. 

Continuing  our  discussion,  we  write  these  given  values  of  a function  / in  the  form 

/o  = fix o),  A = fix i),  • ' • , fn=  fixn ) 

or  as  ordered  pairs 

(*0>/o)>  ix  1,A)>  ixnJn)- 

Where  do  these  given  function  values  come  from?  They  may  come  from  a “mathematical” 
function,  such  as  a logarithm  or  a Bessel  function.  More  frequently,  they  may  be  measured 
or  automatically  recorded  values  of  an  “empirical”  function,  such  as  air  resistance  of  a 
car  or  an  airplane  at  different  speeds.  Other  examples  of  functions  that  are  “empirical” 
are  the  yield  of  a chemical  process  at  different  temperatures  or  the  size  of  the  U.S. 
population  as  it  appears  from  censuses  taken  at  10-year  intervals. 

A standard  idea  in  interpolation  now  is  to  find  a polynomial  pn  (x)  of  degree  u (or  less) 
that  assumes  the  given  values;  thus 

(1)  Pn(Xo)=fo,  Pn(X\)  = fl,  ■,  Pnixn)  = fn- 

We  call  this  pn  an  interpolation  polynomial  and  x0,  ■ • • , xn  the  nodes.  And  if  /(*)  is  a 
mathematical  function,  we  call  pn  an  approximation  of/(or  a polynomial  approximation, 
because  there  are  other  kinds  of  approximations,  as  we  shall  see  later).  We  use  pn  to  get 
(approximate)  values  of/for  x’s  between  xo  and  xn  (“interpolation”)  or  sometimes  outside 
this  interval  xq  = x Si  xn  (“extrapolation”). 


SEC.  19.3  Interpolation 
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Motivation.  Polynomials  are  convenient  to  work  with  because  we  can  readily  differentiate 
and  integrate  them,  again  obtaining  polynomials.  Moreover,  they  approximate  continuous 
functions  with  any  desired  accuracy.  That  is,  for  any  continuous  f(x ) on  an  interval 
J:  a x fk  b and  error  bound  [3  > 0,  there  is  a polynomial  pn  (x)  (of  sufficiently  high 
degree  n)  such  that 


This  is  the  famous  Weierstrass  approximation  theorem  (for  a proof  see  Ref.  [GenRef7], 
App.  1). 

Existence  and  Uniqueness.  Note  that  the  interpolation  polynomial  pn  satisfying  (1)  for 
given  data  exists  and  we  shall  give  formulas  for  it  below.  Furthermore,  pn  is  unique: 
Indeed,  if  another  polynomial  qn  also  satisfies  qn(x0 ) = /0,  • • • , qn(xn)  = fn,  then 
Pn(x)  ~ qn{x)  = 0 at  x0,  • • • , xn,  but  a polynomial  pn  — qn  of  degree  n (or  less)  with  n + 1 
roots  must  be  identically  zero,  as  we  know  from  algebra;  thus  pn(x)  = qn(x)  for  all  x,  which 
means  uniqueness.  ■ 

How  Do  We  Find  pnl  We  shall  explain  several  standard  methods  that  give  us  pn.  By 
the  uniqueness  proof  above,  we  know  that,  for  given  data,  the  different  methods  must  give 
us  the  same  polynomial.  However,  the  polynomials  may  be  expressed  in  different  forms 
suitable  for  different  purposes. 


Given  (xo,/o),  (x  ] , f\ ),  • • • , ( xn,fn ) with  arbitrarily  spaced  x,,  Lagrange  had  the  idea  of 
multiplying  each  fj  by  a polynomial  that  is  1 at  Xj  and  0 at  the  other  n nodes  and  then 
taking  the  sum  of  these  n + 1 polynomials.  Clearly,  this  gives  the  unique  interpolation 
polynomial  of  degree  n or  less.  Beginning  with  the  simplest  case,  let  us  see  how  this 
works. 

Linear  interpolation  is  interpolation  by  the  straight  line  through  (x0,/0),  (xi,/i);  see 
Fig.  431.  Thus  the  linear  Lagrange  polynomial  pi  is  a sum  p\  = L0  f0  + Lxf\  with  L0 
the  linear  polynomial  that  is  1 at  x0  and  0 at  x i ; similarly,  Li  is  0 at  x0  and  1 at  x\. 
Obviously, 


I/M  - pn(x)\  < (3  for  all  x on  J. 


Lagrange  Interpolation 


Li(x) 


X — x0 

xi  ~ x0 


This  gives  the  linear  Lagrange  polynomial 


(2) 


Pi(x)  = L0(x)f0  + T1(x)/1 


X — Xi 


•fo  + 


X — Xq 
*1  - *0 


•A- 


X0  - *1 


y 


y = fix) 


X, 


'0 


X 


X. 


X 


Fig.  431.  Linear  Interpolation 
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EXAMPLE  2 


Linear  Lagrange  Interpolation 

Compute  a 4D-value  of  In  9.2  from  In  9.0  = 2.1972,  In  9.5  = 2.2513  by  linear  Lagrange  interpolation  and 
determine  the  error,  using  In  9.2  = 2.2192  (4D). 

Solution.  *o  = 9.0,  X\  = 9.5,  fo  = In  9.0,  f\  = In  9.5.  Ln  (2)  we  need 

L0(x)  = X = —2.0(x  - 9.5),  L0(9.2)  = -2.0(-0.3)  = 0.6 

Li(x)  = * ~ s9  ° = 2.0(x  - 9.0),  Li(9.2)  = 2 • 0.2  = 0.4 

(see  Fig.  432)  and  obtain  the  answer 

ln  9.2  « pi(9.2)  = L0(9.2)/o  + Li(9.2)/i  = 0.6  • 2.1972  + 0.4  • 2.2513  = 2.2188. 

The  error  is  e = a — a = 2.2192  — 2.2188  = 0.0004.  Hence  linear  interpolation  is  not  sufficient  here  to  get 
4D  accuracy;  it  would  suffice  for  3D  accuracy. 


0 


L0  Ll 

— ■ O— — <f— 

_ . . . I I I l l 

9 9.2  9.5  10  11  * 

Fig.  432.  L0  and  L,  in  Example  1 


Quadratic  interpolation  is  interpolation  of  given  (xo,  fo),  (x  1 , /i),  (X2,  /2)  by  a second- 
degree  polynomial  p2(x),  which  by  Lagrange’s  idea  is 

(3a)  p2(x)  = L0(x)fo  + Li(x)A  + L2(x)f2 

with  Lo(xo)  = 1,  Li(xi)  = 1,  L2{xf)  = 1,  and  Lo(xi)  = Loixf)  = 0,  etc.  We  claim  that 


(3b) 


L0(x) 

L\(x) 

L2(x) 


1q(x) 
lo(xo ) 
Ii(x) 
ll(Xl) 

h(x) 

hixf) 


(x  - Xj)(x  - x2) 

(x o - xf)(xo  - x2) 
(x  - x0)(x  - x2) 
(xi  - x0)(xi  - x2) 
(x  - x0)(x  - xt) 
(x2  - x0)(x2  - xr)  ' 


How  did  we  get  this?  Well,  the  numerator  makes  L^ixf)  = 0 if  j =f=  k.  And  the  denominator 
makes  LpJxpJ  = 1 because  it  equals  the  numerator  at  x = xk . 


Quadratic  Lagrange  Interpolation 

Compute  In  9.2  by  (3)  from  the  data  in  Example  1 and  the  additional  third  value  ln  11.0  = 2.3979. 

Solution.  In  (3), 


L0(x)  = — ' 1L0)  = x2  - 20.5x  + 104.5,  L0(9.2)  = 0.5400, 

(9.0-9.5X9.0-11.0) 


Li(x)  = 


L2(x)  = 


(x  - 9.0)(x  - 11.0) 


1 


(9.5-9.0X9.5-11.0)  0.75 


(x2  - 20x  + 99),  Li(9.2)  = 0.4800, 


(x  - 9.0)(x  - 9.5)  = i_ 

(11.0-9.0X11.0-9.5)  3 


= - (x2  - 18.5x  + 85.5), 


L2(9.2)  = -0.0200, 
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(see  Fig.  433),  so  that  (3a)  gives,  exact  to  4D. 

In  9.2  » p2(9.2)  = 0.5400  • 2.1972  + 0.4800  • 2.2513  - 0.0200  ■ 2.3979  = 2.2192. 


General  Lagrange  Interpolation  Polynomial.  For  general  n we  obtain 


(4a) 


fix)  « pn(x)  = 2 Lk(x)fk  = 2) 
fc=0  fc= 0 


Ikix) 

lkixk)fk 


where  Lk(xk)  = 1 and  Lk  is  0 at  the  other  nodes,  and  the  Lk  are  independent  of  the  function 
/to  be  interpolated.  We  get  (4a)  if  we  take 

loix)  = {x  - Xi)(x  - X2)  ■ ■ ■ (x  - xn), 

(4b)  lk(x)  = (x  - x0)  ■ ‘ ' (x  ~ xk- i)(x  - xk+ 1)  ■ • • (x  - x„),  0 < k < n, 

ln(x)  = (x  - x0)(x  ~ xi)  • ■ ■ (x  ~ xn- 1). 

We  can  easily  see  that  pn{xk ) = fk-  Indeed,  inspection  of  (4b)  shows  that  Ik(xj)  = 0 if 
j # k,  so  that  for  x = xk,  the  sum  in  (4a)  reduces  to  the  single  term  (lk(xk)/lk(xk))  fk  = fk. 

Error  Estimate.  If /is  itself  a polynomial  of  degree  n (or  less),  it  must  coincide 
with  pn  because  the  n + 1 data  (xo,/o),  ■ • ■ , ( xn,fn ) determine  a polynomial  uniquely, 
so  the  error  is  zero.  Now  the  special /has  its  (n  + l)st  derivative  identically  zero.  This 
makes  it  plausible  that  for  a general  f its  ( n + l)st  derivative /<>l+1)  should  measure  the 
error 


ejx)  = fix)  - pn(x). 

It  can  be  shown  that  this  is  true  if/Cn+1)  exists  and  is  continuous.  Then,  with  a suitable 
t between  x0  and  xn  (or  between  Xq,  xn,  and  x if  we  extrapolate), 


(5) 


en(x)  = fix)  - Pnix) 


ix  - x0)ix  - X!)  • ■ ■ (x  - Xn) 


fn+1\t ) 

in  + 1)!  ' 


Thus  |en(x)|  is  0 at  the  nodes  and  small  near  them,  because  of  continuity.  The  product 
(x  — xo)  ■ • • (x  — xn)  is  large  for  x away  from  the  nodes.  This  makes  extrapolation  risky. 
And  interpolation  at  an  x will  be  best  if  we  choose  nodes  on  both  sides  of  that  x.  Also, 
we  get  error  bounds  by  taking  the  smallest  and  the  largest  value  0f  f(n+1\t)  in  (5)  on  the 
interval  xq  = f = xn  (or  on  the  interval  also  containing  x if  we  extrapolate). 
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Most  importantly,  since  pn  is  unique,  as  we  have  shown,  we  have 


Error  of  Interpolation 

Formula  (5)  gives  the  error  for  any  polynomial  interpolation  method  if  fix. ) has  a 
continuous  (n  + 1 )st  derivative. 


Practical  error  estimate.  If  the  derivative  in  (5)  is  difficult  or  impossible  to  obtain,  apply 
the  Error  Principle  (Sec.  19.1),  that  is,  take  another  node  and  the  Lagrange  polynomial 
pn+ i(x)  and  regard  /?n+1(x)  — pn{x)  as  a (crude)  error  estimate  for  pn(x). 

Error  Estimate  (5)  of  Linear  Interpolation.  Damage  by  Roundoff.  Error  Principle 

Estimate  the  error  in  Example  1 first  by  (5)  directly  and  then  by  the  Error  Principle  (Sec.  19.1). 

Solution.  (A)  Estimation  by  (5).  We  have  n — 1,  f(t)  — In  t,f\t ) = l/t,  f"(t)  = — \/t2.  Hence 

(-1)  0.03 

ei(x)  = (x  - 9.0)(x  - 9.5)  — — , thus  <ri(9.2)  - — — . 

It2  t2 

t = 0.9  gives  the  maximum  0.03/92  = 0.00037  and  t = 9.5  gives  the  minimum  0.03/9.52  = 0.00033,  so  that 
we  get  0.00033  ^ ei(9.2)  ^ 0.00037,  or  better,  0.00038  because  0.3/81  = 0.003703 

But  the  error  0.0004  in  Example  1 disagrees,  and  we  can  learn  something!  Repetition  of  the  computation  there 
with  5D  instead  of  4D  gives 

In  9.2  « pi(9.2)  = 0.6  • 2.19722  + 0.4  • 2.25129  = 2.21885 

with  an  actual  error  e = 2.21920  — 2.21885  = 0.00035,  which  lies  nicely  near  the  middle  between  our  two 
error  bounds. 

This  shows  that  the  discrepancy  (0.0004  vs.  0.00035)  was  caused  by  rounding,  which  is  not  taken  into  account 
in  (5). 

(B)  Estimation  by  the  Error  Principle.  We  calculate  p±{9.2)  = 2.21885  as  before  and  then  p2(9.2)  as  in 
Example  2 but  with  5D,  obtaining 

p2( 9.2)  = 0.54  • 2.19722  + 0.48  • 2.25129  - 0.02  • 2.39790  = 2.21916. 

The  difference  p2{9.2)  — pi(9.2)  = 0.00031  is  the  approximate  error  of  pi(9.2)  that  we  wanted  to  obtain;  this 
is  an  approximation  of  the  actual  error  0.00035  given  above. 

Newton’s  Divided  Difference  Interpolation 

For  given  data  (x0,f0),  ■ ■ ■ , (xn,fn)  the  interpolation  polynomial  pn(x)  satisfying  (1)  is 
unique,  as  we  have  shown.  But  for  different  purposes  we  may  use  pn(x ) in  different  forms. 
Lagrange’s  form  just  discussed  is  useful  for  deriving  formulas  in  numeric  differentiation 
(approximation  formulas  for  derivatives)  and  integration  (Sec.  19.5). 

Practically  more  important  are  Newton’s  forms  of pn(x),  which  we  shall  also  use  for  solving 
ODEs  (in  Sec.  21.2).  They  involve  fewer  arithmetic  operations  than  Lagrange’s  form. 
Moreover,  it  often  happens  that  we  have  to  increase  the  degree  n to  reach  a required  accuracy. 
Then  in  Newton’s  forms  we  can  use  all  the  previous  work  and  just  add  another  term,  a 
possibility  without  counterpart  for  Lagrange’s  form.  This  also  simplifies  the  application  of 
the  Error  Principle  (used  in  Example  3 for  Lagrange).  The  details  of  these  ideas  are  as  follows. 

Let  /?„_  i (x)  be  the  (n  — l)st  Newton  polynomial  (whose  form  we  shall  determine); 
thus /?„_!(*())  = /()■  Pn- 1 (x  i ) = /i,  • • • , Pn-liXn-i)  = fn-i-  Furthermore,  let  us  write  the 
nth  Newton  polynomial  as 


(6) 


Pn(x)  = Pn-l(x)  + gnixf 
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hence 

(6')  gn(x)  = Pn(x)  - Pn-l(x). 

Here  gn(x)  is  to  be  determined  so  that  pn(x0)  = fo,  Pn(x  t)  = fi,  • ■ • , PnUn)  = fn- 

Since  pn  and  pn-i  agree  at  xo,  ■ • • , xn_i,  we  see  that  gn  is  zero  there.  Also,  gn  will 
generally  be  a polynomial  of  nth  degree  because  so  is  pn,  whereas  pn-\  can  be  of  degree 
n — 1 at  most.  Hence  gn  must  be  of  the  form 

(6")  gn(x)  = an(x  - x0)(x  - xi)  • ■ • (x  - Xn_i). 

We  determine  the  constant  an.  For  this  we  set  x = xn  and  solve  (6”)  algebraically  for  an. 
Replacing  gn(xn)  according  to  (6  ) and  using  pn(xn)  = fn,  we  see  that  this  gives 

j ^ fn  Pn—liXn) 

(xn  - X0)(xn  - Xi)  • • • (xn  - xn_i)  ' 

We  write  aj{  instead  of  an  and  show  that  r//,  equals  the  feth  divided  difference,  recursively 
denoted  and  defined  as  follows: 


a\  =/[x0,  x i] 


fi  fo 
Xi  - x0 


«2  = f\.X0,X1,X2] 


/[x  1,  x2]  — /[x0,  Xll 
- X0 


and  in  general 


(8) 


flfc  = f[x o,  ■■■  ,xk] 


fix i,  • • • , xk]  - f[x o,  • ■ • , Xfc-J 
Xk  - x0 


If  n = 1,  then  pn-  \ (xn)  = po(x  ] ) = f0  because  po(x)  is  constant  and  equal  to/o,  the  value 
of  /(x)  at  xq-  Hence  (7)  gives 


«i  = 


A ~ PoC*i) 

Xi  - x0 


A -fo 

Xi  - x0 


= fixo,x1], 


and  (6)  and  (6")  give  the  Newton  interpolation  polynomial  of  the  first  degree 


Pi(x)  =f0  + (x  - x0)/[x0,xi]. 


If  n = 2,  then  this  p1  and  (7)  give 


fl2  — 


A ~ Pl(*2) 

(x2  - x0)(x2  - Xi) 


A ~ fo  - (x2  - Xq)/[x0,x1] 
(x2  - X0)(x2  - Xi) 


= fix0,x1,x2] 


where  the  last  equality  follows  by  straightforward  calculation  and  comparison  with  the 
definition  of  the  right  side.  (Verify  it;  be  patient.)  From  (6)  and  ( 6 ' ) we  thus  obtain  the 
second  Newton  polynomial 
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P2U)  =fo  + (x-  x0)f[x0,xx | + (x  - x0)(x  - x±)f[x0,x1,x2]. 

For  n = k,  formula  (6)  gives 

(9)  Pk(x ) = Pk-i(x)  + (x  ~ x0)(x  - xx)  • • • (x  - Xfc_i )/[x0,  • ■ ■ , xk]. 

With  po(x)  = /0  by  repeated  application  with  k = 1,  ■ • • , n this  finally  gives  Newton’s 
divided  difference  interpolation  formula 


(10)  ^ ~ + ^ ~ xo)/Uo,  X]_]  + (x  - X0)(x  - X i)/[x0,  Xi,  x2] 

+ ■■■  + (x  - x0)(x  - Xi)  ■ • • (x  - xn_i)/[x0,  • • ■ , x J. 

An  algorithm  is  shown  in  Table  19.2.  The  first  do-loop  computes  the  divided  differences 
and  the  second  the  desired  value  pn(x). 

Example  4 shows  how  to  arrange  differences  near  the  values  from  which  they  are 
obtained;  the  latter  always  stand  a half-line  above  and  a half-line  below  in  the  preceding 
column.  Such  an  arrangement  is  called  a (divided)  difference  table. 


Table  19.2  Newton’s  Divided  Difference  Interpolation 


ALGORITHM  INTERPOL  (x0,  • • ■ , xn;  f0,  • • ■ , fn\  x) 

This  algorithm  computes  an  approximation  pn(x)  of  f(x)  at  x. 
INPUT:  Data  (x0,  /„),  (x,,  f±),  ■■■,  (xn,  fn );  x 

OUTPUT : Approximation  pn(x)  of  f(x) 


Set  f[Xj]  = fj  O'  = 0,  ■ ■ • , n). 

For  m — 1 ,•••,«—  1)  do: 

For  j = 0,  • ■ • , n — m do: 

f[Xj,  * ' ' 5 Xj+m] 

End 

End 


f[xj+ 1,  • ' ' 5 Xj+m\  f\.Xjy  ' ' ' ■ Xj  + m_l  ] 

Xj+m  Xj 


Set  p0(x)  = /0. 

For  k = 1,  ■ • ■ , n do: 

Pfe(x)  = Pfc-l(x)  + (X  - X0)  • ' ' (X  - Xfc.O/fXo,  • • ■ , xk\ 

End 

OUTPUT  pn(x) 

End  INTERPOL 
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EXAMPLE  4 


Newton’s  Divided  Difference  Interpolation  Formula 

Compute /(9. 2)  from  the  values  shown  in  the  first  two  columns  of  the  following  table. 


Xj  fj  = f(Xj ) 

f[Xj,  Xj+1]  f[Xj,  xj+1,  Xj+2\  f[Xj,  • • • , xJ  + 3| 

8.0  (2.079442') 

9.0  2.197225 

9.5  2.251292 

11.0  2.397895 

(0.117783) 

(-0.006433) 

0.108134  (0.000411) 

-0.005200 

0.097735 

Solution.  We  compute  the  divided  differences  as  shown.  Sample  computation: 

(0.097735  - 0.108134)/(11  - 9)  = -0.005200. 

The  values  we  need  in  (10)  are  circled.  We  have 

fix)  = p3(x)  = 2.079442  + 0.1  17783(jc  - 8.0)  - 0.006433(*  - 8.0)(jc  - 9.0) 

+ 0.00041  \(x  - 8.0)(.v  - 9.0)(x  - 9.5). 


At  v = 9.2, 


/(9.2)  *=  2.079442  + 0.141340  - 0.001544  - 0.000030  = 2.219208. 

The  value  exact  to  6D  is/(9.2)  = In  9.2  = 2.219203.  Note  that  we  can  nicely  see  how  the  accuracy  increases 
from  term  to  term: 

Pi(9.2)  = 2.220782,  pz(9.2)  = 2.219238,  p3(9.2)  = 2.219208. 

Equal  Spacing:  Newton’s  Forward  Difference  Formula 

Newton’s  formula  (10)  is  valid  for  arbitrarily  spaced  nodes  as  they  may  occur  in  practice  in 
experiments  or  observations.  However,  in  many  applications  the  xfs  are  regularly  spaced — 
for  instance,  in  measurements  taken  at  regular  intervals  of  time.  Then,  denoting  the  distance 
by  h,  we  can  write 

(11)  Xo,  xi  = xo  + h,  X2  = xo  + 2 h,  • • ■ , xn  = xo  + nh. 

We  show  how  (8)  and  (10)  now  simplify  considerably! 

To  get  started,  let  us  define  the,  first  forward  difference  of/  at  xj  by 

A/)  ~ fj+i  ~ fj> 

the  second  forward  difference  of/  at  Xj  by 

A2/  = A fj+1  - A/, 

and,  continuing  in  this  way,  the  &th  forward  difference  of  / at  x j by 


(12) 


A kf  = Afc_1/+i  - Afe“1/ 


(k=  1.  2,  ■ • ■ ). 
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Examples  and  an  explanation  of  the  name  “forward”  follow  on  the  next  page.  What  is  the 
point  of  this?  We  show  that  if  we  have  regular  spacing  (11),  then 

(13)  f[x0,---,xk]  =^-Akf0. 

k\hk 


We  prove  (13)  by  induction.  It  is  true  for  k = 1 because  *1  = Xo  + K so  that 

,r  n fi  ~ fo  1 1 . , 

f[x0,  x-l]  = = t (A  - fo)  = 777  A/0. 

x\  ~ xo  h 1 ' n 

Assuming  (13)  to  be  true  for  all  forward  differences  of  order  k,  we  show  that  (13)  holds  for 
k + 1.  We  use  (8)  with  k + 1 instead  of  k\  then  we  use  ( k + 1 )h  = xk+i  ~ A'o,  resulting 
from  (11),  and  finally  (12)  with  j = 0,  that  is,  Afc+1/o  = Afc/i  — Ak/o-  This  gives 


f[x  o,  ■ ■ ',Xk+ ll  = ' 


fix  i,  • • • , xk+1]  - f[x0,  ■ ■ ■ , xk] 


1 


(k  + 1 )h 

1_ 

(k  + i y.h 


(. k + 1 )h 

-\^kh 


kill 


kill 


k Afe/o 


fc+1 


Afc+1/o 


which  is  (13)  with  k + 1 instead  of  k.  Formula  (13)  is  proved. 


In  (10)  we  finally  set  x = xo  + rh.  Then  x — xq  = rh,  x — X\  = (r  — 1 )h  since 
X\  — Xo  = h,  and  so  on.  With  this  and  (13),  formula  (10)  becomes  Newton’s  (or 
Gregory2-Newton  ’s)  forward  difference  interpolation  formula 


(14) 


fix)  » pn(x)  = X ( „ )A7o 

s=0 


. r(r  — 1)  , n 
= fo  + ''A/o  + — A fo  + 


(x  = Xq  + rh , r = (x  — Xo  )/h) 
r(r  — 1)  • • • (r  — n + 1) 


A7o 


where  the  binomial  coefficients  in  the  first  line  are  defined  by 


(15) 


= l = r(r  - l)(r  - 2)  ■ • • (r  - s + 1) 


(, s > 0,  integer) 


and  s!  = 1 • 2 ■ ■ ■ s. 

Error.  From  (5)  we  get,  with  x — Xq  = rh,  x — x1  = (r  — 1 )h,  etc., 


hn+1 

(16)  en(x)  = f(x)  - pn(x)  = r(r  — 1)  ■ ■ ■ (r  - n)fin+1\t) 

(n  + 1)! 


with  t as  characterized  in  (5). 


2JAMES  GREGORY  (1638-1675),  Scots  mathematician,  professor  at  St.  Andrews  and  Edinburgh.  A in  (14) 
and  V2  (on  p.  818)  have  nothing  to  do  with  the  Laplacian. 
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EXAMPLE  S 


Formula  (16)  is  an  exact  formula  for  the  error,  but  it  involves  the  unknown  t.  In 
Example  5 (below)  we  show  how  to  use  (16)  for  obtaining  an  error  estimate  and  an 
interval  in  which  the  true  value  of  f(x)  must  lie. 


Comments  on  Accuracy.  (A)  The  order  of  magnitude  of  the  error  en(x)  is  about  equal 
to  that  of  the  next  difference  not  used  in  pn(x). 

(B)  One  should  choose  x0,  • • • , xn  such  that  the  x at  which  one  interpolates  is  as  well 
centered  between  x0,  ■ • ■ , xn  as  possible. 

The  reason  for  (A)  is  that  in  (16), 


fn+\t)  » 


A w+7(0 

hn+1 


I r(r  - 1 ) ■■■  (r  - n)\ 
1 • 2---{n  + 1) 


if  \r\  g 1 


(and  actually  for  any  r as  long  as  we  do  not  extrapolate).  The  reason  for  (B)  is  that 
|r(r  — 1)  • • • (r  — n) \ becomes  smallest  for  that  choice. 

Newton’s  Forward  Difference  Formula.  Error  Estimation 

Compute  cosh  0.56  from  (14)  and  the  four  values  in  the  following  table  and  estimate  the  error. 


j 

Xj  fj  = coshjq 

A /,  A % A3/, 

0 

0.5  0-127626) 

(0.057839) 

1 

0.6  1.185465 

(0)011865) 

0.069704  (0.000697)) 

2 

0.7  1.255169 

0.012562 

0.082266 

3 

0.8  1.337435 

Solution.  We  compute  the  forward  differences  as  shown  in  the  table.  The  values  we  need  are  circled.  In  (14) 
we  have  r = (0.56  — 0.50)/0.1  = 0.6,  so  that  (14)  gives 


cosh  0.56  = 1.127626  + 0.6  • 0.057839  + °6(  °'4>  • 0.011865  + a6(  °-4)(  1 4)  . 0.000697 

2 6 

= 1.127626  + 0.034703  - 0.001424  + 0.000039 

= 1.160944. 


Error  estimate.  From  (16),  since  the  fourth  derivative  is  cosh(4)  t = cosh  t, 

0.14 

e3(0.56)  0.6(— 0.4)(— 1.4)(— 2.4)  cosh? 

4! 

= A cosh  t, 

where  A = —0.00000336  and  0.5  ^ t ^ 0.8.  We  do  not  know  t,  but  we  get  an  inequality  by  taking  the  largest 
and  smallest  cosh  t in  that  interval: 

A cosh  0.8  e3(0.62)  ^ A cosh  0.5. 

Since 


fix)  = P3(x)  + e3(*), 
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this  gives 


p3(0.56)  + A cosh  0.8  S cosh  0.56  p3( 0.56)  + A cosh  0.5. 


Numeric  values  are 


1.160939  § cosh  0.56  S 1.160941. 


The  exact  6D-value  is  cosh  0.56  = 1.160941.  It  lies  within  these  bounds.  Such  bounds  are  not  always  so  tight. 
Also,  we  did  not  consider  roundoff  errors,  which  will  depend  on  the  number  of  operations. 


This  example  also  explains  the  name  ‘forward  difference  formula”:  we  see  that  the 
differences  in  the  formula  slope  forward  in  the  difference  table. 


Equal  Spacing:  Newton's  Backward  Difference  Formula 

Instead  of  forward-sloping  differences  we  may  also  employ  backward-sloping  differ- 
ences. The  difference  table  remains  the  same  as  before  (same  numbers,  in  the  same 
positions),  except  for  a very  harmless  change  of  the  running  subscript  j (which  we  explain 
in  Example  6,  below).  Nevertheless,  purely  for  reasons  of  convenience  it  is  standard  to 
introduce  a second  name  and  notation  for  differences  as  follows.  We  define  the  first 
backward  difference  of  / at  Xj  by 


Vfj  1- 

the  second  backward  difference  of  / at  Xj  by 

V2/;  = V/-  - v/;_„ 

and,  continuing  in  this  way,  the  Ath  backward  difference  of  / at  xj  by 

(17)  Vfc/-  = V^1/-  - 1/,-_1  (A  = 1,  2,  • ■ ■ ). 

A formula  similar  to  (14)  but  involving  backward  differences  is  Newton’s  (or 
Gregory-Newton  ’s)  backward  difference  interpolation  formula 


” (r  + s - l\ffl 

fix)  « pn(x)  = 2 y J V fo 

(18)  s=° 

_ r(r  + 1)  _o 

= fo  + rVfo  -* — Y\ — v /°  + 


(x  = Xo  + rh,  r = (x  — x0)/h ) 
b Kr+  l)---(r  + n-  1)  yn/o 


Newton’s  Forward  and  Backward  Interpolations 

Compute  a 7D-value  of  the  Bessel  function  Jo(x)  for  x = 1.72  from  the  four  values  in  the  following  table,  using 
(a)  Newton’s  forward  formula  (14),  (b)  Newton’s  backward  formula  (18). 
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7for 

7back 

xi 

J 0(Xj) 

1st  Diff. 

2nd  Diff. 

3rd  Diff. 

0 

-3 

1.7 

0.3979849 

-0.0579985 

i 

-2 

1.8 

0.3399864 

-0.0581678 

-0.0001693 

0.0004093 

2 

-1 

1.9 

0.2818186 

-0.0579278 

0.0002400 

3 

0 

2.0 

0.2238908 

Solution.  The  computation  of  the  differences  is  the  same  in  both  cases.  Only  their  notation  differs. 

(a)  Forward.  In  (14)  we  have  r = (1.72  — 1.70)/0.1  = 0.2,  and  j goes  from  0 to  3 (see  first  column).  In 
each  column  we  need  the  first  given  number,  and  (14)  thus  gives 

y0 (1  -72)  = 0.3979849  + 0.2(-0.0579985)  + °-2(~°-8)  (-0.0001693)  + a2(~°-8)( ~ 18)  . 0.0004093 

2 6 

= 0.3979849  - 0.0115997  + 0.0000135  + 0.0000196  = 0.3864183, 

which  is  exact  to  6D,  the  exact  7D-value  being  0.3864185. 

(b)  Backward.  For  (18)  we  use  j shown  in  the  second  column,  and  in  each  column  the  last  number.  Since 
r = (1.72  — 2.00)/0.1  = —2.8,  we  thus  get  from  (18) 

/0(1.72)  = 0.2238908  - 2.8 (-0.0579278)  + ~1 2 3 4 5 * *-8^~1-8)  . 0.0002400  + ~2  8(~ L8)(-0.8)  . 0.0004093 

= 0.2238908  + 0.1621978  + 0.0006048  - 0.0002750 

= 0.3864184.  ■ 

There  is  a third  notation  for  differences,  called  the  central  difference  notation.  It 
is  used  in  numerics  for  ODEs  and  certain  interpolation  formulas.  See  Ref.  [E5]  listed  in 
App.  1. 


PRQBL-£M==S^-T—1-9~T 


1.  Linear  interpolation.  Calculate  pi(x)  in  Example  1 
and  from  it  In  9.3. 

2.  Error  estimate.  Estimate  the  error  in  Prob.  1 by  (5). 

3.  Quadratic  interpolation.  Gamma  function.  Calculate 
the  Lagrange  polynomial  p2(x)  for  the  values 
r(1.00)  = 1.0000,  T(  1.02)  = 0.9888,  r(1.04)  = 0.9784 
of  the  gamma  function  [(24)  in  App.  A3.1]  and  from  it 
approximations  of  T(  1 .01)  and  r(1.03). 

4.  Error  estimate  for  quadratic  interpolation.  Estimate 
the  error  for  pz(9.2)  in  Example  2 from  (5). 

5.  Linear  and  quadratic  interpolation.  Find  e-0'25  and 
e-075  by  linear  interpolation  of  e~x  with  Xo  = 0, 

Xi  = 0.5  andx0  = 0.5,  x1  = 1,  respectively.  Then  find 

p 2(x)  by  quadratic  interpolation  of  e~x  with  x0  = 0, 

Xi  = 0.5,  X2  = 1 and  from  it  e-025  and  e-0'75. 
Compare  the  errors.  Use  4S-values  of  e~x. 


6.  Interpolation  and  extrapolation.  Calculate  p2(x)  in 
Example  2.  Compute  from  it  approximations  of 
In  9.4,  In  10,  In  10.5,  In  11.5,  and  In  12.  Compute  the 
errors  by  using  exact  5S-values  and  comment. 

7.  Interpolation  and  extrapolation.  Find  the  quadratic 
polynomial  that  agrees  with  sinx  at  x = 0,  7t/4,  7t/2 
and  use  it  for  the  interpolation  and  extrapolation  of  sin  x 
at  x = — 7t/8,  7t/8,  37t/8,  57t/8.  Compute  the  errors. 

8.  Extrapolation.  Does  a sketch  of  the  product  of  the 
(x  — xj)  in  (5)  for  the  data  in  Example  2 indicate  that 
extrapolation  is  likely  to  involve  larger  errors  than 
interpolation  does? 

9.  Error  function  (35)  in  App.  A3.1.  Calculate  the 
Lagrange  polynomial  p2{x)  for  the  5S-values/(0.25)  = 
0.27633,/(0.5)  = 0.52050, /( 1.0)  = 0.84270  and  from 
p2(x)  an  approximation  of/(0.75)  (=  0.71116). 
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10.  Error  bound.  Derive  an  error  bound  in  Prob.  9 from  (5). 

11.  Cubic  Lagrange  interpolation.  Bessel  function  J0. 

Calculate  and  graph  L0,  L i,  L2,  La  with  x0  = 0, 
Xi  = l,x2  = 2,*3  = 3 on  common  axes.  Find  p3(x) 
for  the  data  (0,1),  (1,0.765198),  (2,0.223891), 
(3,  —0.260052)  [values  of  the  Bessel  function  Jq{x)\. 
Find  p3  foix  = 0.5,  1.5,  2.5  and  compare  with  the  6S- 
exact  values  0.938470,  0.511828,  -0.048384. 

12.  Newton’s  forward  formula  (14).  Sine  integral.  Using 
(14),  find  /(1.25)  by  linear,  quadratic,  and  cubic 
interpolation  of  the  data  (values  of  (40)  in  App.  A31);  6S- 
value  Si(  1 .25)  = 1.14645)/(1.0)  = 0.94608, /(l. 5)  = 
1.32468, /(2.0)  = 1.60541, /(2.5)  = 1.77852,  and  com- 
pute the  en'ors.  For  the  linear  interpolation  use /( 1.0) 
and/(1.5),  for  the  quadratic/!  1.0),  /(1.5),  /(2.0),  etc. 

13  Lower  degree.  Find  the  degree  of  the  interpolation 
polynomial  for  the  data  (—4,  50),  (—2,  18),  (0,  2),  (2,  2), 
(4,  18),  using  a difference  table.  Find  the  polynomial. 

14.  Newton’s  forward  formula  (14).  Gamma  function. 
Set  up  (14)  for  the  data  in  Prob.  3 and  compute  1/1.01), 
r(1.03),  T(1.05). 

15.  Divided  differences.  Obtain  p2  in  Example  2 from  ( 10). 

16.  Divided  differences.  Error  function.  Compute  p2( 0.75) 
from  the  data  in  Prob.  9 and  Newton’s  divided  difference 
formula  (10). 

17.  Backward  difference  formula  (18).  Use  p2(x)  in  (18) 
and  the  values  of  erf  x,  x = 0.2,  0.4.  0.6  in  Table  A4  of 
App.  5,  compute  erf  0.3  and  the  error.  (4S-exact  erf  0.3  = 
0.3286). 


18.  In  Example  5 of  the  text,  write  down  the  difference  table 
as  needed  for  (18),  then  write  (18)  with  general  x and 
then  with  x = 0.56  to  verify  the  answer  in  Example  5. 

19.  CAS  EXPERIMENT.  Adding  Terms  in  Newton 
Formulas.  Write  a program  for  the  forward  formula 
(14).  Experiment  on  the  increase  of  accuracy  by 
successively  adding  terms.  As  data  use  values  of  some 
function  of  your  choice  for  which  your  CAS  gives  the 
values  needed  in  determining  errors. 

20.  TEAM  PROJECT.  Interpolation  and  Extrapolation. 

(a)  Lagrange  practical  error  estimate  (after  Theo- 
rem 1).  Apply  this  to  p1(9.2)  andp2(9.2)  for  the  data 
x0  = 9.0,  Xi  = 9.5,  x2  = 11.0,/o  = lnx0,/i  = In*!, 
/2  = In  x2  (6S-values). 

(b)  Extrapolation.  Given  (xj,  f(xj))  = (0.2,  0.9980), 
(0.4,  0.9686),  (0.6,  0.8443),  (0.8,  0.5358),  (1.0,  0).  Find 
/( 0.7)  from  the  quadratic  interpolation  polynomials 
based  on  (a)  0.6,  0.8,  1.0,  (/ 3 ) 0.4,  0.6,  0.8,  (y)  0.2,  0.4, 
0.6.  Compare  the  errors  and  comment.  [Exact /(x)  = 
cos  (|  7 tx\  f(0.1)  = 0.7181  (4S).] 

(c)  Graph  the  product  of  factors  (x  — xf)  in  the  error 
formula  (5)  for  n — 2,  ■ ■ • , 10  separately.  What  do 
these  graphs  show  regarding  accuracy  of  interpolation 
and  extrapolation? 

21.  WRITING  PROJECT.  Comparison  of  interpolation 
methods.  List  4-5  ideas  that  you  feel  are  most  important 
in  this  section.  Arrange  them  in  best  logical  order. 
Discuss  them  in  a 2-3  page  report. 


19/  Spline  Interpolation 

Given  data  (function  values,  points  in  the  xy-plane)  (x0,  /0),  (x  ] , f ),  • • • , (xn,  fn)  can  be 
interpolated  by  a polynomial  Pn(x ) of  degree  n or  less  so  that  the  curve  of  Pn(x)  passes 
through  these  n + 1 points  (xj,  fj)\  here/0  = f(x o),  ■ ■ • , fn  = /(xn),  See  Sec.  19.3. 

Now  if  n is  large,  there  may  be  trouble:  Pn(x)  may  tend  to  oscillate  for  x between  the  nodes 
x0,  • • ■ , xn.  Hence  we  must  be  prepared  for  numeric  instability  (Sec.  19.1).  Figure  434  shows 
a famous  example  by  C.  Runge3  for  which  the  maximum  error  even  approaches  °°  as  n —*  °° 
(with  the  nodes  kept  equidistant  and  their  number  increased).  Figure  435  illustrates  the  increase 
of  the  oscillation  with  n for  some  other  function  that  is  piecewise  linear. 

Those  undesirable  oscillations  are  avoided  by  the  method  of  splines  initiated  by  I.  J. 
Schoenberg  in  1946  ( Quarterly  of  Applied  Mathematics  4,  pp.  45-99,  112-141).  This 
method  is  widely  used  in  practice.  It  also  laid  the  foundation  for  much  of  modern  CAD 
(computer-aided  design).  Its  name  is  borrowed  from  a draftman’s  spline,  which  is  an 
elastic  rod  bent  to  pass  through  given  points  and  held  in  place  by  weights.  The  mathematical 
idea  of  the  method  is  as  follows: 


3CARL  RUNGE  (1856-1927),  German  mathematician,  also  known  for  his  work  on  ODEs  (Sec.  21.1). 
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X 


Fig.  434.  Runge’s  example  f(x)  = 1/(1  + x2)  and  interpolating  polynomial  p ioM 


Fig.  435.  Piecewise  linear  function  f[x)  and  interpolation  polynomials  of  increasing  degrees 

Instead  of  using  a single  high-degree  polynomial  Pn  over  the  entire  interval  a x Is  b 
in  which  the  nodes  lie,  that  is, 

(1)  a = Xo  < X\  < ■ • ■ < xn  = b, 
we  use  n low-degree,  e.g.,  cubic,  polynomials 

qo(x),  qi(x),  qn- lW, 

one  over  each  subinterval  between  adjacent  nodes,  hence  qo  from  xq  to  X\,  then  q\  from 
Xi  to  X2,  and  so  on.  From  this  we  compose  an  interpolation  function  g(x),  called  a spline, 
by  fitting  these  polynomials  together  into  a single  continuous  curve  passing  through  the 
data  points,  that  is, 

(2)  g(x0)  = f(x  o)  = /o,  g{x  t)  = fix  i)  = A,  • • • , g(xn)  = f(xn)  = fn. 

Note  that  g(x)  = qo(x)  when  x0  = x S x±,  then  g(x)  = q \ (x)  when  x Axi  x2,  and  so 
on,  according  to  our  construction  of  g. 

Thus  spline  interpolation  is  piecewise  polynomial  interpolation. 

The  simplest  q;j\  would  be  linear  polynomials.  However,  the  curve  of  a piecewise  linear 
continuous  function  has  corners  and  would  be  of  little  interest  in  general — think  of 
designing  the  body  of  a car  or  a ship. 

We  shall  consider  cubic  splines  because  these  are  the  most  important  ones  in  applications. 
By  definition,  a cubic  spline  g(x)  interpolating  given  data  (xo,/o),  ■ ■ ■ , (xn,  /„)  is  a continuous 
function  on  the  interval  a=Xo^x^xn  = b that  has  continuous  first  and  second 
derivatives  and  satisfies  the  interpolation  condition  (2);  furthermore,  between  adjacent  nodes, 
g(x)  is  given  by  a polynomial  qj(x)  of  degree  3 or  less. 

We  claim  that  there  is  such  a cubic  spline.  And  if  in  addition  to  (2)  we  also  require  that 


(3) 


g\x  o)  = k0. 


8 C %n)  kn 
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THEOREM  1 


PROOF 


(given  tangent  directions  of  g(x)  at  the  two  endpoints  of  the  interval  a x ^ b),  then  we 
have  a uniquely  determined  cubic  spline.  This  is  the  content  of  the  following  existence 
and  uniqueness  theorem,  whose  proof  will  also  suggest  the  actual  determination  of  splines. 
(Condition  (3)  will  be  discussed  after  the  proof.) 


Existence  and  Uniqueness  of  Cubic  Splines 

Let  ( xo,  fo ),  (xi,/i),  • ■ ■ , ( xn,fn ) with  given  (arbitrarily  spaced)  Xj  [see  (1)]  and 
given  fj  = f(xj),j  = 0,  1,  ■ • ■ , n.  Let  k o and  kn  be  any  given  numbers.  Then  there 
is  one  and  only  one  cubic  spline  g(x)  corresponding  to  (1)  and  satisfying  (2) 
and  (3). 


By  definition,  on  every  subinterval  Ij  given  by  X,  ^ the  spline  g(x)  must  agree 

with  a polynomial  q/x)  of  degree  not  exceeding  3 such  that 

(4)  qf  xj)  = f(xf),  qj(Xj+ 1)  = f(.Xj+ 1)  O'  = 0,  1,  1). 

For  the  derivatives  we  write 

(5)  q'jixj)  = kj , q'j(Xj+i)  = kj+1  0 = 0,  1,  • ■ • , n - 1) 

with  k0  and  kn  given  and  k\,  ■ ■ ■ , kn_i  to  be  determined  later.  Equations  (4)  and  (5)  are 
four  conditions  for  each  qfx).  By  direct  calculation,  using  the  notation 


(6*) 


J_  _ 1 

hj  Xj  + 1 x j 


0 = 0,  n-  1) 


we  can  verify  that  the  unique  cubic  polynomial  qf  x)  (j  = 0,  1,  • • ■ , n — 1)  satisfying  (4) 
and  (5)  is 

qfx)  =f(xj)cf(x  - xj+1f[  1 + 2cj(x  - xf] 

(6)  1 - 2q(,:- ,j+l)] 

+ kjcf(x  — xf)(x  — Xj+ 1)2 
+ kj+1cf(x  — Xj)2(x  — Xj+1). 


Differentiating  twice,  we  obtain 


(7)  q'j(xj)  = —6  cff(xj)  + 6cff(xj+i ) — 4 Cjkj  — 2cjkj+\ 

(8)  q'jixj+i)  = 6 cjf(xj)  - 6cff(xj+1 ) + 2 cjkj  + 4cjkj+1. 


By  definition,  g(x)  has  continuous  second  derivatives.  This  gives  the  conditions 


q"~i(xj)  = q"  (xf) 


0 = l 1). 
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If  we  use  (8)  with  j replaced  by  j — 1,  and  (7),  these  n — 1 equations  become 


where  Vfj  = f(xj ) — and  Yfj+i  = f{xj+ 1)  — f(xj)  and  j = 1,  ■ • • , n — 1,  as  before. 

This  linear  system  of  n — 1 equations  has  a unique  solution  k\,  ■ ■ ■ , kn_  | since  the  coefficient 
matrix  is  strictly  diagonally  dominant  (that  is,  in  each  row  the  (positive)  diagonal  entry  is 
greater  than  the  sum  of  the  other  (positive)  entries).  Hence  the  determinant  of  the  matrix 
cannot  be  zero  (as  follows  from  Theorem  3 in  Sec.  20.7),  so  that  we  may  determine  unique 
values  ki,  ■ ■ ■ , kn_1  of  the  first  derivative  of  g(x)  at  the  nodes.  This  proves  the  theorem. 

Storage  and  Time  Demands  in  solving  (9)  are  modest,  since  the  matrix  of  (9)  is  sparse 
(has  few  nonzero  entries)  and  tridiagonal  (may  have  nonzero  entries  only  on  the  diagonal 
and  on  the  two  adjacent  “parallels”  above  and  below  it).  Pivoting  (Sec.  7.3)  is  not  necessary 
because  of  that  dominance.  This  makes  splines  efficient  in  solving  large  problems  with 
thousands  of  nodes  or  more.  For  some  literature  and  some  critical  comments,  see  American 
Mathematical  Monthly  105  (1998),  929-941. 

Condition  (3)  includes  the  clamped  conditions 


(geometrically:  zero  curvature  at  the  ends,  as  for  the  draftman’s  spline),  giving  a natural 
spline.  These  names  are  motivated  by  Fig.  293  in  Problem  Set  12.3. 

Determination  of  Splines.  Let  k0  and  kn  be  given.  Obtain  k i,  • ■ • , kn_1  by  solving  the 
linear  system  (9).  Recall  that  the  spline  g(x)  to  be  found  consists  of  n cubic  polynomials 
q0,  ■ ■ ■ , qn_ 1 . We  write  these  polynomials  in  the  form 

(12)  qj(x)  = fljo  + flji(x  — Xj ) + aj2(x  — Xj)2  + aj  %(x  — Xj)3 

where  j = 0,  ■ ■ • , n — 1.  Using  Taylor’s  formula,  we  obtain 


(9) 


(10) 


g'(x0)  = f'(x0),  g'(xn)=f'(x  n), 


in  which  the  tangent  directions  f'(x o)  and  f\xn)  at  the  ends  are  given.  Other  conditions 
of  practical  interest  are  the  free  or  natural  conditions 


(11) 


g\x0)  = 0,  g"(xn)  = 0 


by  (2), 
by  (5), 


(13) 


by  (7), 
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with  «j3  obtained  by  calculating  q"(xj+\)  from  (12)  and  equating  the  result  to  (8), 
that  is, 


6 2 

q'j(xj+i)  = 2ajz  + 6 aj3hj  = 72  (fj  ~fj+ 1)  + (kj  + 2kj+{), 

hj  hj 

and  now  subtracting  from  this  2«.;-2  as  given  in  (13)  and  simplifying. 

Note  that  for  equidistant  nodes  of  distance  hj  = h we  can  write  Cj  = c = 1/h  in  (6*) 
and  have  from  (9)  simply 


(14) 


kj— i 4 kj  4-  kj+ 1 ^ (fj+ 1 fj— i) 


O'  = 1, •••,«-!). 


Spline  Interpolation.  Equidistant  Nodes 

Interpolate /(x)  = x4  on  the  interval  — 1 g x g 1 by  the  cubic  spline  g(x)  corresponding  to  the  nodes  x0  = — 1, 
x1  = 0,  x2  = 1 and  satisfying  the  clamped  conditions  g'(—  1)  = / (— 1),  g'(l)  = / (1). 

Solution.  In  our  standard  notation  the  given  data  are  f0  =/(—  1)  = 1 ,/)  =/( 0)  = 0,/2  =/(  1)  = 1. 
We  have  h = 1 and  n = 2,  so  that  our  spline  consists  of  n = 2 polynomials 

q0(x)  = «oo  + a0i(x  + 1)  + fl02(^  + l)2  + a03(x  + l)3  (-1  S a:  g 0), 

q^x)  = «io  + aux  + a12.v2  + a13x3  (0  g x g 1). 

We  determine  the  kj  from  (14)  (equidistance!)  and  then  the  coefficients  of  the  spline  from  (13).  Since  n = 2, 
the  system  (14)  is  a single  equation  (with  j = 1 and  h = 1) 


k o + 4*i  + k2  = 3 (/2  -/0). 


Here  fo=  fz~  1 idle  value  of  x4  at  the  ends)  and  *0  = — 4,  k2  = 4,  the  values  of  the  derivative  4x3  at  the 
ends  — 1 and  1 . Hence 


-4  + 4*!  + 4 = 3(1  - 1)  = 0,  = 0. 

From  (13)  we  can  now  obtain  the  coefficients  of  q0,  namely,  a0o  = /o  = 1,  Qoi  = * o = — 4,  and 


«02  = 4(/i  -/o)  - 7»1  + 2*o)  = 3(0  - 1)  - (0  - 8)  = 5 

12  1 

a03  = 4(/o  — A)  + ~(*i  + *o)  = 2(1  - 0)  + (0  - 4)  = -2. 

13  l2 

Similarly,  for  the  coefficients  of  qi  we  obtain  from  (13)  the  values  ciio  — /i  — 0,  an  = k i = 0,  and 


012  = 3(/2  -A)  - (*2  + 2*!)  = 3(1  - 0)  - (4  + 0)  = -1 

013  = 2(/i  - fz)  + {kz  + kd  = 2(0  - 1)  + (4  + 0)  = 2. 

This  gives  the  polynomials  of  which  the  spline  g(x)  consists,  namely, 

( q0(x)  = 1 - 4(x  + 1)  + 5 (x  + l)2  - 2(x  + l)3  = -x2  - 2x3  if  - 1 g x g 0 

8(X)  = | 2 3 

l qx(x)  = -x2  + 2x3  if  0 g x g 1. 


Figure  436  shows  f(x ) and  this  spline.  Do  you  see  that  we  could  have  saved  over  half  of  our  work  by  using 
symmetry? 


SEC.  19.4  Spline  Interpolation 


825 


EXAMPLE  2 


Fig.  436.  Function /(x)  = x4  and  cubic  spline  g(x)  in  Example  1 


Natural  Spline.  Arbitrarily  Spaced  Nodes 

Find  a spline  approximation  and  a polynomial  approximation  for  the  curve  of  the  cross  section  of  the  circular- 
shaped Shrine  of  the  Book  in  Jerusalem  shown  in  Fig.  437. 


1 s 

• 

3 r 

• 

• _ 

2 

• 

• 

1 

i 

_£  -5  -4  -3  -2  -1 

1 1 . 

Fig.  437.  Shrine  of  the  Book  in  Jerusalem  (Architects  F.  Kissler  and  A.  M.  Bartus) 
Solution.  Thirteen  points,  about  equally  distributed  along  the  contour  (not  along  the  jc-axis!),  give  these  data: 


xi 

-5.8 

-5.0 

-4.0 

-2.5 

-1.5 

-0.8 

0 

0.8 

1.5 

2.5 

4.0 

5.0 

5.8 

fj 

0 

1.5 

1.8 

2.2 

2.7 

3.5 

3.9 

3.5 

2.7 

2.2 

1.8 

1.5 

0 

The  figure  shows  the  corresponding  interpolation  polynomial  of  12th  degree,  which  is  useless  because  of  its 
oscillation.  (Because  of  roundoff  your  software  will  also  give  you  small  error  terms  involving  odd  powers  of  x .) 
The  polynomial  is 

p12(x)  = 3.9000  - 0.65083X2  + 0.033858x4  + 0.011041x6  - 0.0014010x8 
+ 0.000055595.x10  - 0.00000071867.x12. 

The  spline  follows  practically  the  contour  of  the  roof,  with  a small  error  near  the  nodes  —0.8  and  0.8.  The  spline 
is  symmetric.  Its  six  polynomials  corresponding  to  positive  x have  the  following  coefficients  of  their 
representations  (12).  (Note  well  that  (12)  is  in  terms  of  powers  of  x — Xj,  not  x!) 


j 

x- interval 

«J0 

aH 

°j2 

«J3 

0 

OO 

© 

d 

d 

3.9 

0.00 

-0.61 

-0.015 

1 

0.8. ..1.5 

3.5 

-1.01 

-0.65 

0.66 

2 

1.5. ..2.5 

2.7 

-0.95 

0.73 

-0.27 

3 

2.5. ..4.0 

2.2 

-0.32 

-0.091 

0.084 

4 

4.0. ..5.0 

1.8 

-0.027 

0.29 

-0.56 

5 

5.0...5.8 

1.5 

-1.13 

-1.39 

0.58 
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1.  WRITING  PROJECT.  Splines.  In  your  own  words, 
and  using  as  few  formulas  as  possible,  write  a short 
report  on  spline  interpolation,  its  motivation,  a 
comparison  with  polynomial  interpolation,  and  its 
applications. 


VERIFICATIONS.  DERIVATIONS. 
COMPARISONS 

2.  Individual  polynomial  qj.  Show  that  qj{ x)  in  (6) 
satisfies  the  interpolation  condition  (4)  as  well  as  the 
derivative  condition  (5). 

3.  Verify  the  differentiations  that  give  (7)  and  (8)  from  (6). 

4.  System  for  derivatives.  Derive  the  basic  linear  system 
(9)  for  k\,  ■ ■ ■ , kn_  i as  indicated  in  the  text. 

5.  Equidistant  nodes.  Derive  (14)  from  (9). 

6.  Coefficients.  Give  the  details  of  the  derivation  of  aj2 
and  cijs  in  (13). 

7.  Verify  the  computations  in  Example  1. 

8.  Comparison.  Compare  the  spline  g in  Example  1 with 
the  quadratic  interpolation  polynomial  over  the  whole 
interval.  Find  the  maximum  deviations  of  g and  p2 
from/.  Comment. 

9.  Natural  spline  condition.  Using  the  given  coefficients, 
verify  that  the  spline  in  Example  2 satisfies  g"(x)  = 0 
at  the  ends. 


DETERMINATION  OF  SPLINES 

Find  the  cubic  spline  g(x)  for  the  given  data  with  k0  and 
kn  as  given. 

10.  /(— 2)  = /(- 1)  = /( 1)  = /( 2)  = 0,  /(0)  = 1, 
k0  = A'4  = 0 

11.  If  we  started  from  the  piecewise  linear  function  in 
Fig.  438,  we  would  obtain  g(x)  in  Prob.  10  as  the  spline 
satisfying  g'(-2)  = /'(-2)  = 0,  g'(2)  =/'(2)  = 0. 

Find  and  sketch  or  graph  the  corresponding  interpolation 
polynomial  of  4th  degree  and  compare  it  with  the  spline. 
Comment. 


Fig.  438.  Spline  and  interpolation  polynomial  in 
Probs.  10  and  11 


12.  f0  =/(0)  = 1,  A =/( 2)  = 9,  h =/(4)  = 41, 
/3=/(  6)  = 41,  k0  = 0,  k3  = — 12 

13.  /0  =/(0)  = 1,  A =/(l)  = 0,  h =/(2)  = -1, 
/3=/(  3)  = 0,  ko  = 0,  k3  = —6 

14. /o=/(0)  = 2,  /i  =/(l)  = 3,  /2=/( 2)  = 8, 
/a=/(3)=12,  k0  = k3  = 0 

15. /o=/(0)  = 4,  /i=/(2)  = 0,  /2=/( 4)  = 4, 
/3=/(6)  = 80,  k0  = k3  = 0 

16.  fo  = m = 2,  a = /( 2)  = -2,  h = m = 2, 

/3  = /(6)  = 78,  k0  = k3  = 0.  Can  you  obtain  the 
answer  from  that  of  Prob.  15? 

17.  If  a cubic  spline  is  three  times  continuously  differen- 
tiable (that  is,  it  has  continuous  first,  second,  and  third 
derivatives),  show  that  it  must  be  a single  polynomial. 

18.  CAS  EXPERIMENT.  Spline  versus  Polynomial.  If 
your  CAS  gives  natural  splines,  find  the  natural  splines 
when  x is  integer  from  —m  to  m,  and  y (0)  = 1 and  all 
other  y equal  to  0.  Graph  each  such  spline  along  with 
the  interpolation  polynomial  p2m-  Do  this  for  m = 2 to 
10  (or  more).  What  happens  with  increasing  m? 

19.  Natural  conditions.  Explain  the  remark  after  (11). 

20.  TEAM  PROJECT.  Hermite  Interpolation  and  Bezier 
Curves.  In  Hermite  interpolation  we  are  looking  for 
a polynomial  p( x)  (of  degree  2 n + 1 or  less)  such  that 
p(x)  and  its  derivative  p'(x)  have  given  values  at  n + 1 
nodes.  (More  generally,  p(x),  p’(x),  p"(x),  ■ ■ ■ may  be 
required  to  have  given  values  at  the  nodes.) 

(a)  Curves  with  given  endpoints  and  tangents.  Let 
C be  a curve  in  the  xy-plane  parametrically  represented 
by  r(r)  = \x{t),  y(f)],  0 S t S 1 (see  Sec.  9.5).  Show 
that  for  given  initial  and  terminal  points  of  a curve  and 
given  initial  and  terminal  tangents,  say, 

A:  r0  = [jt(0),y(0)] 

= [ x0 , yoL 

B:  ri  = [x(l),y(l)] 

= U'l.Vil 
v0  = U'(0),y'(0)| 

= Uo,  vol. 

Vl  = [*'(l),y'(D] 

= 

we  can  find  a curve  C,  namely, 
r(0  = r0  + v0t 

(15)  + (3(rj  - r0)  - (2v0  + v^f2 

+ (2(r0  - rD  + v0  + Vi)t3; 
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in  components, 

x{t)  = x0  + Xq  t + (3(xi  — .t0)  — (2-Tq  + xf))t2 
+ (2(t0  — ti)  + to  + x[  )t3 

y(l)  = yo  + yo t + (3(yi  - >’o)  - (2yo  + y{))t2 

+ (2(v0  - vi)  + yo  + y[)t 3. 

Note  that  this  is  a cubic  Hermite  interpolation  poly- 
nomial, and  n = 1 because  we  have  two  nodes  (the 
endpoints  of  C).  (This  has  nothing  to  do  with  the 
Hermite  polynomials  in  Sec.  5.8.)  The  two  points 

Ga '■  go  = r0  + v0 

= [to  + to,  >'o  + Vo] 


and 


Gb-  gi  = ri  — Vi 

= [ti  - x{ , Vi  - y{] 

are  called  guidepoints  because  the  segments  AG  a and 
BGb  specify  the  tangents  graphically.  A , B,  Ga > Gb 
determine  C,  and  C can  be  changed  quickly  by  moving 
the  points.  A curve  consisting  of  such  Hermite 
interpolation  polynomials  is  called  a Bezier  curve, 
after  the  French  engineer  P.  Bezier  of  the  Renault 


Automobile  Company,  who  introduced  them  in  the 
early  1960s  in  designing  car  bodies.  Bezier  curves  (and 
surfaces)  are  used  in  computer-aided  design  (CAD)  and 
computer-aided  manufacturing  (CAM).  (For  more 
details,  see  Ref.  [E21]  in  App.  1.) 

(b)  Find  and  graph  the  Bezier  curve  and  its 
guidepoints  if  A:  [0,  0],  B:  [1,  0],  v0  =[§,§], 
vi  = [— 2,-|V3]. 

(c)  Changing  guidepoints  changes  C.  Moving  guide- 
points  farther  away  results  in  C “staying  near  the 
tangents  for  a longer  time.”  Confirm  this  by  changing 
Vo  and  vi  in  (b)  to  2vo  and  2vi  (see  Fig.  439). 

(d)  Make  experiments  of  your  own.  What  happens  if 
you  change  vi  in  (b)  to  — Vi.  If  you  rotate  the  tangents? 
If  you  multiply  vq  and  vi  by  positive  factors  less  than  1? 


Fig.  439.  Team  Project  20(b)  and  (c):  Bezier  curves 


19.5  Numeric  Integration  and  Differentiation 

In  applications,  the  engineer  often  encounters  integrals  that  are  very  difficult  or  even 
impossible  to  solve  analytically.  For  example,  the  error  function,  the  Fresnel  integrals 
(see  Probs.  16-25  on  nonelementary  integrals  in  this  section),  and  others  cannot 
be  evaluated  by  the  usual  methods  of  calculus  (see  App.  3,  (24)-(44)  for  such 
“difficult”  integrals).  We  then  need  methods  from  numerical  analysis  to  evaluate  such 
integrals.  We  also  need  numerics  when  the  integrand  of  the  integral  to  be  evaluated 
consists  of  an  empirical  function,  where  we  are  given  some  recorded  values  of  that 
function.  Methods  that  address  these  kinds  of  problems  are  called  methods  of  numeric 
integration. 

Numeric  integration  means  the  numeric  evaluation  of  integrals 


J = 


fix)  dx 


where  a and  b are  given  and /is  a function  given  analytically  by  a formula  or  empirically 
by  a table  of  values.  Geometrically,  J is  the  area  under  the  curve  of  / between  a and  b 
(Fig.  440),  taken  with  a minus  sign  where /is  negative. 
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We  know  that  if/is  such  that  we  can  find  a differentiable  function  F whose  derivative 
is  / then  we  can  evaluate  J directly,  i.e.,  without  resorting  to  numeric  integration,  by 
applying  the  familiar  formula 


J = 


b 

f(x)  dx 


F(b)  - F(a ) 


[F'(x)  =/(*)]. 


Your  CAS  (Mathematica,  Maple,  etc.)  or  tables  of  integrals  may  be  helpful  for  this  purpose. 


Rectangular  Rule.  Trapezoidal  Rule 

Numeric  integration  methods  are  obtained  by  approximating  the  integrand  / by  functions 
that  can  easily  be  integrated. 

The  simplest  formula,  the  rectangular  rule,  is  obtained  if  we  subdivide  the  interval  of 
integration  a Si  x Si  b into  n subintervals  of  equal  length  h = (b  — a)/n  and  in  each 
subinterval  approximate  / by  the  constant /(jc*),  the  value  of  / at  the  midpoint  x*  of  the  yth 
subinterval  (Fig.  441).  Then/is  approximated  by  a step  function  (piecewise  constant  function), 
the  n rectangles  in  Fig.  441  have  the  areas  f(x*)h,  ■ ■ ■ ,f(x£)h,  and  the  rectangular  rule  is 


(1) 


J = 


fb 

f(x)  dx  « h[f(x't)  +/(4)  + 


+ /(*£)] 


The  trapezoidal  rule  is  generally  more  accurate.  We  obtain  it  if  we  take  the  same 
subdivision  as  before  and  approximate  / by  a broken  line  of  segments  (chords)  with 
endpoints  [a,  /(«)],  [jci,  f(x i)],  ■••,[/?,  f(b)]  on  the  curve  of  / (Fig.  442).  Then  the  area 
under  the  curve  of  / between  a and  b is  approximated  by  n trapezoids  of  areas 

l[m+f(Xl)]h,  +f(x2)]h,  •••,  2 [/(.*«- 1)  +f(b)]h. 


Fig.  440.  Geometric  interpretation 
of  a definite  integral 


Fig.  442.  Trapezoidal  rule 
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EXAMPLE  1 


By  taking  their  sum  we  obtain  the  trapezoidal  rule 


rb 

r i 

1 , 1 

J = 

. 

fix)  dx  ~ h 
a 

-fid)  +/(X!)  +/(x2)  + ■ 

• +fixn~  i)  + 2 m 

where  h = (b  — d)/n,  as  in  (1).  The  x/s  and  a and  b are  called  nodes. 

Trapezoidal  Rule 

f1  -r2 

Evaluate  J = e dx  by  means  of  (2)  with  n = 10. 

Note  that  this  integral  cannot  be  evaluated  by  elementary  calculus,  but  leads  to  the  error  function  (see  Eq.  (35), 
App.  3). 

Solution.  J = 0. 1(0.5  ■ 1.367879  + 6.778167)  = 0.746211  from  Table  19.3. 


Table  19.3  Computations  in  Example  1 


j 

4 

2 

e~xi 

0 

0 

0 

1.000000 

1 

0.1 

0.01 

0.990050 

2 

0.2 

0.04 

0.960789 

3 

0.3 

0.09 

0.913931 

4 

0.4 

0.16 

0.852144 

5 

0.5 

0.25 

0.778801 

6 

0.6 

0.36 

0.697676 

7 

0.7 

0.49 

0.612626 

8 

0.8 

0.64 

0.527292 

9 

0.9 

0.81 

0.444858 

10 

1.0 

1.00 

0.367879 

Sums 

1.367879 

6.778167 

Error  Bounds  and  Estimate  for  the  Trapezoidal  Rule 

An  error  estimate  for  the  trapezoidal  rule  can  be  derived  from  (5)  in  Sec.  19.3  with  n = 1 
by  integration  as  follows.  For  a single  subinterval  we  have 


fix)  ~ Pi(x')  = (x  - x0)(x  - Xi) 


fit) 


with  a suitable  t depending  on  x,  between  xo  and  xi.  Integration  over  x from  a = x o to 
X\  = Xq  + h gives 


x0+h 


fix)  dx  - ^ [/(x0)  + /(Xi)]  = 


r*o+h  f"(t(  'll 

(x  — Xq)(x  — xq  — h)  — — dx. 
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EXAMPLE  2 


Setting  x — xo  = v and  applying  the  mean  value  theorem  of  integral  calculus,  which  we  can 
use  because  ( x — xq)(x  — xq  — h)  does  not  change  sign,  we  find  that  the  right  side  equals 


(3*) 


rk 

u(v  — h)  dv 


fit) 

2 


3 2)2 


fit) 


where  7 is  a (suitable,  unknown)  value  between  xq  and  x±.  This  is  the  error  for  the 
trapezoidal  rule  with  n = 1,  often  called  the  local  error. 

Hence  the  error  e of  (2)  with  any  n is  the  sum  of  such  contributions  from  the  n 
subintervals;  since  h = (b  — a)/n,  nh  = n(b  — a)  /n  , and  {b  — a)  = n h , we  obtain 


(3) 


6 


jb  ~ af 

12m2 


fit) 


b - a 
12 


h2f(t) 


with  (suitable,  unknown)  t between  a and  b. 

Because  of  (3)  the  trapezoidal  rule  (2)  is  also  written 


rb 

\ i 

1 1 

(2*)  J = 

. 

fix)  dx  ~ h 

a 

~fia)  +/(xi)  + • 

• +fi*n- 1)  + 2 fib) 

Error  Bounds  are  now  obtained  by  taking  the  largest  value  for  f”,  say,  M2,  and  the 
smallest  value,  Mf,  in  the  interval  of  integration.  Then  (3)  gives  (note  that  K is  negative) 


(4) 


KM 2 K:  e KM2  where 


K = - 


(. b - af 
12  n2 


b — a 
12 


h2. 


Error  Estimation  by  Halving  h is  advisable  if f"  is  very  complicated  or  unknown,  for 
instance,  in  the  case  of  experimental  data.  Then  we  may  apply  the  Error  Principle  of 
Sec.  19.1.  That  is,  we  calculate  by  (2),  first  with  h,  obtaining,  say,  J = J},  + e;,,  and  then 
with  \ h,  obtaining  J = J^/2  + £h/2-  Now  if  we  replace  hz  in  (3)  with  (|/i)2,  the  error  is 
multiplied  by  \ . Hence  e^/2  ~ \ £ h (not  exactly  because  t may  differ).  Together, 
Jh/2  + efc/2  = Jh  + £h  ~ A + 4 eh/2-  Thus  Jh/2  - Jh  = (4  - l)eh/2.  Division  by  3 
gives  the  error  formula  for  Jh/2 

(5)  eh/2  ~ 3 iJh/2  - Jh)- 


Error  Estimation  for  the  Trapezoidal  Rule  by  (4)  and  (5) 

Estimate  the  error  of  the  approximate  value  in  Example  1 by  (4)  and  (5). 

Solution.  (A)  Error  bounds  by  (4).  By  differentiation, /"(x)  = 2(2x2  — Also  > OifO  < x < 1, 

so  that  the  minimum  and  maximum  occur  at  the  ends  of  the  interval.  We  compute  M2  = / (1)  = 0.735759  and 
Ml  = f "(G)  = — 2.  Furthermore,  K = —1/1200,  and  (4)  gives 

-0.000614  SeS  0.001667. 

Hence  the  exact  value  of  J must  lie  between 

0.746211  - 0.000614  = 0.745597  and  0.746211  + 0.001667  = 0.747878. 

Actually,  J = 0.746824,  exact  to  6D. 
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(B)  Error  estimate  by  (5).  J/t  = 0.7462 1 1 in  Example  1 . Also. 
Jh/2  = 0.05 


r is  i 

-O/20)2  1 


3=1 


+ - (1  + 0.367879) 
2 


= 0.746671. 


Hence  e^2  = a (Jh/2  ~ Jh)  = 0.000153  and  J^/ 2 + eh/2  = 0.746824,  exact  to  6D. 


Simpson’s  Rule  of  Integration 

Piecewise  constant  approximation  of  / led  to  the  rectangular  rule  (1),  piecewise  linear 
approximation  to  the  trapezoidal  rule  (2),  and  piecewise  quadratic  approximation  will  lead 
to  Simpson’s  rule,  which  is  of  great  practical  importance  because  it  is  sufficiently  accurate 
for  most  problems,  but  still  sufficiently  simple. 

To  derive  Simpson’s  rule,  we  divide  the  interval  of  integration  a Si  x g b into  an  even 
number  of  equal  subintervals,  say,  into  n = 2m  subintervals  of  length  h = (b  — a)/ (2m), 
with  endpoints  xo  (=  a),  Xy,  • • ■ , x2m_i,  x2m  (=  b)\  see  Fig.  443.  We  now  take  the  first 
two  subintervals  and  approximate  f(x)  in  the  interval  x0  = xS=x2  = Xo  + 2h  by  the 
Lagrange  polynomial  p2(x)  through  (x0,f0),  (x  y,fy),  (x2,/2),  where  fj  = f(xj).  From  (3) 
in  Sec.  19.3  we  obtain 


(6)  p2(x) 


(X  - Xy)(x  - X2)  + (X  - x0)(x  - X2)  + (x  ~ Xp)(x  ~ Xy) 

(x0  - XiKxo  - x2)  ' (X!  - X0)(Xl  - X2)  j (x2  - x0)(x2  - Xy)  ' ' 


2 2 2 / 

The  denominators  in  (6)  are  2 h , —h  , and  2 h , respectively.  Setting  s = (x  — xy )/h,  we 

have 


x — xi  = sh,  x — xo  = x — (x'i  — h)  = (s  + 1 )/? 
x — x2  = x — (xi  + h)  = (s  — l)h 


and  we  obtain 


P2(x)  = g.v(.S'  - l)/o  - (s  + 1 )(.v  - 1 )/i  + \(s  + 1 )sf2. 

We  now  integrate  with  respect  to  x from  xo  to  x2.  This  corresponds  to  integrating  with 
respect  to  s from  — 1 to  1 . Since  dx  = h ds,  the  result  is 


(7*) 


f(x)  dx  - 


p2{x)  dx  = h ( | fo  + | fy  + | f2  ) ■ 


Fig.  443.  Simpson’s  rule 


832 


CHAP.  19  Numerics  in  General 


A similar  formula  holds  for  the  next  two  subintervals  from  x2  to  X4,  and  so  on.  By  summing 
all  these  m formulas  we  obtain  Simpson’s  rule4 


(7) 


rb 

f{x)  dx  ~ 


h 

3 


(/o  + 4/i  + 2/2  + 4/3  + • • • + 2/2m_2  + 4/2m_i  + /2m), 


where  h = (b  — a)/(2m)  and  / = f(xf).  Table  19.4  shows  an  algorithm  for  Simpson’s 
rule. 


Table  19.4  Simpson’s  Rule  of  Integration 


ALGORITHM  SIMPSON  (a,  b,  m,  f0,  f\,  ■ ■ ■ , f2m) 

This  algorithm  computes  the  integral  J = Jb  fix)  dxftom  given  values  /■  = /( xf  at 

equidistant  x0  = a,  x1  = x0  + h,  ■ ■ ■ 
where  h = (b  — a)/i2m). 

x2 m — x0  + 2mh  = b by  Simpson’s  rule  (7), 

INPUT:  a,  b,  m,  f0,  ■ ■ ■ , f2m 

OUTPUT:  Approximate  value  J of  J 

Compute  s0  =f0+  f2m 

*1  = /l  + /3  + ■ 

4 f 2.m—\ 

^2  = fz  4 4 

4 f>2m— 2 

h = ib  — a)  12m 

~ h 

J = - iso  + 4ji 

+ 2 s2) 

OUTPUT  J.  Stop. 

End  SIMPSON 

Error  of  Simpson’s  Rule  (7).  If  the  fourth  derivative  /(4)  exists  and  is  continuous  on 
a ts  x ts  b,  the  error  of  (7),  call  it  es,  is 


(8) 


( b — a)5 
180  (2m)4 


/C4)(f) 


b — a 
180 


hY4\ry, 


here  t is  a suitable  unknown  value  between  a and  b.  This  is  obtained  similarly  to  (3). 
With  this  we  may  also  write  Simpson’s  rule  (7)  as 


(7**) 


[b 

fix)  dx 


3 ifo  + 4/i  + ■ • • + /am)  - /i4/(4)(f)' 


4THOMAS  SIMPSON  (1710-1761),  self-taught  English  mathematician,  author  of  several  popular  textbooks. 

Simpson’s  rule  was  used  much  earlier  by  Torricelli,  Gregory  (in  1668),  and  Newton  (in  1676). 
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Error  Bounds.  By  taking  for/<4)  in  (8)  the  maximum  M4  and  minimum  M%  on  the  interval 
of  integration  we  obtain  from  (8)  the  error  bounds  (note  that  C is  negative) 

(9)  C«lSesaC«i  where 

Degree  of  Precision  (DP)  of  an  integration  formula.  This  is  the  maximum  degree  of 
arbitrary  polynomials  for  which  the  formula  gives  exact  values  of  integrals  over  any 
intervals. 

Hence  for  the  trapezoidal  rule, 


DP  = 1 


because  we  approximate  the  curve  of/ by  portions  of  straight  lines  (linear  polynomials). 
For  Simpson’s  rule  we  might  expect  DP  = 2 (why?).  Actually, 

DP  = 3 

by  (9)  because /(4)  is  identically  zero  for  a cubic  polynomial.  This  makes  Simpson’s  rule 
sufficiently  accurate  for  most  practical  problems  and  accounts  for  its  popularity. 

Numeric  Stability  with  respect  to  rounding  is  another  important  property  of  Simpson’s 
rule.  Indeed,  for  the  sum  of  the  roundoff  errors  ej  of  the  2m  + 1 values/  in  (7)  we  obtain, 
since  h = (b  — a)/2m, 

^ |e0  + 4ei  + ■ ■ • + e2m|  g \ ° 6 mu  = (b  - a)u 

3 3.2  m 

where  u is  the  rounding  unit  (u  = \ • 10-6  if  we  round  off  to  6D;  see  Sec.  19.1).  Also 
6=l+4+lis  the  sum  of  the  coefficients  for  a pair  of  intervals  in  (7);  take  m = 1 in 
(7)  to  see  this.  The  bound  (b  — a)u  is  independent  of  m,  so  that  it  cannot  increase  with 
increasing  m,  that  is,  with  decreasing  h.  This  proves  stability. 

Newton-Cotes  Formulas.  We  mention  that  the  trapezoidal  and  Simpson  rules  are  special 
closed  Newton-Cotes  formulas,  that  is,  integration  formulas  in  which /(x)  is  interpolated 
at  equally  spaced  nodes  by  a polynomial  of  degree  n(n  = 1 for  trapezoidal,  n = 2 for 
Simpson),  and  closed  means  that  a and  b are  nodes  (a  = xo,  b = xn).  n = 3 and  higher 
n are  used  occasionally.  From  n = 8 on,  some  of  the  coefficients  become  negative,  so 
that  a positive  /■  could  make  a negative  contribution  to  an  integral,  which  is  absurd.  For 
more  on  this  topic  see  Ref.  [E25]  in  App.  1 . 


EXAMPLE  3 


Simpson’s  Rule.  Error  Estimate 

f1  -x2 

Evaluate  J = \ e ^ dx  by  Simpson’s  rule  with  2m  =10  and  estimate  the  error. 


Jo 

Solution.  Since  h = 0.1,  Table  19.5  gives 


0.1 

/ « — (1.367879  + 4 • 3.740266  + 2 • 3.037901)  = 0.746825. 
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Estimate  of  error.  Differentiation  gives /<4>(x)  = 4(4.v4  — 1 2x2  + 3)e~x  . By  considering  the  derivative /t5> 
of  /l4)  we  find  that  the  largest  value  of  /<4>  in  the  interval  of  integration  occurs  at  0 and  the  smallest  value  at 
x*  = (2.5  — 0.5  VTO)1/2.  Computation  gives  the  values  M4  =/c4>(  0)  = 12andM|  =/<4>( x*)  = —7.419.  Since 
2m  = 10  and  b — a = 1,  we  obtain  C = —1/1800000  = —0.00000056.  Therefore,  from  (9), 

-0.000007  sf)s  0.000005. 

Hence  J must  lie  between  0.746825  - 0.000007  = 0.746818  and  0.746825  + 0.000005  = 0.746830,  so  that  at 
least  four  digits  of  our  approximate  value  are  exact.  Actually,  the  value  0.746825  is  exact  to  5D  because 
J = 0.746824  (exact  to  6D). 

Thus  our  result  is  much  better  than  that  in  Example  1 obtained  by  the  trapezoidal  rule,  whereas  the  number 
of  operations  is  nearly  the  same  in  both  cases. 


Table  19.5  Computations  in  Example  3 


j 

4 

2 

0 

0 

0 

1.000000 

1 

0.1 

0.01 

0.990050 

2 

0.2 

0.04 

0.960789 

3 

0.3 

0.09 

0.913931 

4 

0.4 

0.16 

0.852144 

5 

0.5 

0.25 

0.778801 

6 

0.6 

0.36 

0.697676 

7 

0.7 

0.49 

0.612626 

8 

0.8 

0.64 

0.527292 

9 

0.9 

0.81 

0.444858 

10 

1.0 

1.00 

0.367879 

Sums 

1.367879 

3.740266 

3.037901 

Instead  of  picking  an  n = 2m  and  then  estimating  the  error  by  (9),  as  in  Example  3,  it  is 
better  to  require  an  accuracy  (e.g.,  6D)  and  then  determine  n = 2m  from  (9). 


EXAMPLE  4 


Determination  of  n = 2m  in  Simpson’s  Rule  from  the  Required  Accuracy 


What  n should  we  choose  in  Example  3 to  get  6D-accuracy? 

Solution.  Using  M4  = 12  (which  is  bigger  in  absolute  value  than  Mj,  we  get  from  (9),  with  b — a = 1 and 
the  required  accuracy. 


|cm4| 


12 

180(2m)4 


1 

2 


• 10 


-6 


thus 


m 


'2  ■ 106  • 12 
. 180  • 24 


1/4 


9.55. 


Hence  we  should  choose  n = 2m  = 20.  Do  the  computation,  which  parallels  that  in  Example  3. 

Note  that  the  error  bounds  in  (4)  or  (9)  may  sometimes  be  loose,  so  that  in  such  a case  a smaller  n = 2m 
may  already  suffice. 


Error  Estimation  for  Simpson’s  Rule  by  Halving  h.  The  idea  is  the  same  as  in  (5) 
and  gives 


(10) 


*h/2  ~ 


x 

15 


{Jh/2  ~ Jh)- 


J},  is  obtained  by  using  h and  J^/ 2 by  using  ^h,  and  e /l/2  is  the  error  of  Jh/2- 
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EXAMPLE  6 


1 112 
Derivation.  In  (5)  we  had  3 as  the  reciprocal  of  3 = 4 — 1 and  4 = (2)  resulted  from 

h2  in  (3)  by  replacing  h with  \ h.  In  (10)  we  have  /5  as  the  reciprocal  of  15  = 16  — 1 

and  jg  = (g)4  results  from  h4  in  (8)  by  replacing  h with  \ h. 

Error  Estimation  for  Simpson’s  Rule  by  Halving 

Integrate /(jc)  = \ ttx4  cos  | ttx  from  0 to  2 with  h = 1 and  apply  (10). 

Solution.  The  exact  5D-value  of  the  integral  is  J = 1.25953.  Simpson’s  rule  gives 

Jh  = s[/(0)  + 4/(1)  + /( 2)]  = J(0  + 4 • 0.555360  + 0)  = 0.740480, 

Jhl  2 = B 1/(0)  + 4/(|)  + 2/(1)  + 4/(1)  +/( 2)] 

= g[0  + 4 • 0.045351  + 2 ■ 0.555361  + 4 • 1.521579  + 0]  = 1.22974. 

Hence  (10)  gives  e/,/2  = jj(l. 22974  — 0.74048)  = 0.032617  and  thus  J « Jh/z  + e/,/2  = 1.26236,  with  an 
error  —0.00283  which  is  less  in  absolute  value  than  jg  of  the  error  0.02979  of  Jh/2.  Hence  the  use  of  (10)  was 
well  worthwhile. 


Adaptive  Integration 

The  idea  is  to  adapt  step  h to  the  variability  of  fix).  That  is,  where/varies  but  little,  we  can 
proceed  in  large  steps  without  causing  a substantial  error  in  the  integral,  but  where  / varies 
rapidly,  we  have  to  take  small  steps  in  order  to  stay  everywhere  close  enough  to  the  curve 

of  f 

Changing  h is  done  systematically,  usually  by  halving  h,  and  automatically  (not  “by  hand”) 
depending  on  the  size  of  the  (estimated)  error  over  a subinterval.  The  subinterval  is  halved 
if  the  corresponding  error  is  still  too  large,  that  is,  larger  than  a given  tolerance  TOL 
(maximum  admissible  absolute  error),  or  is  not  halved  if  the  error  is  less  than  or  equal  to 
TOL  (or  doubled  if  the  error  is  very  small). 

Adapting  is  one  of  the  techniques  typical  of  modern  software.  In  connection  with 
integration  it  can  be  applied  to  various  methods.  We  explain  it  here  for  Simpson’s  rule.  In 
Table  19.6  an  asterisk  means  that  for  that  subinterval,  TOL  has  been  reached. 


Adaptive  Integration  with  Simpson’s  Rule 

Integrate  f(x ) = \ ttx4  cos  \ ttx  from  x = 0 to  2 by  adaptive  integration  and  with  Simpson’s  rule  and 
TOL[0,  2]  = 0.0002. 

Solution.  Table  19.6  shows  the  calculations.  Figure  444  shows  the  integrand  f(x)  and  the  adapted  intervals 
used.  The  first  two  intervals  ([0,  0.5],  [0.5,  1.0])  have  length  0.5,  hence  h = 0.25  [because  we  use  2m  = 2 
subintervals  in  Simpson’s  rule  (7**)].  The  next  two  intervals  ([1.00,  1.25],  [1.25,  1.50])  have  length  0.25 
(hence  h = 0.125)  and  the  last  four  intervals  have  length  0.125.  Sample  computations.  For  0.740480  see 
Example  5.  Formula  (10)  gives  (0.123716  — 0.1 22794)/ 15  = 0.000061.  Note  that  0.123716  refers  to  [0,  0.5] 
and  [0.5,  1],  so  that  we  must  subtract  the  value  corresponding  to  [0,  1]  in  the  line  before.  Etc. 
TOL[0,  2]  = 0.0002  gives  0.0001  for  subintervals  of  length  1,  0.00005  for  length  0.5,  etc.  The  value  of  the 
integral  obtained  is  the  sum  of  the  values  marked  by  an  asterisk  (for  which  the  error  estimate  has  become 
less  than  TOL).  This  gives 

J « 0.123716  + 0.528895  + 0.388263  + 0.218483  = 1.25936. 

The  exact  5D-value  is  J = 1.25953.  Hence  the  error  is  0.00017.  This  is  about  1/200  of  the  absolute  value  of 
that  in  Example  5.  Our  more  extensive  computation  has  produced  a much  better  result. 
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Table  19.6  Computations  in  Example  6 


Interval 

Integral 

Error  (10) 

TOL 

Comment 

[0,  2] 

0.740480 

0.0002 

[0,  1] 
[1,2] 

0.122794 
1.10695 
Sum  = 1.22974 

0.032617 

0.0002 

Divide  further 

[0.0,  0.5] 
[0.5,  1.0] 

0.004782 
0.118934 
Sum  = 0.123716* 

0.000061 

0.0001 

TOL  reached 

[1.0,  1.5] 
[1.5,  2.0] 

0.528176 
0.605821 
Sum  = 1.13300 

0.001803 

0.0001 

Divide  further 

[1.00,  1.25] 
[1.25,  1.50] 

0.200544 

0.328351 

Sum  = 0.528895* 

0.000048 

0.00005 

TOL  reached 

[1.50,  1.75] 
[1.75,  2.00] 

0.388235 
0.218457 
Sum  = 0.606692 

0.000058 

0.00005 

Divide  further 

[1.500,  1.625] 
[1.625,  1.750] 

0.196244 
0.192019 
Sum  = 0.388263* 

0.000002 

0.000025 

TOL  reached 

[1.750,  1.875] 
[1.875,  2.000] 

0.153405 
0.065078 
Sum  = 0.218483* 

0.000002 

0.000025 

TOL  reached 

Gauss  Integration  Formulas 
Maximum  Degree  of  Precision 

Our  integration  formulas  discussed  so  far  use  function  values  at  predetermined 
(equidistant)  x- values  (nodes)  and  give  exact  results  for  polynomials  not  exceeding  a 
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EXAMPLE  7 


certain  degree  [called  the  degree  of  precision',  see  after  (9)].  But  we  can  get  much  more 
accurate  integration  formulas  as  follows.  We  set 


(ID 


t 

m dt  « 

J-t 


3=  1 


with  fixed  n,  and  t = ±1  obtained  from  x = a,  b by  setting  x = \ [ait  — 1 ) + b(t  + 1)]. 
Then  we  determine  the  n coefficients  A1;  ■ ■ ■ , An  and  n nodes  t\,  - ■ ■ ,tn  so  that  (11)  gives 
exact  results  for  polynomials  of  degree  k as  high  as  possible.  Since  n + n = 2 n is  the 
number  of  coefficients  of  a polynomial  of  degree  2 n — 1,  it  follows  that  k Si  2n  — 1. 

Gauss  has  shown  that  exactness  for  polynomials  of  degree  not  exceeding  2 n — 1 (instead 
of  n — 1 for  predetermined  nodes)  can  be  attained,  and  he  has  given  the  location  of  the 
tj(=  the  jth  zero  of  the  Legendre  polynomial  Pn  in  Sec.  5.3)  and  the  coefficients  ,4.;  which 
depend  on  n but  not  on f(t),  and  are  obtained  by  using  Lagrange’s  interpolation  polynomial, 
as  shown  in  Ref.  [E5]  listed  in  App.  1.  With  these  tj  and  Aj,  formula  (11)  is  called  a Gauss 
integration  formula  or  Gauss  quadrature  formula.  Its  degree  of  precision  is  2 n — 1 , as 
just  explained.  Table  19.7  gives  the  values  needed  for  n = 2,  • • • , 5.  (For  larger  n,  see 
pp.  916-919  of  Ref.  [GenRefl]  in  App.  1.) 


Table  19.7  Gauss  Integration:  Nodes  t,  and  Coefficients  Aj 


n 

Nodes  tj 

Coefficients  Aj 

Degree  of  Precision 

-0.5773502692 

1 

2 

0.5773502692 

1 

3 

-0.7745966692 

0.5555555556 

3 

0 

0.8888888889 

5 

0.7745966692 

0.5555555556 

-0.8611363116 

0.3478548451 

-0.3399810436 

0.6521451549 

4 

0.3399810436 

0.6521451549 

7 

0.8611363116 

0.3478548451 

-0.9061798459 

0.2369268851 

-0.5384693101 

0.4786286705 

5 

0 

0.5688888889 

9 

0.5384693101 

0.4786286705 

0.9061798459 

0.2369268851 

Gauss  Integration  Formula  with  n = 3 

Evaluate  the  integral  in  Example  3 by  the  Gauss  integration  formula  (11)  with  n = 3. 

Solution.  We  have  to  convert  our  integral  from  0 to  1 into  an  integral  from  —1  to  1.  We  set  x = \{t  + 1). 
Then  dx  = \dt,  and  (11)  with  n = 3 and  the  above  values  of  the  nodes  and  the  coefficients  yields 
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exp  (— x2)  dx 


(t  + l)2) 


dt 


1 

2 


5 

9 exP 


0.746815 


(exact  to  6D:  0.746825),  which  is  almost  as  accurate  as  the  Simpson  result  obtained  in  Example  3 with  a much 
larger  number  of  arithmetic  operations.  With  3 function  values  (as  in  this  example)  and  Simpson’s  rule  we  would 
get  g (1  + 4e-0'25  + e_1)  = 0.747180,  with  an  error  over  30  times  that  of  the  Gauss  integration. 


Gauss  Integration  Formula  with  n = 4 and  5 

Integrate /(x)  = \ 7Tx4  cos  ] ttx  from  x = 0 to  2 by  Gauss.  Compare  with  the  adaptive  integration  in  Example  6 
and  comment. 

Solution,  x = t + 1 gives /(?)  = 4 77" (7  + l)4  cos  (\  7T  (l  + 1)),  as  needed  in  (11).  For  n = 4 we  calculate  (6S) 


j = Aj/i  + ■ ■ • + A4/4  = Aj(/i  + /4)  + A2(/2  + /3) 

= 0.347855(0.000290309  + 1.02570)  + 0.652145(0.129464  + 1.25459)  = 1.25950. 

The  error  is  0.00003  because  J = 1.25953  (6S).  Calculating  with  10S  and  n = 4 gives  the  same  result;  so  the 
error  is  due  to  the  formula,  not  rounding.  For  n = 5 and  IDS  we  get  J ' ■ 1.259526185,  too  large  by  the  amount 
0.000000250  because  J = 1.259525935  (10S).  The  accuracy  is  impressive,  particularly  if  we  compare  the  amount 
of  work  with  that  in  Example  6. 

Gauss  integration  is  of  considerable  practical  importance.  Whenever  the  integrand  / is 
given  by  a formula  (not  just  by  a table  of  numbers)  or  when  experimental  measurements 
can  be  set  at  times  u (or  whatever  t represents)  shown  in  Table  19.7  or  in  Ref.  [GenRefl], 
then  the  great  accuracy  of  Gauss  integration  outweighs  the  disadvantage  of  the  complicated 
tj  and  Aj  (which  may  have  to  be  stored).  Also,  Gauss  coefficients  Aj  are  positive  for  all 
n,  in  contrast  with  some  of  the  Newton-Cotes  coefficients  for  larger  n. 

Of  course,  there  are  frequent  applications  with  equally  spaced  nodes,  so  that  Gauss 
integration  does  not  apply  (or  has  no  great  advantage  if  one  first  has  to  get  the  tj  in  (11) 
by  interpolation). 

Since  the  endpoints  — 1 and  1 of  the  interval  of  integration  in  (11)  are  not  zeros  of  Pn, 
they  do  not  occur  among  t0,  ■ ■ • , tn,  and  the  Gauss  formula  (11)  is  called,  therefore,  an 
open  formula,  in  contrast  with  a closed  formula,  in  which  the  endpoints  of  the  interval 
of  integration  are  t0  and  tn.  [For  example,  (2)  and  (7)  are  closed  formulas.] 


Numeric  Differentiation 

Numeric  differentiation  is  the  computation  of  values  of  the  derivative  of  a function/from 
given  values  off.  Numeric  differentiation  should  be  avoided  whenever  possible.  Whereas 
integration  is  a smoothing  process  and  is  not  very  sensitive  to  small  inaccuracies  in  function 
values,  differentiation  tends  to  make  matters  rough  and  generally  gives  values  off  that  are 
much  less  accurate  than  those  of  /.  The  difficulty  with  differentiation  is  tied  in  with  the 
definition  of  the  derivative,  which  is  the  limit  of  the  difference  quotient,  and,  in  that  quotient, 
you  usually  have  the  difference  of  a large  quantity  divided  by  a small  quantity.  This  can 
cause  numerical  instability.  While  being  aware  of  this  caveat,  we  must  still  develop  basic 
differentiation  formulas  for  use  in  numeric  solutions  of  differential  equations. 

We  use  the  notations  f = f (x;),  f = f (Xj),  etc.,  and  may  obtain  rough  approximation 
formulas  for  derivatives  by  remembering  that 


fix)  = lim 
J h—*0 


fix  + h)  - f(x) 
h 


SEC.  19.5 


Numeric  Integration  and  Differentiation 


839 


This  suggests 
(12) 


,/l/2 


£/l/2  fl  ~ fo 


h h 

Similarly,  for  the  second  derivative  we  obtain 


(13) 


„ ^ = fo-lfo+fo 

•'l  . o . o 


etc. 


More  accurate  approximations  are  obtained  by  differentiating  suitable  Lagrange 
polynomials.  Differentiating  (6)  and  remembering  that  the  denominators  in  (6)  are  2 h2. 


— /i2,  2 h2,  we  have 


, , 2x  — Xi  — t2  2x  — to  — X?  2x  — In  ~~  X1 

fix)  « p'2(x)  = \ fo 1 A + \ 1 fo- 
lk2 h2  2/? 2 

Evaluating  this  at  x(),  xj,  x2,  we  obtain  the  “three-point  formulas” 

1 


(14) 


(a)  /o  - ^(-3/0 + 4A -A), 

(b) 


(c)  fi  “ — (/o  - 4/i  + 3/2). 

2/7 


Applying  the  same  idea  to  the  Lagrange  polynomial  /74A),  we  obtain  similar  formulas, 
in  particular, 


(15) 


/2  “^(/o-  8A  + 8/3  - fo). 


Some  examples  and  further  formulas  are  included  in  the  problem  set  as  well  as  in 
Ref.  [E5]  listed  in  App.  1. 


P R OB  L E M-S€T  1 9 .5 


RECTANGULAR  AND  TRAPEZOIDAL  RULES 

1.  Rectangular  rule.  Evaluate  the  integral  in  Example 
1 by  the  rectangular  rule  (1)  with  subintervals  of 
length  0.1.  Compare  with  Example  1.  (6S-exact: 
0.746824) 

2.  Bounds  for  (1).  Derive  a formula  for  lower  and  upper 
bounds  for  the  rectangular  rule.  Apply  it  to  Prob.  1. 


3.  Trapezoidal  rule.  To  get  a feel  for  increase  in  accuracy, 
integrate x2  from 0 to  1 by  (2)  with/;  = 1,0.5,0.25,0.1. 

4.  Error  estimation  by  halfing.  Integrate /(x)  = x4  from 
0 to  1 by  (2)  with  h = 1,  h = 0.5,  h = 0.25  and  esti- 
mate the  error  for  h = 0.5  and  h = 0.25  by  (5). 

5.  Error  estimation.  Do  the  tasks  in  Prob.  4 for 

f(x)  = sin  1 7Tx. 
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6.  Stability.  Prove  that  the  trapezoidal  rule  is  stable  with 
respect  to  rounding. 


7-15 


SIMPSON’S  RULE 


' dx 

Evaluate  the  integrals  A = | — , B = 


xe  1 dx. 


J = 


dx 


by  Simpson’s  rule  with  2m  as  indicated, 


„ 1 + x 

and  compare  with  the  exact  value  known  from  calculus. 
7.  A,  2m  = 4 8.  A,  2m  = 10 

9.  B,  2m  = 4 10.  B , 2m  = 10 

11.  J,  2m  = 4 12.  J,  2m  = 10 


13.  Error  estimate.  Compute  the  integral  J by  Simpson’s 
rule  with  2m  = 8 and  use  the  value  and  that  in  Prob. 
11  to  estimate  the  error  by  (10). 

14.  Error  bounds  and  estimate.  Integrate  e~x  from  0 to  2 
by  (7)  with  h — 1 and  with  h = 0.5.  Give  error  bounds 
for  the  li  = 0.5  value  and  an  error  estimate  by  (10). 

15.  Given  TOL.  Find  the  smallest  n in  computing  A (see 
Probs.  7 and  8)  such  that  5S-accuracy  is  guaranteed 
(a)  by  (4)  in  the  use  of  (2),  (b)  by  (9)  in  the  use  of  (7). 


16-21 


NONELEMENTARY  INTEGRALS 


The  following  integrals  cannot  be  evaluated  by  the  usual 
methods  of  calculus.  Evaluate  them  as  indicated.  Compare 
your  value  with  that  possibly  given  by  your  CAS.  Si  (A)  is 
the  sine  integral.  S(A)  and  C(x)  are  the  Fresnel  integrals. 
See  App.  A3.1.  They  occur  in  optics. 


Si(je) 


sin  x* 
x* 


dx*. 


S(x)  = sin  (x*  ) dx*. 


C(jc)  = cos  ( x * ) dx* 

'o 


26.  TEAM  PROJECT.  Romberg  Integration  (W.  Rom- 
berg, Norske  Videnskab.  Trondheim,  Fcjirh.  28,  Nr.  7, 
1955).  This  method  uses  the  trapezoidal  rule  and  gains 
precision  stepwise  by  halving  h and  adding  an  error 
estimate.  Do  this  for  the  integral  of  f(x)  = e~x  from 
x = 0 to  x = 2 with  TOL  = 10~3,  as  follows. 

Step  1.  Apply  the  trapezoidal  rule  (2)  with  h = 2 
(hence  n = 1)  to  get  an  approximation  Ji  j . Halve  h 
and  use  (2)  to  get  J21  and  an  error  estimate 

e2i  = „ (At  — At)- 
22  - 1 

If  |e2il  S TOL,  stop.  The  result  is  J22  = At  + e2i- 
Step  2.  Show  that  e21  = —0.066596,  hence 
|e21|  > TOL  and  go  on.  Use  (2)  with  h/4  to  get  731 
and  add  to  it  the  error  estimate  e34  = 3 (73j  — 72i)  to 
get  the  better  ,/32  = J31  + e31.  Calculate 

1 ,,  1 

€32  ~ y*  _ ] (A2  — J22)  ~ | j (^32  _ A 2)- 

If  |e32|  S TOL,  stop.  The  result  is  J33  = J32  + e32. 
(Why  does  24  = 16  come  in?)  Show  that  we  obtain 
e32  = —0.000266,  so  that  we  can  stop.  Arrange  your 
J-  and  e-values  in  a kind  of  “difference  table.” 


If  |e32|  were  greater  than  TOL,  you  would  have  to 
go  on  and  calculate  in  the  next  step  741  from  (2)  with 
h = 4;  then 


16.  Si  ( 1 ) by  (2),  n = 5,  n = 10,  and  apply  (5). 

17.  Si  ( 1)  by  (7),  2m  = 2,  2m  = 4 

18.  Obtain  a better  value  in  Prob.  17.  Hint.  Use  (10). 

19.  Si  ( 1)  by  (7),  2m  = 10 

20.  S(1.25)  by  (7),  2m  = 10 

21.  C(1.25)  by  (7),  2m  = 10 


22-25  GAUSS  INTEGRATION 

Integrate  by  (11)  with  n = 5: 

22.  cos  x from  0 to  \ tt 

23.  xe~x  from  0 to  1 

24.  sin  (x2)  from  0 to  1.25 

25.  exp  (—x2)  from  0 to  1 


J42  ~ At  + £41  with  e4i  = 3(i4i  - 73i) 

743  = J42  + £42  with  e42  = jj(742  — J32) 

A 4 = A 3 + £43  with  e43  = fi3  (As  — J33) 

where  63  = 26  — 1.  (How  does  this  come  in?) 

Apply  the  Romberg  method  to  the  integral 
of  /( x)  = 47rx4cos47rx  from  x = 0 to  2 with 
TOL  = 10-4 


27-30 


DIFFERENTIATION 


27.  Consider  f(x)  = jc4  for  Xq  = 0,  Xi  = 0.2,  x2  = 0.4, 
x3  = 0.6,  x4  = 0.8.  Calculate  f2  from  (14a),  (14b), 
(14c),  (15).  Determine  the  errors.  Compare  and 
comment. 
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28.  A “four-point  formula”  for  the  derivative  is 

A “ ^7  (-2/1  - 3/2  + 6/3  -/4). 

6/7 

Apply  it  to  /(x)  = x4  with  xi,  ■ ■ • , x4  as  in  Prob.  27, 
determine  the  error,  and  compare  it  with  that  in  the  case 
of  (15). 

29.  The  derivative  f\x)  can  also  be  approximated  in 
terms  of  first-order  and  higher  order  differences  (see 
Sec.  19.3): 


/'(x 0)  » \ (a/o  - 2 A2/0 

+ g A3/0  - - A4/0  + j . 

Compute  f'(0A)  in  Prob.  27  from  this  formula,  using 
differences  up  to  and  including  first  order,  second 
order,  third  order,  fourth  order. 

30.  Derive  the  formula  in  Prob.  29  from  (14)  in  Sec.  19.3. 


3^gagE^E33ISBigBiS^^BEESTIONS  AND  PROBLEMS 


1.  What  is  a numeric  method?  How  has  the  computer 
influenced  numerics? 

2.  What  is  an  error?  A relative  error?  An  error  bound? 

3.  Why  are  roundoff  errors  important?  State  the  rounding 
rules. 

4.  What  is  an  algorithm?  Which  of  its  properties  are 
important  in  software  implementation? 

5.  What  do  you  know  about  stability? 

6.  Why  is  the  selection  of  a good  method  at  least  as 
important  on  a large  computer  as  it  is  on  a small  one? 

7.  Can  the  Newton  (-Raphson)  method  diverge?  Is  it  fast? 
Same  questions  for  the  bisection  method. 

8.  What  is  fixed-point  iteration? 

9.  What  is  the  advantage  of  Newton’s  interpolation 
formulas  over  Lagrange’s? 

10.  What  is  spline  interpolation?  Its  advantage  over 
polynomial  interpolation? 

11.  List  and  compare  the  integration  methods  we  have 
discussed. 

12.  How  did  we  use  an  interpolation  polynomial  in  deriving 
Simpson’s  rule? 

13.  What  is  adaptive  integration?  Why  is  it  useful? 

14.  In  what  sense  is  Gauss  integration  optimal? 

15.  How  did  we  obtain  formulas  for  numeric  differentiation? 

16.  Write  -46.9028104,0.000317399,54/7,-890/3  in 
floating-point  form  with  5S  (5  significant  digits, 
properly  rounded). 

17.  Compute  (5.346  — 3.644)/(3.444  — 3.055)  as  given 
and  then  rounded  stepwise  to  3S,  2S,  IS.  Comment. 
(“Stepwise”  means  rounding  the  rounded  numbers,  not 
the  given  ones.) 

18.  Compute  0.38755/(5.6815  — 0.38419)  as  given  and 
then  rounded  stepwise  to  4S,  3S,  2S,  IS.  Comment. 

19.  Let  19.1  and  25.84  be  correctly  rounded.  Find  the 
shortest  interval  in  which  the  sum  s of  the  true 
(unrounded)  numbers  must  lie. 


20.  Do  the  same  task  as  in  Prob.  19  for  the  difference 
3.2  - 6.29. 

21.  What  is  the  relative  error  of  na  in  terms  of  that  of  3? 

22.  Show  that  the  relative  error  of  d2  is  about  twice  that 
of  a. 

23.  Solve  x2  — 40x  + 2 = 0 in  two  ways  (cf.  Sec.  19.1). 
Use  4S-arithmetic. 

24.  Solve  x2  - 100.x  +1=0.  Use  5S-arithmetic. 

25.  Compute  the  solution  of  x4  = x + 0.1  near  x = 0 by 
transforming  the  equation  algebraically  to  the  form 
x = g(x)  and  starting  from  Xo  = 0. 

26.  Solve  cosx  = x2  by  Newton’s  method,  starting  from 
x = 0.5. 

27.  Solve  Prob.  25  by  bisection  (3S-accuracy). 

28.  Compute  sinh  0.4  from  sinh  0,  sinh  0.5  = 0.521, 
sinh  1.0  = 1.175  by  quadratic  interpolation. 

29.  Find  the  cubic  spline  for  the  data/(0)  = 0, /(l)  = 0, 

/(2)  = 4,*0  = -1,*2  = 5. 

30.  Find  the  cubic  spline  q and  the  interpolation  polynomial 
p for  the  data  (0,  0),  (1,  1),  (2,  6),  (3,  10),  with 
q ( 0)  = 0,  q\ 3)  = 0 and  graph  p and  q on  common 
axes. 

31.  Compute  the  integral  of  x3  from  0 to  1 by  the 
trapezoidal  rule  with  n = 5.  What  error  bounds  are 
obtained  from  (4)  in  Sec.  19.5?  What  is  the  actual  error 
of  the  result? 

32.  Compute  the  integral  of  cos  (x2)  from  0 to  1 by 
Simpson’s  rule  with  2m  = 4. 

33.  Solve  Prob.  32  by  Gauss  integration  with  n = 3 and 
n = 5. 

34.  Compute/ZO^)  for/(x)  = x3  using  (14b)  in  Sec.  19.5 
with  (a)  h = 0.2,  (b)  h = 0.1.  Compare  the  accuracy. 

35.  Compute /”(0. 2)  for /(x)  = x3  using  (13)  in  Sec.  19.5 
with  (a)  h = 0.2,  (b)  h = 0.1. 
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SUMMARY  OF  CHAPTER  19 

Numerics  in  General 


In  this  chapter  we  discussed  concepts  that  are  relevant  throughout  numeric  work  as 
a whole  and  methods  of  a general  nature,  as  opposed  to  methods  for  linear  algebra 
(Chap.  20)  or  differential  equations  (Chap.  21). 

In  scientific  computations  we  use  the  floating-point  representation  of  numbers 
(Sec.  19.1);  fixed-point  representation  is  less  suitable  in  most  cases. 

Numeric  methods  give  approximate  values  a of  quantities.  The  error  e of  a is 

(1)  e = a - a (Sec.  19.1) 

where  a is  the  exact  value.  The  relative  error  of  a is  e/a.  Errors  arise  from  rounding, 
inaccuracy  of  measured  values,  truncation  (that  is,  replacement  of  integrals  by  sums, 
series  by  partial  sums),  and  so  on. 

An  algorithm  is  called  numerically  stable  if  small  changes  in  the  initial  data  give 
only  correspondingly  small  changes  in  the  final  results.  Unstable  algorithms  are 
generally  useless  because  errors  may  become  so  large  that  results  will  be  very 
inaccurate.  The  numeric  instability  of  algorithms  must  not  be  confused  with  the 
mathematical  instability  of  problems  (“ ill-conditioned  problems ,”  Sec.  19.2). 

Fixed-point  iteration  is  a method  for  solving  equations  fix ) = 0 in  which  the 
equation  is  first  transformed  algebraically  to  x = g(x),  an  initial  guess  xo  for  the 
solution  is  made,  and  then  approximations  xi,X2,'",  are  successively  computed 
by  iteration  from  (see  Sec.  19.2) 

(2)  xn+1  = gixn)  in  = 0,  1,  • • • ). 

Newton’s  method  for  solving  equations  fix)  = 0 is  an  iteration 

f\xn) 

(3)  xn+1  = xn  - — — - (Sec.  19.2). 

/ C*n) 

Here  xn+\  is  the  x-intercept  of  the  tangent  of  the  curve  y = fix)  at  the  point  xn. 
This  method  is  of  second  order  (Theorem  2,  Sec.  19.2).  If  we  replace  / in  (3)  by 
a difference  quotient  (geometrically:  we  replace  the  tangent  by  a secant),  we  obtain 
the  secant  method;  see  (10)  in  Sec.  19.2.  For  the  bisection  method  (which  converges 
slowly)  and  the  method  of  false  position,  see  Problem  Set  19.2. 

Polynomial  interpolation  means  the  determination  of  a polynomial  pnix)  such 
that  pnixj)  = f,  where  y = 0,  • ■ • , n and  (xo,/o)>  • ■ ■ , ixn,fn)  are  measured  or 
observed  values,  values  of  a function,  etc . pn(x)  is  called  an  interpolation  polynomial. 
For  given  data,  pnix)  of  degree  n (or  less)  is  unique.  However,  it  can  be  written  in 
different  forms,  notably  in  Lagrange’s  form  (4),  Sec.  19.3,  or  in  Newton’s  divided 
difference  form  (10),  Sec.  19.3,  which  requires  fewer  operations.  For  regularly 
spaced  Xq,  X\  = Xq  + h,-  ■ ■ , xn  = Xq  + nh  the  latter  becomes  Newton’s  forward 
difference  formula  (formula  (14)  in  Sec.  19.3): 
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r(r  — 1 )■  ■ ■ (r  — n + 1) 

(4)  f(x)  ~ pn(x ) = fo  + r A/q  + •••-( : A f0 

n\ 


where  r = (x  — xo)/h  and  the  forward  differences  are  A fj  = fa+i  — fj  and 


A kfj  = A fc"1J5+1  - Afc-^-  (*  = 2,  3,  • • • )■ 


A similar  formula  is  Newton ’s  backward  difference  interpolation  formula  (formula 
(18)  in  Sec.  19.3). 

Interpolation  polynomials  may  become  numerically  unstable  as  n increases,  and 
instead  of  interpolating  and  approximating  by  a single  high-degree  polynomial  it  is 
preferable  to  use  a cubic  spline  g(x),  that  is,  a twice  continuously  differentiable 
interpolation  function  [thus,  g(xj)  = fj],  which  in  each  subinterval  xj  = x Si  Xj+ y 
consists  of  a cubic  polynomial  qj(x);  see  Sec.  19.4. 

Simpson’s  rule  of  numeric  integration  is  [see  (7),  Sec.  19.5] 


(5) 


rb 

f(x)  dx  ~ 


h 

3 


(fo  + 4/i  + 2/2  + 4/3  + • • • + 2/2m_2  + 4f2m-i  + /2m) 


with  equally  spaced  nodes  xj  = Xo  + jh,j  = 1,  ■ • ■ , 2m,  h = ib  — a)/ (2m),  and 
fj  = f(xj).  It  is  simple  but  accurate  enough  for  many  applications.  Its  degree  of 
precision  is  DP  = 3 because  the  error  (8),  Sec.  19.5,  involves  h4.  A more  practical 
error  estimate  is  (10),  Sec.  19.5, 

€h/2  = 13  (Jh/2  ~ Jh ). 

obtained  by  first  computing  with  step  h,  then  with  step  h/ 2,  and  then  taking  yg  of 
the  difference  of  the  results. 

Simpson’s  rule  is  the  most  important  of  the  Newton-Cotes  formulas,  which  are 
obtained  by  integrating  Lagrange  interpolation  polynomials,  linear  ones  for  the 
trapezoidal  rule  (2),  Sec.  19.5.  quadratic  for  Simpson’s  rule,  cubic  for  the  three- 
eights  rule  (see  the  Chap.  19  Review  Problems),  etc. 

Adaptive  integration  (Sec.  19.5,  Example  6)  is  integration  that  adjusts 
(“adapts")  the  step  (automatically)  to  the  variability  of f(x). 

Romberg  integration  (Team  Project  26,  Problem  Set  19.5)  starts  from  the 
trapezoidal  rule  (2),  Sec.  19.5,  with  h,  h/2,  hj 4,  etc.  and  improves  results  by 
systematically  adding  error  estimates. 

Gauss  integration  (11),  Sec.  19.5,  is  important  because  of  its  great  accuracy 
(DP  = 2 n — 1,  compared  to  Newton-Cotes’ s DP  = n — 1 or  n).  This  is  achieved 
by  an  optimal  choice  of  the  nodes,  which  are  not  equally  spaced;  see  Table  19.7, 
Sec.  19.5. 

Numeric  differentiation  is  discussed  at  the  end  of  Sec.  19.5.  (Its  main  application 
(to  differential  equations)  follows  in  Chap.  21.) 


CHAPTER 


Numeric  Linear  Algebra 


This  chapter  deals  with  two  main  topics.  The  first  topic  is  how  to  solve  linear  systems  of 
equations  numerically.  We  start  with  Gauss  elimination,  which  may  be  familiar  to  some 
readers,  but  this  time  in  an  algorithmic  setting  with  partial  pivoting.  Variants  of  this  method 
(Doolittle,  Crout,  Cholesky,  Gauss-Jordan)  are  discussed  in  Sec.  20.2.  All  these  methods 
are  direct  methods,  that  is,  methods  of  numerics  where  we  know  in  advance  how  many 
steps  they  will  take  until  they  arrive  at  a solution.  However,  small  pivots  and  roundoff 
error  magnification  may  produce  nonsensical  results,  such  as  in  the  Gauss  method.  A shift 
occurs  in  Sec.  20.3,  where  we  discuss  numeric  iteration  methods  or  indirect  methods  to 
address  our  first  topic.  Here  we  cannot  be  totally  sure  how  many  steps  will  be  needed  to 
arrive  at  a good  answer.  Several  factors — such  as  how  far  is  the  starting  value  from  our 
initial  solution,  how  is  the  problem  structure  influencing  speed  of  convergence,  how 
accurate  would  we  like  our  result  to  be — determine  the  outcome  of  these  methods. 
Moreover,  our  computation  cycle  may  not  converge.  Gauss-Seidel  iteration  and  Jacobi 
iteration  are  discussed  in  Sec.  20.3.  Section  20.4  is  at  the  heart  of  addressing  the  pitfalls 
of  numeric  linear  algebra.  It  is  concerned  with  problems  that  are  ill-conditioned.  We  learn 
to  estimate  how  “bad”  such  a problem  is  by  calculating  the  condition  number  of  its  matrix. 

The  second  topic  (Secs.  20.6-20.9)  is  how  to  solve  eigenvalue  problems  numerically. 
Eigenvalue  problems  appear  throughout  engineering,  physics,  mathematics,  economics, 
and  many  areas.  For  large  or  very  large  matrices,  determining  the  eigenvalues  is  difficult 
as  it  involves  finding  the  roots  of  the  characteristic  equations,  which  are  high-degree 
polynomials.  As  such,  there  are  different  approaches  to  tackling  this  problem.  Some 
methods,  such  as  Gerschgorin’s  method  and  Collatz’s  method  only  provide  a range  in 
which  eigenvalues  lie  and  thus  are  known  as  inclusion  methods.  Others  such  as 
tridiagonalization  and  QR-factorization  actually  find  all  the  eigenvalues.  The  area  is  quite 
ingeneous  and  should  be  fascinating  to  the  reader. 

COMMENT  This  chapter  is  independent  of  Chap.  19  and  can  be  studied  immediately 
after  Chap.  7 or  8. 

Prerequisite:  Secs.  7.1,  7.2,  8.1. 

Sections  that  may  be  omitted  in  a shorter  course:  20.4,  20.5,  20.9. 

References  and  Answers  to  Problems:  App.  1 Part  E,  App.  2. 

20.1  Linear  Systems:  Gauss  Elimination 
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The  basic  method  for  solving  systems  of  linear  equations  by  Gauss  elimination  and  back 
substitution  was  explained  in  Sec.  7.3.  If  you  covered  Sec.  7.3,  you  may  wonder  why  we 
cover  Gauss  elimination  again.  The  reason  is  that  here  we  cover  Gauss  elimination  in  the 
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setting  of  numerics  and  introduce  new  material  such  as  pivoting,  row  scaling,  and  operation 
count.  Furthermore,  we  give  an  algorithmic  representation  of  Gauss  elimination  in  Table  20. 1 
that  can  be  readily  converted  into  software.  We  also  show  when  Gauss  elimination  runs 
into  difficulties  with  small  pivots  and  what  to  do  about  it.  The  reader  should  pay  close 
attention  to  the  material  as  variants  of  Gauss  elimination  are  covered  in  Sec.  20.2  and, 
furthermore,  the  general  problem  of  solving  linear  systems  is  the  focus  of  the  first  half  of 
this  chapter. 

A linear  system  of  n equations  in  n unknowns  x±,  ■ ■ • , xn  is  a set  of  equations 
Ei,  • ■ • , En  of  the  form 


(1) 


Ei: 

anxi  + ' ' 

"i"  b i 

E2: 

«2Ul  + ' ' 

+ Cl2nX  n ^2 

E^,.  flwlAl  U * T t^nn^n  ^ n 


where  the  coefficients  ayj,  and  the  h3  are  given  numbers.  The  system  is  called  homogeneous 
if  all  the  bj  are  zero;  otherwise  it  is  called  nonhomogeneous.  Using  matrix  multiplication 
(Sec.  7.2),  we  can  write  (1)  as  a single  vector  equation 

(2)  Ax  = b 

where  the  coefficient  matrix  A = [ ajk]  is  the  n X n matrix 


an 

a12 

d\n 

xi 

1 

1 

«21 

a22 

a2n 

, and  x = 

and  b = 

@nl 

an2 

dnn 

xn 

^71 

are  column  vectors.  The  following  matrix  A is  called  the  augmented  matrix  of  the 
system  (1): 


A = [A  b] 


an 

al  n 

h 

021 

a2n 

^2 

®nl 

Clyin 

A solution  of  (1)  is  a set  of  numbers  x\,  ■ ■ ■ , xn  that  satisfy  all  the  n equations,  and  a 
solution  vector  of  (1)  is  a vector  x whose  components  constitute  a solution  of  (1). 

The  method  of  solving  such  a system  by  determinants  (Cramer’s  rule  in  Sec.  7.7)  is 
not  practical,  even  with  efficient  methods  for  evaluating  the  determinants. 

A practical  method  for  the  solution  of  a linear  system  is  the  so-called  Gauss  elimination, 
which  we  shall  now  discuss  ( proceeding  independently  of  Sec.  7.3). 
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EXAMPLE  1 


Gauss  Elimination 

This  standard  method  for  solving  linear  systems  (1)  is  a systematic  process  of  elimination 
that  reduces  (1)  to  triangular  form  because  the  system  can  then  be  easily  solved  by  back 
substitution.  For  instance,  a triangular  system  is 

3xi  + 5x2  + 2x3  = 8 

8x2  + 2x3  = ~ 7 
6x3  = 3 

3 1 

and  back  substitution  gives  X3  = g = 2 from  the  third  equation,  then 

*2  = s(— 7 - 2x3)  = -1 

from  the  second  equation,  and  finally  from  the  first  equation 

xi  = 3(8  - 5x2  - 2x3)  = 4. 

How  do  we  reduce  a given  system  (1)  to  triangular  form?  In  the  first  step  we  eliminate 
X\  from  equations  E2  to  E,„  in  (1).  We  do  this  by  adding  (or  subtracting)  suitable  multi- 
ples of  Ei  to  (from)  equations  E2,  ■ • ■ , E„  and  taking  the  resulting  equations,  call  them 
El,  • • • , E'n  as  the  new  equations.  The  first  equation,  Ei,  is  called  the  pivot  equation  in 
this  step,  and  an  is  called  the  pivot.  This  equation  is  left  unaltered.  In  the  second  step 
we  take  the  new  second  equation  E|  (which  no  longer  contains  xi)  as  the  pivot  equation 
and  use  it  to  eliminate  x%  from  E3  to  E (,.  And  so  on.  After  n — 1 steps  this  gives  a 
triangular  system  that  can  be  solved  by  back  substitution  as  just  shown.  In  this  way  we 
obtain  precisely  all  solutions  of  the  given  system  (as  proved  in  Sec.  7.3). 

The  pivot  ap±  (in  step  k)  must  be  different  from  zero  and  should  be  large  in  absolute 
value  to  avoid  roundoff  magnification  by  the  multiplication  in  the  elimination.  For  this 
we  choose  as  our  pivot  equation  one  that  has  the  absolutely  largest  ajk  in  column  k on  or 
below  the  main  diagonal  (actually,  the  uppermost  if  there  are  several  such  equations).  This 
popular  method  is  called  partial  pivoting.  It  is  used  in  CASs  (e.g.,  in  Maple). 

Partial  pivoting  distinguishes  it  from  total  pivoting,  which  involves  both  row  and 
column  interchanges  but  is  hardly  used  in  practice. 

Let  us  illustrate  this  method  with  a simple  example. 

Gauss  Elimination.  Partial  Pivoting 

Solve  the  system 

Ei:  8*2  + 2*3  = -7 

E2:  3*i  + 5^2  + 2x3  = 8 

E3:  6x1  + 2x2  + 8x3  = 26. 

Solution.  We  must  pivot  since  Ei  has  no  xi-term.  In  Column  1,  equation  E3  has  the  largest  coefficient. 
Hence  we  interchange  Ei  and  E3, 


6x1  + 2x2  + 8x3  = 26 
3xi  + 5x2  + 2x3  = 8 

8x2  + 2x3  = -7. 
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Step  1.  Elimination  of  X\ 

It  would  suffice  to  show  the  augmented  matrix  and  operate  on  it.  We  show  both  the  equations  and  the  augmented 
matrix.  In  the  first  step,  the  first  equation  is  the  pivot  equation.  Thus 


Pivot  6 — 
Eliminate 


>xi)+  2^2  + 8x3  = 26 

5x2  + 2x3  = 8 

8x2  + 2x3  = -7 


3x 


6 2 8 I 26 

I 

3 5 2 | 8 

I 

0 8 21-7 


To  eliminate  x\  from  the  other  equations  (here,  from  the  second  equation),  do: 

Subtract  § = \ times  the  pivot  equation  from  the  second  equation. 

The  result  is 


6xi  + 2x2  + 8x3  = 

26 

6 

2 

8 I 

l 

26 

4x2  - 2*3  = 

-5 

0 

4 

”2  | 
I 

-5 

8x2  + 2x3  = 

-7 

0 

8 

1 

2 1 

-7 

Step  2.  Elimination  of  x% 

The  largest  coefficient  in  Column  2 is  8.  Hence  we  take  the  new  third  equation  as  the  pivot  equation,  interchanging 
equations  2 and  3, 


6xi  + 2x2 

+ 8x3  = 

26 

6 

2 

8 1 26 

Pivot  8 

— * 

S 

)+  2.X3  = 

-7 

0 

8 

2 -7 

Eliminate  — 

> 

4x2 

- 2x3  = 

-5 

_0 

4 

1 

K> 

1 

Gn 

1 

To  eliminate  x2  from  the  third  equation,  do: 

Subtract  | times  the  pivot  equation  from  the  third  equation. 


The  resulting  triangular  system  is  shown  below.  This  is  the  end  of  the  forward  elimination.  Now  comes  the  back 
substitution. 

Back  substitution.  Determination  of  x3,  x2,  Xj 
The  triangular  system  obtained  in  Step  2 is 


6x1  + 2x2 

+ 8x3  = 26 

~6 

2 

8 1 26 
1 

00 

X 

to 

+ 2x3  = -7 

0 

8 

2 | -7 

“ 3x3  = -1 

0 

0 

3 ' 2J 

From  this  system,  taking  the  last  equation,  then  the  second  equation,  and  finally  the  first  equation,  we  compute 
the  solution 


*2  = g(-7  - 2x3)  = -1 
*1  = s (26  — 2x2  — 8x3)  = 4. 

This  agrees  with  the  values  given  above,  before  the  beginning  of  the  example. 


The  general  algorithm  for  the  Gauss  elimination  is  shown  in  Table  20.1.  To  help  explain 
the  algorithm,  we  have  numbered  some  of  its  lines,  h is  denoted  by  cun+ 1,  for  uniformity. 
In  lines  1 and  2 we  look  for  a possible  pivot.  [For  k = 1 we  can  always  find  one;  otherwise 
X\  would  not  occur  in  (1).]  In  line  2 we  do  pivoting  if  necessary,  picking  an  ajk  of  greatest 
absolute  value  (the  one  with  the  smallest  j if  there  are  several)  and  interchange  the 
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EXAMPLE  2 


EXAMPLE  3 


corresponding  rows.  If  |(7/c/c|  is  greatest,  we  do  no  pivoting,  in.jp-  in  line  4 suggests 
multiplier,  since  these  are  the  factors  by  which  we  have  to  multiply  the  pivot  equation 
in  Step  k before  subtracting  it  from  an  equation  E*  below  Ep  from  which  we  want  to 
eliminate  xp.  Here  we  have  written  E%  and  E*  to  indicate  that  after  Step  1 these  are  no 
longer  the  equations  given  in  (1),  but  these  underwent  a change  in  each  step,  as  indicated 
in  line  5.  Accordingly,  a3p  etc.  in  all  lines  refer  to  the  most  recent  equations,  and  j = k 
in  line  1 indicates  that  we  leave  untouched  all  the  equations  that  have  served  as  pivot 
equations  in  previous  steps.  For  p = k in  line  5 we  get  0 on  the  right,  as  it  should  be  in 
the  elimination. 


tljk 

ajk  mjk®kk  @jk  ..  akk  0- 
l,kk 

In  line  3,  if  the  last  equation  in  the  triangular  system  is  0 = bn  A 0,  we  have  no 
solution.  If  it  is  0 = = 0,  we  have  no  unique  solution  because  we  then  have  fewer 

equations  than  unknowns. 


Gauss  Elimination  in  Table  20.1,  Sample  Computation 

In  Example  1 we  had  an  = 0,  so  that  pivoting  was  necessary.  The  greatest  coefficient  in  Column  1 was  a^i- 
Thus  j = 3 in  line  2,  and  we  interchanged  Ei  and  E3.  Then  in  lines  4 and  5 we  computed  m 21  — 5 = 2 and 

c/22  = 5 — 2 * 2 = 4,  <223  — 2 — 2 * 8 = — 2,  <224  ~ 8 — 2 * 26  = — 5, 

and  then  m^i  = § =0,  so  that  the  third  equation  8x2  + 2x3  = — 7 did  not  change  in  Step  1.  In  Step  2 (k  = 2) 
we  had  8 as  the  greatest  coefficient  in  Column  2,  hence  j = 3.  We  interchanged  equations  2 and  3,  computed 
m32  = — | in  line  5,  and  the  c/33  = —2  — ^ • 2 = —3,  c/34  = — 5 — g(— 7)  — — §.  This  produced  the 

triangular  form  used  in  the  back  substitution. 


If  app  = 0 in  Step  k,  we  must  pivot.  If  | app  \ is  small,  we  should  pivot  because  of  roundoff 
error  magnification  that  may  seriously  affect  accuracy  or  even  produce  nonsensical 
results. 


Difficulty  with  Small  Pivots 

The  solution  of  the  system 


0.0004X!  + 1.402x2  = 1.406 
0.4003X!  - 1.502x2  = 2.501 

is  xi  = 10,  X2  — 1.  We  solve  this  system  by  the  Gauss  elimination,  using  four-digit  floating-point  arithmetic. 
(4D  is  for  simplicity.  Make  an  8D-arithmetic  example  that  shows  the  same.) 

(a)  Picking  the  first  of  the  given  equations  as  the  pivot  equation,  we  have  to  multiply  this  equation  by 
m = 0.4003/0.0004  = 1001  and  subtract  the  result  from  the  second  equation,  obtaining 

-1405x2  - -1404. 

Hence  X2  — — 1404/(— 1405)  = 0.9993,  and  from  the  first  equation,  instead  of  Xi  = 10,  we  get 

1 0.005 

x 1 = (1.406  - 1.402  • 0.9993)  = - 12.5. 

0.0004  0.0004 


This  failure  occurs  because  |<2n|  is  small  compared  with  Ic/^l,  so  that  a small  roundoff  error  in  x'2  leads  to  a 
large  error  in  xi. 
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(b)  Picking  the  second  of  the  given  equations  as  the  pivot  equation,  we  have  to  multiply  this  equation  by 
0.0004/0.4003  = 0.0009993  and  subtract  the  result  from  the  first  equation,  obtaining 

1.404*2  = 1-404. 

Hence  x2  = 1,  and  from  the  pivot  equation  x i = 10.  This  success  occurs  because  |a2il  is  not  very  small 
compared  to  |a22|,  so  that  a small  roundoff  error  in  x2  would  not  lead  to  a large  error  in  X\.  Indeed,  for 
instance,  if  we  had  the  value  x2  = 1.002,  we  would  still  have  from  the  pivot  equation  the  good  value 
= (2.501  + 1.505)/0.4003  = 10.01.  ■ 


Table  20  Gauss  Elimination 


ALGORITHM  GAUSS  (A  = [ajk]  = [A  b]) 

This  algorithm  computes  a unique  solution  x = [x,]  of  the  system  (1)  or  indicates  that 
(1)  has  no  unique  solution. 

INPUT:  Augmented  n X (n  + 1)  matrix  A = [cijk],  where  fljn+1  = bj 

OUTPUT:  Solution  x = [x,]  of  (1)  or  message  that  the  system  (1)  has  no 

unique  solution 
For  k = 1 1,  do: 

m = k 

Fory  = k + 1,  • • ■ , n,  do: 

If  ( | dink.  I ' ~ I @jk  I ) then  in  j 

End 

If  amk  = 0 then  OUTPUT  “No  unique  solution  exists” 

Stop 

[ Procedure  completed  unsuccessfully \ 

Else  exchange  row  k and  row  m 

If  ann  = 0 then  OUTPUT  “No  unique  solution  exists.” 

Stop 
Else 

For  j = k + 1,  • • • , n,  do: 
ajk 

m*:  = Okk 

For  p = k + 1,  • • • , n + 1,  do: 

&jp-  Cljp  Wljk^kp 

End 


End 


End 


a, 


X v). 


n,n+ 1 


Clyi 


[Start  back  substitution ] 


■*nn 

For  i = n — 1,  • • • , 1,  do: 

- =J_f  _ v 

x%  l &i  n+ 1 


End 

OUTPUT  x = [Xj\.  Stop 
End  GAUSS 


j=i+ 1 
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Error  estimates  for  the  Gauss  elimination  are  discussed  in  Ref.  [E5]  listed  in  App.  1. 

Row  scaling  means  the  multiplication  of  each  Row  j by  a suitable  scaling  factor  Sj.  It  is 
done  in  connection  with  partial  pivoting  to  get  more  accurate  solutions.  Despite  much 
research  (see  Refs.  [E9],  [E24]  in  App.  1)  and  the  proposition  of  several  principles,  scaling 
is  still  not  well  understood.  As  a possibility,  one  can  scale  for  pivot  choice  only  (not  in 
the  calculation,  to  avoid  additional  roundoff)  and  take  as  first  pivot  the  entry  aji  for  which 
c/j!  | / 1 Aj  is  largest;  here  Aj  is  an  entry  of  largest  absolute  value  in  Row  j.  Similarly  in 
the  further  steps  of  the  Gauss  elimination. 

For  instance,  for  the  system 

4.0000.ri  + 14020x2  = 14060 
0.4003X!  - 1.502x2  = 2.501 

we  might  pick  4 as  pivot,  but  dividing  the  first  equation  by  104  gives  the  system  in 
Example  3,  for  which  the  second  equation  is  a better  pivot  equation. 

Operation  Count 

Quite  generally,  important  factors  in  judging  the  quality  of  a numeric  method  are 
Amount  of  storage 

Amount  of  time  (=  number  of  operations) 

Effect  of  roundoff  error 

For  the  Gauss  elimination,  the  operation  count  for  a full  matrix  (a  matrix  with  relatively 
many  nonzero  entries)  is  as  follows.  In  Step  k we  eliminate  x j.  from  n — k equations. 
This  needs  n — k divisions  in  computing  the  (line  3)  and  ( n — k){n  — k + 1) 
multiplications  and  as  many  subtractions  (both  in  line  4).  Since  we  do  n — 1 steps,  k 
goes  from  1 to  n — 1 and  thus  the  total  number  of  operations  in  this  forward 
elimination  is 

77,-1  77,-1 

f(n)  = 2 (n  ~ k)  + 2 2 («  — k)(n  — 7+1)  (write  n — k = s) 

k= 1 k= 1 

77—1  77—1 

= 2s  + 2^s(s+  l)  = j(n-  1 )n  + § («2  - 1 )n  ~ § n3 

S= 1 S=1 

where  2h3/3  is  obtained  by  dropping  lower  powers  of  n.  We  see  that  fin)  grows  about 
3 3 

proportional  to  n . We  say  that/fn)  is  of  order  n and  write 

f(fi)  = 0(n3) 

where  O suggests  order.  The  general  definition  of  O is  as  follows.  We  write 

f(n)  = 0(h(n)) 

if  the  quotients  \f(n)/h(n)\  and  \h(n)/f(n)\  remain  bounded  (do  not  trail  off  to  infinity) 
as  oo.  In  our  present  case,  h(n)  = n and,  indeed.  f(n)/n  — » 3 because  the  omitted 

Q 

terms  divided  by  n go  to  zero  as  n — > °o. 
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In  the  back  substitution  of  x j we  make  n — i multiplications  and  as  many  subtractions, 
as  well  as  1 division.  Hence  the  number  of  operations  in  the  back  substitution  is 

n n 

b(n)  = 2 ^ (n  — i)  + n = 2^  s + n = n(n  + 1)  + n = nz  + 2 n = 0(n2). 

i= 1 s=  1 

We  see  that  it  grows  more  slowly  than  the  number  of  operations  in  the  forward  elimination 
of  the  Gauss  algorithm,  so  that  it  is  negligible  for  large  systems  because  it  is  smaller  by 
a factor  n,  approximately.  For  instance,  if  an  operation  takes  10-9  sec,  then  the  times 
needed  are: 


Algorithm 

n = 1000 

n = 10000 

Elimination 

0.7  sec 

1 1 min 

Back  substitution 

0.001  sec 

0.1  sec 

APPLICATIONS  of  linear  systems  see  Secs.  7.1  and  8.2.  7.  — 3xi  + 6x2  — 9x3  = —46.725 


1-3  GEOMETRIC  INTERPRETATION 

Solve  graphically  and  explain  geometrically. 

1.  xi  — 4x2  = 20.1 
3xi  + 5x2  = 5.9 

2.  — 5.00.x  x + 8.40x2  = 0 
10.25X!  - 17.22x2  = 0 

3.  1.2x1  - 3.5x2  = 16.0 
— 14.4x ! + 7.0x2  = 31.0 


4-16 


GAUSS  ELIMINATION 


Solve  the  following  linear  systems  by  Gauss  elimination, 
with  partial  pivoting  if  necessary  (but  without  scaling).  Show 
the  intermediate  steps.  Check  the  result  by  substitution.  If  no 
solution  or  more  than  one  solution  exists,  give  a reason. 


4.  6x!  + x2  = — 3 

4xj  — 2x2  = 6 

5.  2xi  — 8x2  = —4 

3xi  + x2  = 7 

6.  25.38.X!  - 15.48.xa  = 30.60 

-14.10x2+  8.60x2  = -17.00 


x 2 — 4x2  + 3x3  = 19.571 

2x2  + 5x2  — 7x3  = —20.073 

8.  5x2  + 3x2  + x3  = 2 

— 4x2  + 8x3  = —3 
IO.X2  — 6x2  + 26x3  = 0 

9.  6x2  + 13x3  = 137.86 

6x2  8x3  = 85.88 

13xi  - 8x2  = 178.54 

10.  4xi  + 4x2  + 2x3  = 0 
3xi  — x2  + 2x3  = 0 
3xi  + 7x2  + x3  = 0 

11.  3.4xi  - 6.12x2  - 2.72x3  = 0 

— Xi  + 1.80x2  + 0.80x3  = 0 
2.7xi  — 4.86x2  + 2.16x3  = 0 

12.  5x’i  + 3x2  + x3  = 2 

— 4x2  + 8x3  = —3 
10xi  — 6x2  + 26x3  = 0 
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13.  3*2  + 5x3  = 1.20736 

3*,  - 4v2  = -2.34066 

5*i  + 6*3  = -0.329193 

14.  —47*i  + 4*2  — 7*3  = — 118 

19*i  — 3*2  + 2*3  = 43 

— 15*1  + 5*2  = —25 

15.  2.2*2  + 1-5*3  — 3.3*4  = —9.30 

0.2*i  + 1.8*2  + 4.2*4  = 9-24 

— *i  — 3.1*2  + 2.5*3  = —8.70 

0.5*i  — 3.8*3  + 1-5*4  = 11-94 

16.  3.2*i  + 1.6*2  = —0-8 

1.6*i  — 0.8*2  + 2.4*3  = 16-0 

2.4*2  — 4.8*3  + 3.6*4  = —39.0 
3.6*3  + 2.4*4  = 10-2 

17.  CAS  EXPERIMENT.  Gauss  Elimination.  Write  a 
program  for  the  Gauss  elimination  with  pivoting. 
Apply  it  to  Probs.  13-16.  Experiment  with  systems 
whose  coefficient  determinant  is  small  in  absolute 
value.  Also  investigate  the  performance  of  your 
program  for  larger  systems  of  your  choice,  including 
sparse  systems. 

18.  TEAM  PROJECT.  Linear  Systems  and  Gauss 
Elimination,  (a)  Existence  and  uniqueness.  Find  a 

and  b such  that  a*i  + *2  = b,  X\  + *2  = 3 has  (i)  a 
unique  solution,  (ii)  infinitely  many  solutions,  (iii)  no 
solutions. 

(b)  Gauss  elimination  and  nonexistence.  Apply  the 
Gauss  elimination  to  the  following  two  systems  and 


compare  the  calculations  step  by  step.  Explain  why  the 
elimination  fails  if  no  solution  exists. 

*i  + *2  + *3  = 3 

4*i  + 2*2  — *3  = 5 

9*i  + 5*2  — *3  = 13 

*i  + *2  + *3  = 3 
4*i  + 2*2  — *3  = 5 
9*i  + 5*2  — *3  = 12. 

(c)  Zero  determinant.  Why  may  a computer  program 
give  you  the  result  that  a homogeneous  linear  system 
has  only  the  trivial  solution  although  you  know  its 
coefficient  determinant  to  be  zero? 

(d)  Pivoting.  Solve  System  (A)  (below)  by  the  Gauss 
elimination  first  without  pivoting.  Show  that  for  any 
fixed  machine  word  length  and  sufficiently  small  e > 0 
the  computer  gives  *2  = 1 and  then  X\  = 0.  What 
is  the  exact  solution?  Its  limit  as  e — » 0?  Then  solve 
the  system  by  the  Gauss  elimination  with  pivoting. 
Compare  and  comment. 

(e)  Pivoting.  Solve  System  (B)  by  the  Gauss  elimination 
and  three-digit  rounding  arithmetic,  choosing  (i)  the  first 
equation,  (ii)  the  second  equation  as  pivot  equation. 
(Remember  to  round  to  3S  after  each  operation  before 
doing  the  next,  just  as  would  be  done  on  a computer!) 
Then  use  four-digit  rounding  arithmetic  in  those  two 
calculations.  Compare  and  comment. 

(A)  e*i  +*2=1 

*i  + *2  = 2 

(B)  4.03*i  + 2.16*2  = -4.61 
6.21*i  + 3.35*2  = -7.19 


20.2  Linear  Systems:  LU-Factorization, 
Matrix  Inversion 


We  continue  our  discussion  of  numeric  methods  for  solving  linear  systems  of  n equations 
in  n unknowns  X\,  ■ ■ • , xn, 

(1)  Ax  = b 

where  A = [o^ J is  the  n X n given  coefficient  matrix  and  xT  = [x1;  • • ■ , xn]  and 
bT  = , bn].  We  present  three  related  methods  that  are  modifications  of  the  Gauss 
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EXAMPLE  1 


elimination,  which  require  fewer  arithmetic  operations.  They  are  named  after  Doolittle, 
Crout,  and  Cholesky  and  use  the  idea  of  the  LU-factorization  of  A,  which  we  explain 
first. 

An  LU-factorization  of  a given  square  matrix  A is  of  the  form 
(2)  A = LU 

where  L is  lower  triangular  and  U is  upper  triangular.  For  example. 


’2  3' 

= LU  = 

1 

o' 

’2 

3' 

8 5 

4 

1 

0 

-7 

It  can  be  proved  that  for  any  nonsingular  matrix  (see  Sec.  7.8)  the  rows  can  be  reordered 
so  that  the  resulting  matrix  A has  an  LU-factorization  (2)  in  which  L turns  out  to  be  the 
matrix  of  the  multipliers  m .jk  of  the  Gauss  elimination,  with  main  diagonal  1 , • • • , 1 , and 
U is  the  matrix  of  the  triangular  system  at  the  end  of  the  Gauss  elimination.  (See  Ref. 
[E5],  pp.  155-156,  listed  in  App.  1.) 

The  crucial  idea  now  is  that  L and  U in  (2)  can  be  computed  directly,  without  solving 
simultaneous  equations  (thus,  without  using  the  Gauss  elimination).  As  a count  shows, 
this  needs  about  «3/ 3 operations,  about  half  as  many  as  the  Gauss  elimination,  which 
needs  about  2«3/3  (see  Sec.  20.1).  And  once  we  have  (2),  we  can  use  it  for  solving  Ax  = b 
in  two  steps,  involving  only  about  nz  operations,  simply  by  noting  that  Ax  = LUx  = b 
may  be  written 

(3)  (a)  Ly  = b where  (b)  Ux  = y 

and  solving  first  (3a)  for  y and  then  (3b)  for  x.  Here  we  can  require  that  L have  main 
diagonal  1,  • • • , 1 as  stated  before;  then  this  is  called  Doolittle’s  method.1  Both  systems 
(3a)  and  (3b)  are  triangular,  so  we  can  solve  them  as  in  the  back  substitution  for  the  Gauss 
elimination. 

A similar  method,  Crout’s  method,2  is  obtained  from  (2)  if  U (instead  of  L)  is  required 
to  have  main  diagonal  1,  • • ■ , 1.  In  either  case  the  factorization  (2)  is  unique. 


Doolittle’s  Method 

Solve  the  system  in  Example  1 of  Sec.  20.1  by  Doolittle’s  method. 
Solution.  The  decomposition  (2)  is  obtained  from 


aii 

al2 

a13 

"3 

5 

2 

1 

0 

0 

»11 

"12 

“13 

a21 

a22 

a23 

= 

0 

8 

2 

= 

>»21 

1 

0 

0 

11 22 

"23 

_a31 

a32 

a33_ 

_6 

2 

8_ 

m 31 

>"32 

1 

_0 

0 

u33_ 

^YRICK  H.  DOOLITTLE  (1830-1913).  American  mathematician  employed  by  the  U.S.  Coast  and  Geodetic 
Survey  Office.  His  method  appeared  in  U.S.  Coast  and  Geodetic  Survey,  1878,  115-120. 

2PRESCOTT  DURAND  CROUT  (1907-1984),  American  mathematician,  professor  at  MIT,  also  worked  at 
General  Electric. 
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by  determining  the  and  Uj^,  using  matrix  multiplication.  By  going  through  A row  by  row  we  get  successively 


°11  “3  — 1 ■ Mil  — “11 

a12 

— 5 — 1 * Ui2  — «12 

eo 

a 

II 

(N 

II 

eo 

a 

“ “13 

o2!  = 0 = m21»n 

a22 

= 8 = m21u12  + u22 

a23  = 2 = m21u13 

+ »23 

m2i  = 0 

U22  ~ 8 

u 23  = 2 

a3i  = 6 = m3i»n 

a32 

= 2 = m 3ii(  12  + m32u22 

a33  = 8 = m31u13 

+ m32U23  + «33 

= '«3i  • 3 

= 2 • 5 + m32  ■ 8 

1 

(N 

(N 

II 

1 ’ 2 + U33 

m31  = 2 

m32  = -1 

u33  = 6 

Thus  the  factorization  (2)  is 


3 5 2~ 

1 0 0 

’352 

0 8 2 

= LU  = 

0 1 0 

0 8 2 

1 

ON 

to 

00 

1 

2 -t  1 

1 

O 

O 

We  first  solve  Ly  = b,  determining  vy  = 8,  then  y2  = —7,  then  y3  from  2\y  — y2  + y3  = 16  + 7 + y3  = 26; 
thus  (note  the  interchange  in  b because  of  the  interchange  in  A!) 


~1  0 0" 

>'1 

8~ 

8~ 

0 1 0 

yz 

= 

-7 

Solution  y = 

-7 

2 -1  1 

26_ 

3 

Then  we  solve  Ux  = y,  determining  x3  = § then  .t2,  then  x\,  that  is, 


3 5 2 

Xi 

8~ 

4 

0 8 2 

x 2 

= 

-7 

Solution  x = 

-1 

1 

O 

O 

ON 

1 

* 3_ 

3_ 

1 

2 _ 

This  agrees  with  the  solution  in  Example  1 of  Sec.  20. 1 . 


Our  formulas  in  Example  1 suggest  that  for  general  n the  entries  of  the  matrices  L = [ /«,;,] 
(with  main  diagonal  1 , • • • , 1 and  ntjk  suggesting  “multiplier”)  and  U = [iijk]  in  the 
Doolittle  method  are  computed  from 


Mi k ~ «i k 
Ojl 


M11 


(4)  _ _ ^ 

Ujk  ajk  mjs^sk 


s= 1 


mjk  Ufcfc  l 21  mjsusk 


k- 1 

2 

s= 1 


k = 1, • • • , n 
j = 2,  ■ ■ • , n 

k=j,-",n;  j§2 

j = k + 1,  • • • , n\  tg2. 


SEC.  20.2  Linear  Systems:  LU-Factorization,  Matrix  Inversion 


855 


EXAMPLE  2 


Row  Interchanges.  Matrices,  such  as 


"o 

1 

0 

l" 

or 

1 

1 

1 

0 

have  no  LU-factorization  (try!).  This  indicates  that  for  obtaining  an  LU-factorization,  row 
interchanges  of  A (and  corresponding  interchanges  in  b)  may  be  necessary. 

Cholesky’s  Method 

For  a symmetric,  positive  definite  matrix  A (thus  A = AT,  xTAx  > 0 for  all  x =£  0)  we 
can  in  (2)  even  choose  U = LT,  thus  uj ^ = m (but  cannot  impose  conditions  on  the 
main  diagonal  entries).  For  example, 


4 2 14 

” 2 0 0 

” 2 1 7 

LA 

> 

II 

2 17  -5 

= LLt  = 

1 4 0 

0 4-3 

14  -5  83 

7-3  5 

0 0 5 

The  popular  method  of  solving  Ax  = b based  on  this  factorization  A = LLT  is  called 
Cholesky’s  method.3  In  terms  of  the  entries  of  L = [l^]  the  formulas  for  the  factorization 
are 

In  = Vfln 


ciji 


If  A is  symmetric  but  not  positive  definite,  this  method  could  still  be  applied,  but  then 
leads  to  a complex  matrix  L,  so  that  the  method  becomes  impractical. 

Cholesky’s  Method 

Solve  by  Cholesky’s  method: 

4x  i + 2x2  + 14^3  = 14 

2*i  + 17x2  — 5x3  = -101 
14*i  — 5x2  + 83x3  = 155. 


3ANDRE-LOUIS  CHOLESKY  (1875-1918),  French  military  officer,  geodecist,  and  mathematician.  Surveyed 
Crete  and  North  Africa.  Died  in  World  War  I.  His  method  was  published  posthumously  in  Bulletin  Geodesique 
in  1924  but  received  little  attention  until  JOHN  TODD  (1911-2007)  — Irish- American  mathematician,  numerical 
analysist,  and  early  pioneer  of  computer  methods  in  numerics,  professor  at  Caltech,  and  close  personal  friend 
and  collaborator  of  ERWIN  KREYSZIG,  see  [E20] — taught  Cholesky’s  method  in  his  analysis  course  at  King’s 
College,  London,  in  the  1940s. 
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PROOF 


Solution.  From  (6)  or  from  the  form  of  the  factorization 


4 

2 

14~ 

"/ll 

0 

0 

hi 

hi 

hi 

2 

17 

-5 

= 

hi 

1 22 

0 

0 

h2 

h2 

14 

-5 

83_ 

hi 

h2 

1 33_ 

_0 

0 

h2_ 

we  compute,  in  the  given  order, 


. «21  2 «31  14 

/n  - Von  - 2 hi  ~ ~ - ~ - 1 hi  ~ ~ ~ 7 

hi  2 hi  2 

h 2 = V^/22  — 1 21  = Vl7  — 1 = 4 

h2  = T1-  («32  - hihi)  = 7 (-5  - 7 ■ 1)  = -3 
l23  4 

/33  = Va33  - /§!  - /§2  = V83  - 72  - f-3)2  = 5. 

This  agrees  with  (5).  We  now  have  to  solve  Ly  = b,  that  is. 


yi 

14 

7 

y2 

= 

-101 

Solution  y = 

-27 

Jk 

155 

5 

As  the  second  step,  we  have  to  solve  Ux  = LTx  = y,  that  is, 


2 1 7 

Xi 

7 

3" 

0 4-3 

*2 

= 

-27 

Solution  x = 

-6 

_0  0 5_ 

_*3_ 

5_ 

1 

Stability  of  the  Cholesky  Factorization 

The  Cholesky  Ll7  -factorization  is  numerically  stable  (as  defined  in  Sec.  19.1). 


2 2 2 r 

We  have  cijj  = Iji  + l j2  + ■ ■ • + Ijj  by  squaring  the  third  formula  in  (6)  and  solving  it 
for  djj.  Hence  for  all  Ijp  (note  that  l:jp-  = 0 for  k > j ) we  obtain  (the  inequality  being  trivial) 

tjk  = tj\  ' lj2  ' ‘ ' ’ ' ‘'jj  "jjm 


That  is,  ifp-  is  bounded  by  an  entry  of  A,  which  means  stability  against  rounding. 


Gauss-Jordan  Elimination.  Matrix  Inversion 

Another  variant  of  the  Gauss  elimination  is  the  Gauss-Jordan  elimination,  introduced 
by  W.  Jordan  in  1920,  in  which  back  substitution  is  avoided  by  additional  computations 
that  reduce  the  matrix  to  diagonal  form,  instead  of  the  triangular  form  in  the  Gauss 
elimination.  But  this  reduction  from  the  Gauss  triangular  to  the  diagonal  form  requires 
more  operations  than  back  substitution  does,  so  that  the  method  is  disadvantageous  for 
solving  systems  Ax  = b.  But  it  may  be  used  for  matrix  inversion,  where  the  situation  is 
as  follows. 
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The  inverse  of  a nonsingular  square  matrix  A may  be  determined  in  principle  by  solving 
the  n systems 

(7)  Ax  = bj  O'  = 1,  ■■•,«) 

where  bj  is  the  /th  column  of  the  n X n unit  matrix. 

However,  it  is  preferable  to  produce  A-1  by  operating  on  the  unit  matrix  I in  the  same 
way  as  the  Gauss-Jordan  algorithm,  reducing  A to  I.  A typical  illustrative  example  of  this 
method  is  given  in  Sec.  7.8. 


TO0=B^E=W^E:T=gFff=g 


1-5  DOOLITTLE’S  METHOD 

Show  the  factorization  and  solve  by  Doolittle’s  method. 

1.  4.x1  + 5x2  = 14 
12.X!  + 14x2  = 36 

2.  2x1  + 9x2  = 82 

3Xl  — 5x2  = —62 


3.  5x!  + 4x2  + x3  = 6.8 
Kkj  + 9x2  + 4x3  = 17.6 
lOx  i + 13x2  + 15x3  = 38.4 

4.  2xi  + x2  + 2x3  = 0 
— 2xi  + 2x2  + x3  = 0 

Xi  + 2x2  — 2x3  = 18 

5.  3x!  + 9x2  + 6x3  = 4.6 
18xj  + 48x2  + 39x3  = 27.2 


9xi  — 27x2  + 42x3  = 9.0 

6.  TEAM  PROJECT.  Crout’s  method  factorizes 
A = LU,  where  L is  lower  triangular  and  U is  upper 
triangular  with  diagonal  entries  Ujj  = 1 ,j  = 1,  ■ ■ ■ , n. 

(a)  Formulas.  Obtain  formulas  for  Crout’s  method 
similar  to  (4). 

(b)  Examples.  Solve  Prob.  5 by  Crout’s  method. 

(c)  Factor  the  following  matrix  by  the  Doolittle, 
Crout,  and  Cholesky  methods. 

1 -4  2 

-4  25  4 

2 4 24_ 

(d)  Give  the  formulas  for  factoring  a tridiagonal 
matrix  by  Crout’s  method. 


(e)  When  can  you  obtain  Crout’s  factorization  from 
Doolittle’s  by  transposition? 


7-12 


CHOLESKY’S  METHOD 


Show  the  factorization  and  solve. 

7.  9xi  + 6x2  + 12x3  = 17.4 
6xi  + 13x2  + 1 lx3  = 23.6 

12xi  + 1 l.v2  + 26x3  = 30.8 

8.  4xi  + 6x2  + 8x3  = 0 

6xi  + 34x2  + 52x3  = —160 
8xi  + 52x2  + 129x3  = —452 

9.  0.0 lx i + 0.03x3  = 0.14 


0.16x2  + 0.08x3  = 0.16 


0.03xi  + 0.08x2  + 0.14x3  = 0.54 
10.  4xi  + 2x3  = 1.5 
4x2  + x3  = 4.0 


2xi  + x2  + 2x3  = 2.5 
11.  xi  — x2  + 3x3  + 2x4  = 15 


— xi  + 5x2  — 5x3  — 2x4  = —35 


3xi  — 5x2  + 19x3  + 3x4  = 94 

2xj  — 2x2  + 3x3  + 2I.X4  = 1 

12.  4xi  + 2x2  + 4x3  = 20 

2xi  + 2x2  + 3x3  + 2x4  = 36 
4xi  + 3x2  + 6x3  + 3x4  = 60 


2x2  + 3x3  + 9x4  = 122 

13.  Definiteness.  Let  A,  B be  n X n and  positive  definite. 
Are  —A,  AT,  A + B,  A — B positive  definite? 
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14.  CAS  PROJECT.  Cholesky’s  Method,  (a)  Write  a 
program  for  solving  linear  systems  by  Cholesky’s 
method  and  apply  it  to  Example  2 in  the  text,  to  Probs. 
7-9,  and  to  systems  of  your  choice. 


(b)  Splines.  Apply  the  factorization  part  of  the 
program  to  the  following  matrices  (as  they  occur  in 
(9),  Sec.  19.4  (with  Cj  = 1),  in  connection  with 
splines). 


2 

1 

0 


1 

4 

1 


0 

1 

2 


2 

1 

0 

0 


1 

4 

1 

0 


0 

1 

4 

1 


0 

0 

1 

2 


15-19 


INVERSE 


Find  the  inverse  by  the  Gauss-Jordan  method,  showing  the 
details. 


15.  In  Prob.  1 16.  In  Prob.  4 

17.  In  Team  Project  6(c)  18.  In  Prob.  9 

19.  In  Prob.  12 

20.  Rounding.  For  the  following  matrix  A find  det  A. 
What  happens  if  you  roundoff  the  given  entries  to 
(a)  5S,  (b)  4S,  (c)  3S,  (d)  2S,  (e)  IS?  What  is  the 
practical  implication  of  your  work? 


_JL  13 
63  28  49 


20.3  Linear  Systems:  Solution  by  Iteration 


The  Gauss  elimination  and  its  variants  in  the  last  two  sections  belong  to  the  direct  methods 
for  solving  linear  systems  of  equations;  these  are  methods  that  give  solutions  after  an 
amount  of  computation  that  can  be  specified  in  advance.  In  contrast,  in  an  indirect  or 
iterative  method  we  start  from  an  approximation  to  the  true  solution  and,  if  successful, 
obtain  better  and  better  approximations  from  a computational  cycle  repeated  as  often  as 
may  be  necessary  for  achieving  a required  accuracy,  so  that  the  amount  of  arithmetic 
depends  upon  the  accuracy  required  and  varies  from  case  to  case. 

We  apply  iterative  methods  if  the  convergence  is  rapid  (if  matrices  have  large  main 
diagonal  entries,  as  we  shall  see),  so  that  we  save  operations  compared  to  a direct  method. 
We  also  use  iterative  methods  if  a large  system  is  sparse,  that  is,  has  very  many  zero 
coefficients,  so  that  one  would  waste  space  in  storing  zeros,  for  instance,  9995  zeros  per 
equation  in  a potential  problem  of  104  equations  in  104  unknowns  with  typically  only  5 
nonzero  terms  per  equation  (more  on  this  in  Sec.  21.4). 


Gauss-Seidel  Iteration  Method4 

This  is  an  iterative  method  of  great  practical  importance,  which  we  can  simply  explain  in 
terms  of  an  example. 


EXAMPLE  Gauss-Seidel  Iteration 

We  consider  the  linear  system 


(1) 


*i  — 0.25x2  ~ 0.25^3  = 50 

— 0.25xi  + *2  — 0.25*4  = 50 

—0.25*i  + *3  — 0.25*4  = 25 

— 0.25*2  — 0.25*3  + *4  — 25. 


4PHILIPP  LUDWIG  VON  SEIDEL  (1821-1896),  German  mathematician.  For  Gauss  see  footnote  5 in 
Sec.  5.4. 
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(Equations  of  this  form  arise  in  the  numeric  solution  of  PDEs  and  in  spline  interpolation.)  We  write  the  system 
in  the  form 

*i  — 0.25x2  + 0.25*3  + 50 

*2  — 0.25*!  + 0.25*4  + 50 

(2) 

*3  = 0.25*i  + 0.25*4  + 25 

*4  = 0.25*2  + 0.25*3  + 25. 

These  equations  are  now  used  for  iteration;  that  is,  we  start  from  a (possibly  poor)  approximation  to  the  solution, 
say  *i0)  = 100,  *20)  — 100,  *30)  = 100,  *4>)  = 100,  and  compute  from  (2)  a perhaps  better  approximation 


+ 50.00  = 100.00 
+ 50.00  = 100.00 
+ 25.00  = 75.00 
+ 25.00  = 68.75 


Use  “old”  values 

(“New”  values  here  not  yet  available) 


(3) 


x™  -- 

rd) . 
x2  - 

r(l) - 
x3  - 

r(l)  . 


0.25x™  + 

0.254® 

0.25X*1* 

0.254® 

0.25x^ 

0.254® 

0.2541’  + 

0.25411 

Use  “new”  values 


These  equations  (3)  are  obtained  from  (2)  by  substituting  on  the  right  the  most  recent  approximation  for  each 
unknown.  In  fact,  corresponding  values  replace  previous  ones  as  soon  as  they  have  been  computed,  so  that  in 
the  second  and  third  equations  we  use  *i1}  (not  *i0)),  and  in  the  last  equation  of  (3)  we  use  xgP  and  *31)  (not 
*20)  and  *30)).  Using  the  same  principle,  we  obtain  in  the  next  step 

0.25x21)  + 0.25*§°  + 50.00  = 93.750 

0.25x‘i2)  + 0.25x4  J + 50.00  = 90.625 

0.25* i2)  + 0.25x4  ’ + 25.00  = 65.625 

0.25x1°  + 0.25x32>  + 25.00  = 64.062 

Further  steps  give  the  values 


r(2)  _ 
*1  — 


r(2)  _ 
x2  ~ 


r(2)  _ 
*3  — 


r(2)  _ 
*4  — 


X1 

*2 

*3 

*4 

89.062 

88.281 

63.281 

62.891 

87.891 

87.695 

62.695 

62.598 

87.598 

87.549 

62.549 

62.524 

87.524 

87.512 

62.512 

62.506 

87.506 

87.503 

62.503 

62.502 

Hence  convergence  to  the  exact  solution  X\  = *2  — 87.5,  *3  = *4  — 62.5  (verify!)  seems  rather  fast. 


An  algorithm  for  the  Gauss-Seidel  iteration  is  shown  in  Table  20.2.  To  obtain  the 
algorithm,  let  us  derive  the  general  formulas  for  this  iteration. 

We  assume  that  ajj  = 1 for  j = 1,  • • • , n.  (Note  that  this  can  be  achieved  if  we  can 
rearrange  the  equations  so  that  no  diagonal  coefficient  is  zero;  then  we  may  divide  each 
equation  by  the  corresponding  diagonal  coefficient.)  We  now  write 
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(4)  A = I + L + U (ajj  = 1) 

where  I is  the  n X n unit  matrix  and  L and  U are,  respectively,  lower  and  upper  triangular 
matrices  with  zero  main  diagonals.  If  we  substitute  (4)  into  Ax  = b,  we  have 

Ax  = (I  + L + U)x  = b. 

Taking  Lx  and  Ux  to  the  right,  we  obtain,  since  lx  = x, 

(5)  x = b — Lx  — Ux. 

Remembering  from  (3)  in  Example  1 that  below  the  main  diagonal  we  took  “new” 
approximations  and  above  the  main  diagonal  “old”  ones,  we  obtain  from  (5)  the  desired 
iteration  formulas 

“New”  “Old” 

i i 

(6)  x(m+1)  = b - Lx(m+1)  - Ux(m)  (ajj  = 1) 


where  x(m)  = [xjm)]  is  the  m\h  approximation  and  x(m+1)  = [xjm+1)]  is  the  (m  + l)st 
approximation.  In  components  this  gives  the  formula  in  line  1 in  Table  20.2.  The  matrix 
A must  satisfy  a:D  0 for  all  j.  In  Table  20.2  our  assumption  arj  = 1 is  no  longer  required, 

but  is  automatically  taken  care  of  by  the  factor  1 / a jj  in  line  1 . 


Table  20.2  Gauss-Seidel  Iteration 


ALGORITHM  GAUSS-SEIDEL  (A,  b,  x(0),  e,  N) 

This  algorithm  computes  a solution  x of  the  system  Ax  = b given  an  initial  approximation 
x<0),  where  A = [aj/c]  is  an  n X n matrix  with  J=  0,  j = 1,  • • • , n. 

INPUT:  A,  b,  initial  approximation  x(0),  tolerance  e > 0,  maximum  number 

of  iterations  N 

OUTPUT:  Approximate  solution  x(m)  = [xjm)]  or  failure  message  that  x(m  does 

not  satisfy  the  tolerance  condition 


For  m = 0,  • • • , N — 1,  do: 
For  j = 1,  • • • , n,  do: 


(m+1)  _ 


i 


j-i 


= - U - 2 ajkx/r+V  - 2 Ojkx £ 


(m) 


k= 1 


k=j+l 


End 

If  max  |x‘m+1) 


- xf0]  < e |xf+1)|  then  OUTPUT  xCm+1).  Stop 


[Procedure  completed  successfully] 


End 


OUTPUT:  “No  solution  satisfying  the  tolerance  condition  obtained  after  N 

iteration  steps.”  Stop 
[Procedure  completed  unsuccessfully ] 

End  GAUSS-SEIDEL 
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Convergence  and  Matrix  Norms 

An  iteration  method  for  solving  Ax  = b is  said  to  converge  for  an  initial  x(0>  if  the 
corresponding  iterative  sequence  xc0),  x(1),  x(2),  • • • converges  to  a solution  of  the  given 
system.  Convergence  depends  on  the  relation  between  x(m)  and  x(m+1>.  To  get  this  relation 
for  the  Gauss-Seidel  method,  we  use  (6).  We  first  have 

(I  + L)  x(m+1)  = b - Ux(m) 

and  by  multiplying  by  (I  + L)-1  from  the  left, 

(7)  x(m+1)  = Cx(m)  + (I  + L)_1b  where  C=-(I  + L)_1U. 

The  Gauss-Seidel  iteration  converges  for  every  x<0)  if  and  only  if  all  the  eigenvalues 
(Sec.  8.1)  of  the  “iteration  matrix”  C = [cpj  have  absolute  value  less  than  1.  (Proof  in 
Ref.  [E5],  p.  191,  listed  in  App.  1.) 

CAUTION ! If  you  want  to  get  C,  first  divide  the  rows  of  A by  aL1  to  have  main  diagonal 
1,  • ■ • , 1.  If  the  spectral  radius  of  C (=  maximum  of  those  absolute  values)  is  small,  then 
the  convergence  is  rapid. 

Sufficient  Convergence  Condition.  A sufficient  condition  for  convergence  is 

(8)  ||C  ||  < 1. 

Here  ||C||  is  some  matrix  norm,  such  as 


/ n n 

(9)  \\C\\  = /jr  jr  Cjl  (Frobenius  norm) 

^ j=ifc  = i 

or  the  greatest  of  the  sums  of  the  | c,jp  | in  a column  of  C 

n 

(10)  ||C||  = max  ^ cjk  (Column  “sum”  norm) 

* j=  i 

or  the  greatest  of  the  sums  of  the  | CpJ  in  a row  of  C 

n 

(11)  ||C  ||  = max  ^ I Cjk  I (Row  “sum”  norm). 

' fc  = i 

These  are  the  most  frequently  used  matrix  norms  in  numerics. 

In  most  cases  the  choice  of  one  of  these  norms  is  a matter  of  computational  convenience. 
However,  the  following  example  shows  that  sometimes  one  of  these  norms  is  preferable 
to  the  others. 
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EXAMPLE  2 


Test  of  Convergence  of  the  Gauss-Seidel  Iteration 

Test  whether  the  Gauss-Seidel  iteration  converges  for  the  system 
2x  + y + z = 4 
x + 2y  + z — 4 written 

x + y + 2z  = 4 

Solution.  The  decomposition  (multiply  the  matrix  by  ^ - why?)  is 


x = 2 2,y  — 2 Z 

y = 2 — \x  — \z 
z — 2 — \x  — \y. 


r i I 11 

1 2 2 

"0  0 (f 

G)  1 !l 

u 2 2 

1 l I 

2 1 2 

=I+L+U=I+ 

z 0 0 

+ 

o 0 \ 

I 1 1 

|_2  2 

J 1 0_ 

1 

o 

O 

O 

1 

It  shows  that 


C = 


-(I  + L)-1  U - - 


1 0 0 

~0  z z" 

"o  -1  -f 

2 1 0 

0 0 | 

= 

O 

WH 

1 

WH 

L 4 2 

1 

o 

o 

O 

1 

0 I 3 

Lu  8 8_| 

We  compute  the  Frobenius  norm  of  C 

lie  II  = (!  + ! + ^+  ^ + ^ + i)1/2  = (i)1/2  = 0.884  < 1 

and  conclude  from  (8)  that  this  Gauss-Seidel  iteration  converges.  It  is  interesting  that  the  other  two  norms  would 
permit  no  conclusion,  as  you  should  verify.  Of  course,  this  points  to  the  fact  that  (8)  is  sufficient  for  convergence 
rather  than  necessary. 


Residual.  Given  a system  Ax  = b,  the  residual  r of  x with  respect  to  this  system  is 
defined  by 

(12)  r = b — Ax. 

Clearly,  r = 0 if  and  only  if  x is  a solution.  Hence  r # 0 for  an  approximate  solution.  In 
the  Gauss-Seidel  iteration,  at  each  stage  we  modify  or  relax  a component  of  an 
approximate  solution  in  order  to  reduce  a component  of  r to  zero.  Hence  the  Gauss-Seidel 
iteration  belongs  to  a class  of  methods  often  called  relaxation  methods.  More  about  the 
residual  follows  in  the  next  section. 

Jacobi  Iteration 

The  Gauss-Seidel  iteration  is  a method  of  successive  corrections  because  for  each 
component  we  successively  replace  an  approximation  of  a component  by  a corresponding 
new  approximation  as  soon  as  the  latter  has  been  computed.  An  iteration  method  is  called 
a method  of  simultaneous  corrections  if  no  component  of  an  approximation  x(m)  is  used 
until  all  the  components  of  xCm)  have  been  computed.  A method  of  this  type  is  the  Jacobi 
iteration,  which  is  similar  to  the  Gauss-Seidel  iteration  but  involves  not  using  improved 
values  until  a step  has  been  completed  and  then  replacing  x(m)  by  x(  m ' 1 ' at  once,  directly 
before  the  beginning  of  the  next  step.  Hence  if  we  write  Ax  = b ( with  ajj  = 1 as  before!) 
in  the  form  x = b + (I  — A)x,  the  Jacobi  iteration  in  matrix  notation  is 

(13)  x(m+1)  = b + (I  - A)x(m)  (ajj  = 1). 
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This  method  converges  for  every  choice  of  x(0)  if  and  only  if  the  spectral  radius  of  I — A 
is  less  than  1 . It  has  recently  gained  greater  practical  interest  since  on  parallel  processors 
all  n equations  can  be  solved  simultaneously  at  each  iteration  step. 

For  Jacobi,  see  Sec.  10.3.  For  exercises,  see  the  problem  set. 


FR  OB  L£M=SET— 


1.  Verify  the  solution  in  Example  1 of  the  text. 

2.  Show  that  for  the  system  in  Example  2 the  Jacobi 
iteration  diverges.  Hint.  Use  eigenvalues. 

3.  Verify  the  claim  at  the  end  of  Example  2. 

GAUSS-SEIDEL  ITERATION 

Do  5 steps,  starting  from  xo  = [1  1 1]T  and  using  6S  in 

the  computation.  Hint.  Make  sure  that  you  solve  each  equation 
for  the  variable  that  has  the  largest  coefficient  (why?).  Show 
the  details. 

4.  4x1  — x2  =21 

-Xi  + 4x2  - x3  = -45 

- x2  + 4x3  = 33 

5.  IOjci  + x2  + x3  = 6 

Xi  + 10X2  + x3  = 6 

Xi  + x2  + 10x3  = 6 

6.  x2  + 7x3  = 25.5 

5xi  + x2  =0 

xi  + 6x2  + x3  = —10.5 


11.  Apply  the  Gauss-Seidel  iteration  (3  steps)  to  the  system 
in  Prob.  5,  starting  from  (a)  0,  0,  0 (b)  10,  10,  10. 
Compare  and  comment. 

12.  In  Prob.  5,  compute  C (a)  if  you  solve  the  first  equation 
for  X\,  the  second  for  X2,  the  third  for  x3,  proving 
convergence;  (b)  if  you  nonsensically  solve  the  third 
equation  forxi,  the  first  forx2,  the  second  forx3,  proving 
divergence. 

13.  CAS  Experiment.  Gauss-Seidel  Iteration,  (a)  Write 
a program  for  Gauss-Seidel  iteration. 

(b)  Apply  the  program  A (t)x  = b,  to  starting  from 
[0  0 Of,  where 


1 

t 

t 

2 

t 

1 

t 

, b = 

2 

t 

t 

1 

2 

For  t = 0.2,  0.5,  0.8,  0.9  determine  the  number  of 
steps  to  obtain  the  exact  solution  to  6S  and  the 
corresponding  spectral  radius  of  C.  Graph  the  number 
of  steps  and  the  spectral  radius  as  functions  of  t and 
comment. 


7.  5xj  — 2x2  = 18 

— 2xi  + 10x2  “ 2x3  = — 60 


(c)  Successive  overrelaxation  (SOR).  Show  that  by 

adding  and  subtracting  x(c) * * * * * * * * * (m)  on  the  right,  formula  (6) 

can  be  written 


— 2x2  + 15x3  = 128 
8.  3.x  x + 2x2  + x3  = 7 
Xi  + 3x2  + 2x3  = 4 
2xi  + x2  + 3x3  = 7 


x(m+l)  = x(m)  + b _ LxCm+l>  _ (U  + I)x<™> 

(°jj  — !)• 

Anticipation  of  further  corrections  motivates  the 

introduction  of  an  overrelaxation  factor  to  > 1 to  get 
the  SOR  formula  for  Gauss-Seidel 


9.  5x!  + x2  + 2x3  = 19 
Xi  + 4x2  — 2x3  = —2 
2xi  + 3x2  + 8x3  = 39 
10.  4.x ! + 5x3  = 12.5 

xi  + 6x2  + 2x3  = 18.5 

8xi  + 2x2  + x3  = — 11.5 


(14) 


x(m+l)  = x(m)  + w(b  _ Lx(m+1) 

- (U  + I)x(m))  ( Cljj  = 1) 


intended  to  give  more  rapid  convergence.  A rec- 
ommended value  is  co  = 2/(1  + Vl  — p),  where  p is 
the  spectral  radius  of  C in  (7).  Apply  SOR  to  the  matrix 
in  (b)  for  t = 0.5  and  0.8  and  notice  the  improvement  of 
convergence.  (Spectacular  gains  are  made  with  larger 

systems.) 
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14-17 


JACOBI  ITERATION 


Do  5 steps,  starting  from  x0  = [1  1 1],  Compare  with 

the  Gauss-Seidel  iteration.  Which  of  the  two  seems  to 
converge  faster?  Show  the  details  of  your  work. 


14.  The  system  in  Prob.  4 


15.  The  system  in  Prob.  9 


16.  The  system  in  Prob.  10 


17.  Show  convergence  in  Prob.  16  by  verifying  that  I — A, 
where  A is  the  matrix  in  Prob.  16  with  the  rows  divided 
by  the  corresponding  main  diagonal  entries,  has  the 
eigenvalues  —0.519589  and  0.259795  ± 0.2466031. 


18-20 


NORMS 


Compute  the  norms  (9),  (10),  (11)  for  the  following  (square) 
matrices.  Comment  on  the  reasons  for  greater  or  smaller 
differences  among  the  three  numbers. 


18.  The  matrix  in  Prob.  10 

19.  The  matrix  in  Prob.  5 


2k  —k  —k 


20. 


k -2k 


k 


—k  —k  2k 


20.4  Linear  Systems:  Ill-Conditioning,  Norms 

One  does  not  need  much  experience  to  observe  that  some  systems  Ax  = b are  good, 
giving  accurate  solutions  even  under  roundoff  or  coefficient  inaccuracies,  whereas  others 
are  bad,  so  that  these  inaccuracies  affect  the  solution  strongly.  We  want  to  see  what  is 
going  on  and  whether  or  not  we  can  “trust”  a linear  system.  Let  us  first  formulate  the  two 
relevant  concepts  (ill-  and  well-conditioned)  for  general  numeric  work  and  then  turn  to 
linear  systems  and  matrices. 

A computational  problem  is  called  ill-conditioned  (or  ill-posed)  if  “small”  changes  in 
the  data  (the  input)  cause  “large”  changes  in  the  solution  (the  output).  On  the  other  hand, 
a problem  is  called  well-conditioned  (or  well-posed)  if  “small”  changes  in  the  data  cause 
only  “small”  changes  in  the  solution. 

These  concepts  are  qualitative.  We  would  certainly  regard  a magnification  of  inaccuracies 
by  a factor  100  as  “large,”  but  could  debate  where  to  draw  the  line  between  “large”  and 
“small,”  depending  on  the  kind  of  problem  and  on  our  viewpoint.  Double  precision  may 
sometimes  help,  but  if  data  are  measured  inaccurately,  one  should  attempt  changing  the 
mathematical  setting  of  the  problem  to  a well-conditioned  one. 

Let  us  now  turn  to  linear  systems.  Figure  445  explains  that  ill-conditioning  occurs  if 
and  only  if  the  two  equations  give  two  nearly  parallel  lines,  so  that  their  intersection  point 
(the  solution  of  the  system)  moves  substantially  if  we  raise  or  lower  a line  just  a little. 
For  larger  systems  the  situation  is  similar  in  principle,  although  geometry  no  longer  helps. 
We  shall  see  that  we  may  regard  ill-conditioning  as  an  approach  to  singularity  of  the 
matrix. 


Fig.  445.  (a)  Well-conditioned  and  (b)  ill-conditioned 

linear  system  of  two  equations  in  two  unknowns 
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EXAMPLE  2 


An  Ill-Conditioned  System 

You  may  verify  that  the  system 

0.9999*  - l.OOOly  = 1 

* — y I 

has  the  solution  * = 0.5,  y = —0.5,  whereas  the  system 

0.9999*  - l.OOOly  = 1 

* — y = 1 + e 

has  the  solution  * = 0.5  + 5000. 5e,  y = —0.5  + 4999. 5e.  This  shows  that  the  system  is  ill-conditioned  because 
a change  on  the  right  of  magnitude  e produces  a change  in  the  solution  of  magnitude  5000e,  approximately. 
We  see  that  the  lines  given  by  the  equations  have  nearly  the  same  slope. 


Well-conditioning  can  be  asserted  if  the  main  diagonal  entries  of  A have  large  absolute 
values  compared  to  those  of  the  other  entries.  Similarly  if  A-1  and  A have  maximum 
entries  of  about  the  same  absolute  value. 

Ill-conditioning  is  indicated  if  A-1  has  entries  of  large  absolute  value  compared  to  those 
of  the  solution  (about  5000  in  Example  1)  and  if  poor  approximate  solutions  may  still 
produce  small  residuals. 

Residual.  The  residual  r of  an  approximate  solution  x of  Ax  = b is  defined  as 

(1)  r = b — Ax. 

Now  b = Ax,  so  that 

(2)  r = A(x  — Ax). 

Hence  r is  small  if  x has  high  accuracy,  but  the  converse  may  be  false: 

Inaccurate  Approximate  Solution  with  a Small  Residual 

The  system 


1.000  1a  i + a2  = 2.0001 

x ! + 1.0001*2  = 2.0001 

has  the  exact  solution  *i  = l,x2  = 1-  Can  you  see  this  by  inspection?  The  very  inaccurate  approximation 
*i  = 2.0000,  a 2 = 0.0001  has  the  very  small  residual  (to  4D) 


2.0001 

1.0001  1.0000 

2.0000 

2.0001 

2.0003 

-0.0002 

2.0001 

1.0000  1.0001 

0.0001 

2.0001 

2.0001 

0.0000 

From  this,  a naive  person  might  draw  the  false  conclusion  that  the  approximation  should  be  accurate  to  3 or  4 
decimals. 

Our  result  is  probably  unexpected,  but  we  shall  see  that  it  has  to  do  with  the  fact  that  the  system  is 
ill-conditioned. 


Our  goal  is  to  show  that  ill-conditioning  of  a linear  system  and  of  its  coefficient  matrix  A 
can  be  measured  by  a number,  the  condition  number  /c(A).  Other  measures  for  ill-conditioning 
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have  also  been  proposed,  but  k( A)  is  probably  the  most  widely  used  one.  k( A)  is  defined  in 
terms  of  norm,  a concept  of  great  general  interest  throughout  numerics  (and  in  modern 
mathematics  in  general!).  We  shall  reach  our  goal  in  three  steps,  discussing 

1.  Vector  norms 

2.  Matrix  norms 

3.  Condition  number  k of  a square  matrix 

Vector  Norms 

A vector  norm  for  column  vectors  x = \xj\  with  n components  (n  fixed)  is  a generalized 
length  or  distance.  It  is  denoted  by  ||x||  and  is  defined  by  four  properties  of  the  usual 
length  of  vectors  in  three-dimensional  space,  namely, 

x ||  is  a nonnegative  real  number, 
x ||  = 0 if  and  only  if  x = 0. 

&x||  = |^|  || x ||  for  all  k. 

(d)  ||  x + y ||  ^ ||  x ||  + ||y||  (Triangle  inequality). 

If  we  use  several  norms,  we  label  them  by  a subscript.  Most  important  in  connection  with 
computations  is  the  p-norm  defined  by 


(a) 

(b) 


(4) 

|x  |p  - (Uilp  + \x2 p + • • • ■ 

+ UJP)1/P 

where  p is  a 
third  norm,  | 

fixed  number  and  p § 1 . In  practice,  one 
| x ||oo  (the  latter  as  defined  below),  that  is. 

usually  takes  p = 1 or  2 and,  as  a 

(5) 

||x||i  = M + • • • + \xn\ 

(“/rnorm”) 

(6) 

II X ||2  = Vxi  + ■ • • + Xn 

(“Euclidean”  or  “/2-norm”) 

(7) 

||x||oo  = max  \xj\ 

(“/co-norm”). 

For  n = 3 the  /2-norm  is  the  usual  length  of  a vector  in  three-dimensional  space.  The 
/j-norm  and  /^-norm  are  generally  more  convenient  in  computation.  But  all  three  norms 
are  in  common  use. 

Vector  Norms 

IfxT  = [2  -3  0 1 -4],  then  II x Hi  = 10,  ||x||2  = V30,  Jx||„  = 4. 

In  three-dimensional  space,  two  points  with  position  vectors  x and  x have  distance  |x  — x| 
from  each  other.  For  a linear  system  Ax  = b,  this  suggests  that  we  take  ||x  — x||  as  a 
measure  of  inaccuracy  and  call  it  the  distance  between  an  exact  and  an  approximate 
solution,  or  the  error  of  x. 

Matrix  Norm 

If  A is  an  n X n matrix  and  x any  vector  with  n components,  then  Ax  is  a vector  with  n 
components.  We  now  take  a vector  norm  and  consider  ||x||  and  ||Ax||.  One  can  prove  (see 
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Ref.  [E17],  pp.  77,  92-93,  listed  in  App.  1)  that  there  is  a number  c (depending  on  A) 
such  that 

(8)  II  Ax  ||  g c||x||  for  all  x. 

Let  x A 0.  Then  ||x||  > 0 by  (3b)  and  division  gives  ||  Ax ||/||  x ||  = c.  We  obtain  the  smallest 
possible  c valid  for  all  x (A  0)  by  taking  the  maximum  on  the  left.  This  smallest  c is 
called  the  matrix  norm  of  A corresponding  to  the  vector  norm  we  picked  and  is  denoted 
by  ||  A ||.  Thus 


||  Ax  1 1 

(9)  ||A||  = max  — - (x  A 0), 

|| x || 

the  maximum  being  taken  over  all  x A 0.  Alternatively  [see  (c)  in  Team  Project  24], 

(10)  || A ||  = max  ||Ax||. 

M=i 

The  maximum  in  (10)  and  thus  also  in  (9)  exists.  And  the  name  “matrix  norm”  is 
justified  because  ||  A||  satisfies  (3)  with  x and  y replaced  by  A and  B.  (Proofs  in  Ref.  [E17] 
pp.  77,  92-93.) 

Note  carefully  that  ||  A||  depends  on  the  vector  norm  that  we  selected.  In  particular,  one 
can  show  that 

for  the  /j-norm  (5)  one  gets  the  column  “sum”  norm  (10),  Sec.  20.3, 
for  the  /co-norm  (7)  one  gets  the  row  “sum”  norm  (11),  Sec.  20.3. 

By  taking  our  best  possible  (our  smallest)  c = ||  A ||  we  have  from  (8) 

(11)  ||  Ax ||  g ||  A ||  || x ||  . 

This  is  the  formula  we  shall  need.  Formula  (9)  also  implies  for  two  n X n matrices  (see 
Ref.  [E17],  p.  98) 

(12)  ||  AB||  g ||  A||  ||B||,  thus  ||  An||  g ||  Af  . 

See  Refs.  [E9]  and  [E17]  for  other  useful  formulas  on  norms. 

Before  we  go  on,  let  us  do  a simple  illustrative  computation. 


Matrix  Norms 

Compute  the  matrix  norms  of  the  coefficient  matrix  A in  Example  1 and  of  its  inverse  A-1,  assuming  that  we 
use  (a)  the  ^ -vector  norm,  (b)  the  /^-vector  norm. 

Solution.  We  use  (4*),  Sec.  7.8,  for  the  inverse  and  then  (10)  and  (11)  in  Sec.  20.3.  Thus 


"0.9999 

- 1.000  r 

”—5000.0 

5000.5” 

A = 

-1.0000 

-1.0000- 

A-1  = 

_— 5000.0 

4999.5- 

(a)  The  /] -vector  norm  gives  the  column  “sum”  norm  (10),  Sec.  20.3;  from  Column  2 we  thus  obtain 
|| A ||  = | — 1.0001 1 + | — 1 .0000 1 = 2.0001.  Similarly,  ||A_1||  = 10,000. 
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(b)  The  /, .-vector  norm  gives  the  row  “sum”  norm  (11),  Sec.  20.3;  thus  ||A||  = 2,  ||A_1||  = 10000.5  from 
Row  1.  We  notice  that  ||A-1||  is  surprisingly  large,  which  makes  the  product  ||A||  ||A-1|[  large  (20,001).  We  shall 
see  below  that  this  is  typical  of  an  ill-conditioned  system. 

Condition  Number  of  a Matrix 

We  are  now  ready  to  introduce  the  key  concept  in  our  discussion  of  ill-conditioning,  the 
condition  number  k(A)  of  a (nonsingular)  square  matrix  A,  defined  by 

(13)  k(A)  = ||  A.  ||  1 1 A- 1 1 1 . 

The  role  of  the  condition  number  is  seen  from  the  following  theorem. 


Condition  Number 

A linear  system  of  equations  Ax  = b and  its  matrix  A whose  condition  number  (13) 
is  small  are  well-conditioned.  A large  condition  number  indicates  ill-conditioning. 


b = Ax  and  (11)  give  ||bj|  Si  ||  A|  ||x||.  Let  b =£  0 and  x # 0.  Then  division  by  ||b||  ||x 
gives 


Multiplying  (2)  r = A(x  — x)  by  A 1 from  the  left  and  interchanging  sides,  we  have 
x — x = A-1r.  Now  (11)  with  A-1  and  r instead  of  A and  x yields 

|| x — x ||  = ||A_1r||  Si  ||A_1||||r||  . 


Division  by  ||x||  [note  that  ||x||  0 by  (3b)]  and  use  of  (14)  finally  gives 


Hence  if  k(A)  is  small,  a small  ||r||/||b||  implies  a small  relative  error  ||x  — x||/||x||,  so 
that  the  system  is  well-conditioned.  However,  this  does  not  hold  if  k( A)  is  large;  then  a 
small  ||r||/||b||  does  not  necessarily  imply  a small  relative  error  ||x  — x||/||x||. 


Condition  Numbers.  Gauss-Seidel  Iteration 


~5 

1 

1 

~ 12 

-2 

— 2 

A = 

1 

4 

2 

, 1 

has  the  inverse  A = — 

-2 

19 

-9 

56 

1 

2 

4 

-2 

-9 

19 

Since  A is  symmetric,  (10)  and  (11)  in  Sec.  20.3  give  the  same  condition  number 

k(A)  = || A ||  || A-1 1|  = 7 ■ bb  -30  = 3.75. 

We  see  that  a linear  system  Ax  = b with  this  A is  well-conditioned. 
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For  instance,  if  b = [14  0 28]T,  the  Gauss  algorithm  gives  the  solution  x = [2  —5  9]T,  (confirm 

this).  Since  the  main  diagonal  entries  of  A are  relatively  large,  we  can  expect  reasonably  good  convergence  of 
the  Gauss-Seidel  iteration.  Indeed,  starting  from,  say,  Xq  = [1  1 1]T,  we  obtain  the  first  8 steps  (3D  values) 


Xi 

% 

*3 

1.000 

1.000 

1.000 

2.400 

-1.100 

6.950 

1.630 

-3.882 

8.534 

1.870 

-4.734 

8.900 

1.967 

-4.942 

8.979 

1.993 

-4.988 

8.996 

1.998 

-4.997 

8.999 

2.000 

-5.000 

9.000 

2.000 

-5.000 

9.000 

lU-Conditioned  Linear  System 

Example  4 gives  by  (10)  or  (11),  Sec.  20.3,  for  the  matrix  in  Example  1 the  very  large  condition  number 
/c(A)  = 2.0001  • 10000  = 2 • 10000.5  = 200001.  This  confirms  that  the  system  is  very  ill-conditioned. 
Similarly  in  Example  2,  where  by  (4*),  Sec.  7.8  and  6D-computation, 


1 

1.0001 

-1.0000 

5000.5 

-5.000.0 

0.0002 

-1.0000 

1.0001 

-5000.0 

5000.5 

so  that  (10),  Sec.  20.3,  gives  a very  large  k(A),  explaining  the  surprising  result  in  Example  2, 

k(  A)  = (1.0001  + 1.0000)(5000.5  + 5000.0)  = 20,002. 

In  practice.  A-1  will  not  be  known,  so  that  in  computing  the  condition  number  k( A),  one 
must  estimate  ||  A-1||.  A method  for  this  (proposed  in  1979)  is  explained  in  Ref.  [E9]  listed 
in  App.  1. 

Inaccurate  Matrix  Entries.  k( A)  can  be  used  for  estimating  the  effect  8x  of  an  inaccuracy 
<5A  of  A (errors  of  measurements  of  the  ajk,  for  instance).  Instead  of  Ax  = b we  then  have 

(A  + SA)(x  + Sx)  = b. 

Multiplying  out  and  subtracting  Ax  = b on  both  sides,  we  obtain 

A5x  + 5A(x  + Sx)  = 0. 

Multiplication  by  A-1  from  the  left  and  taking  the  second  term  to  the  right  gives 

8x  = — A_15A(x  + <5x). 

Applying  (11)  with  A-1  and  vector  <5A(x  + Sx)  instead  of  A and  x,  we  get 
||  Sx ||  = ||A_1SA(x  + <5x)||  Si  ||  A-1 1|  ||SA(x  + Sx)|| . 


Applying  (11)  on  the  right,  with  <5  A and  x — 8x  instead  of  A and  x,  we  obtain 
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Now  ||  A_1||  = k(A)/||  A ||  by  the  definition  of  k( A),  so  that  division  by  ||x  + <5x||  shows 
that  the  relative  inaccuracy  of  x is  related  to  that  of  A via  the  condition  number  by  the 
inequality 

||  dx  ||  ||  5x  ||  ||  SA  || 

(16)  vv  ” I,  ^ g „ g II A-1H  IlfiAll  = *(A)  Y-rr  ■ 

x x + ox  A 


Conclusion.  If  the  system  is  well-conditioned,  small  inaccuracies  ||8A||/||A||  can  have 
only  a small  effect  on  the  solution.  However,  in  the  case  of  ill-conditioning,  if  ||5A||/||A|| 
is  small,  ||5x||/||x||  may  be  large. 

Inaccurate  Right  Side.  You  may  show  that,  similarly,  when  A is  accurate,  an  inaccuracy 
Sb  of  b causes  an  inaccuracy  <5x  satisfying 


(17) 


Hence  ||<5x||/||x||  must  remain  relatively  small  whenever  k( A)  is  small. 


Inaccuracies.  Bounds  (16)  and  (17) 

If  each  of  the  nine  entries  of  A in  Example  5 is  measured  with  an  inaccuracy  of  0.1,  then  ||5A||  = 9-0.1  and 
(16)  gives 

|[<5x||  3-01 

£7.5 — = 0.321  thus  IISxll  £ 0.321  llxll  = 0.321  ■ 16  = 5.14. 

||x||  7 

By  experimentation  you  will  find  that  the  actual  inaccuracy  ||Sx||  is  only  about  30%  of  the  bound  5.14.  This  is 
typical. 

Similarly,  if  Sb  = [0.1  0.1  0. 1 )T,  then  ||Sb||  = 0.3  and  ||b||  = 42  in  Example  5,  so  that  (17)  gives 

||5x||  0 3 

£ 7.5  • — = 0.0536,  hence  II  5x||  £ 0.0536  • 16  = 0.857 

IWI  42 


but  this  bound  is  again  much  greater  than  the  actual  inaccuracy,  which  is  about  0.15. 


Further  Comments  on  Condition  Numbers.  The  following  additional  explanations 
may  be  helpful. 

1.  There  is  no  sharp  dividing  line  between  “well-conditioned”  and  “ill-conditioned,” 
but  generally  the  situation  will  get  worse  as  we  go  from  systems  with  small  k( A)  to  systems 
with  larger  k{ A).  Now  always  k( A)  §£  1,  so  that  values  of  10  or  20  or  so  give  no  reason 
for  concern,  whereas  k( A)  = 100,  say,  calls  for  caution,  and  systems  such  as  those  in 
Examples  1 and  2 are  extremely  ill-conditioned. 

2.  If  k{ A)  is  large  (or  small)  in  one  norm,  it  will  be  large  (or  small,  respectively)  in 
any  other  norm.  See  Example  5. 

3.  The  literature  on  ill-conditioning  is  extensive.  For  an  introduction  to  it,  see  [E9] . 

This  is  the  end  of  our  discussion  of  numerics  for  solving  linear  systems.  In  the  next  section 
we  consider  curve  fitting,  an  important  area  in  which  solutions  are  obtained  from  linear  systems. 
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1-6 


VECTOR  NORMS 


Compute  the  norms  (5),  (6),  (7).  Compute  a corresponding 
unit  vector  (vector  of  norm  1 ) with  respect  to  the  Zc„-norm. 


19-20 


ILL-CONDITIONED  SYSTEMS 


Solve  Ax  = bi,  Ax  = b2-  Compare  the  solutions  and 
comment.  Compute  the  condition  number  of  A. 


1.  [1  -3  8 0 -6  0] 

2.  [4  -1  8] 


4.50 

3.55' 

’5.2' 

’5.2" 

19.  A = 

3.55 

2.80 

, bi  = 

4.1 

, b2  = 

4.0 

3.  [0.2  0.6  -2.1  3.0] 

4.  \k2,  4k,  k3],  k > 4 

5.  [1  1 1 1 1] 

6.  [0  0 0 1 0] 

7.  For  what  x = [a  b c]  will  ||x||i  = ||x||2? 

8.  Show  that  ||x||oo  £ ||x||2  S ||x||i. 


9-16 


MATRIX  NORMS, 
CONDITION  NUMBERS 


Compute  the  matrix  norm  and  the  condition  number 
corresponding  to  the  Zi-ve ctor  norm. 


2 1 

2.1 

4.5" 

10. 

0 4_ 

0.5 

1.8 

3.0  \.l 

4.7' 

’4.7 

, bx  = 

> b2  — 

1.7  1.0 

2.7 

2.71 

21.  Residual.  For  Ax  = bi  in  Prob.  19  guess  what  the 
residual  of  x = [ — 10.0  14. 1]T,  very  poorly  approx- 
imating [—2  4]T,  might  be.  Then  calculate  and 

comment. 


22.  Show  that  k( A)  £ 1 for  the  matrix  norms  (10),  (11), 
Sec.  20.3,  and  /c(A)  £ Vn  for  the  Frobenius  norm  (9), 
Sec.  20.3. 

23.  CAS  EXPERIMENT.  Hilbert  Matrices.  The  3 X 3 

Hilbert  matrix  is 


H 


3 — 


1 

2 

1 

3 


1 

2 

1 

3 

1 

4 


1 

3 

1 

4 

1 

5 


'Vs 

5' 

’7  6" 

11. 

0 

-V5 

12. 

6 5 

0.01 


13. 


4 -1 
3 0 


14. 


1 0.01 
0.01  1 


7 -12  2 


0 0.01  1 


-20  0 0 


15. 


0 


0.05  0 


0 0 20 


The  n X n Hilbert  matrix  is  H„  = [ZtjjJ,  where 
hjk  — 1 /(j  + k — 1).  (Similar  matrices  occur  in 
curve  fitting  by  least  squares.)  Compute  the  condition 
number  k(H„)  for  the  matrix  norm  corresponding  to 
the  Zoo-  (or  l\-)  vector  norm,  for  n = 2,  3,  ■ ■ ■ , 6 (or 
further  if  you  wish).  Try  to  find  a formula  that  gives 
reasonable  approximate  values  of  these  rapidly 
growing  numbers. 

Solve  a few  linear  systems  of  your  choice,  involving 
an  Hn. 

24.  TEAM  PROJECT.  Norms,  (a)  Vector  norms  in  our 
text  are  equivalent,  that  is,  they  are  related  by  double 
inequalities;  for  instance, 


21 

10.5 

7 

5.25 

10.5 

7 

5.25 

4.2 

7 

5.25 

4.2 

3.5 

5.25 

4.2 

3.5 

3 

17.  Verify  (11)  for  x = [3  15  — 4]T  taken  with  the 

Zoo-norm  and  the  matrix  in  Prob.  13. 


(a)  ||x||„  fi  HxIIj  fi  m||x||„ 

(18)  l 

(b)  ^ II x II,  S || x ||oo  S || x Hi- 

Hence  if  for  some  x,  one  norm  is  large  (or  small),  the 
other  norm  must  also  be  large  (or  small).  Thus  in  many 
investigations  the  particular  choice  of  a norm  is  not 
essential.  Prove  (18). 

(b)  The  Cauchy-Schwarz  inequality  is 


18.  Verify  (12)  for  the  matrices  in  Probs.  9 and  10. 


|xTy|  £ 


xlb  l|y||2- 
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It  is  very  important.  (Proof  in  Ref.  [GenRef7]  listed 
in  App.  1.)  Use  it  to  prove 

(19a)  ||x||2  S ||x||i  S Vn||x||2 

(19b)  II  x Hi  = ||  x ||2  = ||  x ||i - 

wn 

(c)  Formula  (10)  is  often  more  practical  than  (9). 
Derive  (10)  from  (9). 

(d)  Matrix  norms.  Illustrate  ( 1 1 ) with  examples . Give 
examples  of  (12)  with  equality  as  well  as  with  strict 


inequality.  Prove  that  the  matrix  norms  (10),  (11)  in 
Sec.  20.3  satisfy  the  axioms  of  a norm 

II A||  £ 0. 

||  A ||  = 0 if  and  only  if  A = 0, 

||*-A||  = \k\  ||  A ||, 

II A + B ||  S || A ||  + ||B||. 

25.  WRITING  PROJECT.  Norms  and  Their  Use  in 
This  Section.  Make  a list  of  the  most  important  of  the 
many  ideas  covered  in  this  section  and  write  a two- 
page  report  on  them. 


20.5  Least  Squares  Method 

Having  discussed  numerics  for  linear  systems,  we  now  turn  to  an  important  application, 
curve  fitting,  in  which  the  solutions  are  obtained  from  linear  systems. 

In  curve  fitting  we  are  given  n points  (pairs  of  numbers)  (x±,  jq),  • • • , (xn,  yn)  and  we 
want  to  determine  a function  fix)  such  that 


f(x  i)  ~ yi,  ■ ' ■ , f(xn)  ~ yn, 


approximately.  The  type  of  function  (for  example,  polynomials,  exponential  functions, 
sine  and  cosine  functions)  may  be  suggested  by  the  nature  of  the  problem  (the  underlying 
physical  law,  for  instance),  and  in  many  cases  a polynomial  of  a certain  degree  will  be 
appropriate. 

Let  us  begin  with  a motivation. 

If  we  require  strict  equality  fix  \ ) = yi,  • • • , f(xn)  = yn  and  use  polynomials  of 
sufficiently  high  degree,  we  may  apply  one  of  the  methods  discussed  in  Sec.  19.3  in 
connection  with  interpolation.  However,  in  certain  situations  this  would  not  be  the 
appropriate  solution  of  the  actual  problem.  For  instance,  to  the  four  points 

(1)  (-1.3,0.103),  (-0.1,1.099),  (0.2,0.808),  (1.3,1.897) 

there  corresponds  the  interpolation  polynomial /(x)  = x3  — x + 1 (Fig.  446),  but  if  we 
graph  the  points,  we  see  that  they  lie  nearly  on  a straight  line.  Hence  if  these  values 
are  obtained  in  an  experiment  and  thus  involve  an  experimental  error,  and  if  the  nature 
of  the  experiment  suggests  a linear  relation,  we  better  fit  a straight  line  through 
the  points  (Fig.  446).  Such  a line  may  be  useful  for  predicting  values  to  be  expected 
for  other  values  of  x.  A widely  used  principle  for  fitting  straight  lines  is  the  method 


Fig.  446  Approximate  fitting  of  a straight  line 
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of  least  squares  by  Gauss  and  Legendre.  In  the  present  situation  it  may  be  formulated 
as  follows. 


Method  of  Least  Squares.  The  straight  line 
(2)  y = a + bx 

should  be  fitted  through  the  given  points  (x\,  yf),  • • • , (xn,  yn ) so  that  the  sum  of  the 
squares  of  the  distances  of  those  points  from  the  straight  line  is  minimum,  where 
the  distance  is  measured  in  the  vertical  direction  ( the  y-direction). 


The  point  on  the  line  with  abscissa  x,  has  the  ordinate  a + hxy  Hence  its  distance  from 
( Xj , yf)  is  \}'j  — a — bxf  (Fig.  447)  and  that  sum  of  squares  is 

n 

q = ^(yj  ~ a - bxf)2. 

j= 1 

q depends  on  a and  b.  A necessary  condition  for  q to  be  minimum  is 

2 ^(yj  ~ a — bxf)  = 0 
2 2 xj  ( yj  — a — bxf)  = 0 


(3) 


dq 

da 

df ; 

db 


(where  we  sum  over  j from  1 to  n).  Dividing  by  2,  writing  each  sum  as  three  sums,  and 
taking  one  of  them  to  the  right,  we  obtain  the  result 


(4) 


an  + Xj  = 2 yj 

a^Xj  + b^  xf  = ^ xpj. 


These  equations  are  called  the  normal  equations  of  our  problem. 


Fig.  447.  Vetrical  distance  of  a point  [Xj,y.) 
from  a straight  line  y = a + bx 


Straight  Line 

Using  the  method  of  least  squares,  fit  a straight  line  to  the  four  points  given  in  formula  (1). 
Solution.  We  obtain 

n = 4,  2-U  ~ 0.1,  2-vf  = 3-43’  Eft  = 3-907-  XUU  = 2.3839. 
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Hence  the  normal  equations  are 


4a  + 0.1  0*  = 3.9070 
0.1a  + 3.43  * = 2.3839. 

The  solution  (rounded  to  4D)  is  a = 0.9601,  b = 0.6670,  and  we  obtain  the  straight  line  (Fig.  446) 

y = 0.9601  + 0.6670a-. 


Curve  Fitting  by  Polynomials  of  Degree  m 

Our  method  of  curve  fitting  can  be  generalized  from  a polynomial  y = a + bx  to  a 
polynomial  of  degree  m 

(5)  p(x ) = bo  + b\x  + • • • + bmxm 
where  m Si  n — 1 . Then  q takes  the  form 

n 

<?  = 2 (yj  ~ p(xj)f 

j=  1 

and  depends  on  m + 1 parameters  bo,  ■ ■ • , bm.  Instead  of  (3)  we  then  have  m + 1 
conditions 

dq  dq 

(6)  — = 0,  • • • , — = 0 

db0  dbm 

which  give  a system  of  m + 1 normal  equations. 

In  the  case  of  a quadratic  polynomial 

(7)  p(x)  = bo  + b\X  + b^x2, 

the  normal  equations  are  (summation  from  1 to  n) 

bon  + &i2  xj  + bz^j  xf 

(8)  bo^  Xj  + /ji2  xf  + b2^j  xf 

bo^j  xf  + bi^  xf  + b2^  xf 

The  derivation  of  (8)  is  left  to  the  reader. 

Quadratic  Parabola  by  Least  Squares 

Fit  a parabola  through  the  data  (0,  5),  (2,  4),  (4,  1),  (6,  6),  (8,  7). 

Solution.  For  the  normal  equations  we  need  n = 5,  — 20,  ^Zxf  = 120,  — 800,  IZxf  = 5664, 

2yj  = 23,  2 <Xjyj  = 104,  2 ixfyj  = 696.  Hence  these  equations  are 

5b0  + 20Z?i  + 120/72  = 23 


= 2* 

= 2 xjyj 
= 2 xjyj- 


20  b0  + 120hi  + 800*2  = 104 
120*o  + 800*!  + 5664*2  = 696. 
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Solving  them  we  obtain  the  quadratic  least  squares  parabola  (Fig.  448) 

y = 5.11429  - 1.41429a-  + 0.21429a2. 


Fig.  448  Least  squares  parabola  in  Example  2 

For  a general  polynomial  (5)  the  normal  equations  form  a linear  system  of  equations  in 
the  unknowns  b0,  ■ ■ ■ , bm.  When  its  matrix  M is  nonsingular,  we  can  solve  the  system 
by  Cholesky’s  method  (Sec.  20.2)  because  then  M is  positive  definite  (and  symmetric). 
When  the  equations  are  nearly  linearly  dependent,  the  normal  equations  may  become 
ill-conditioned  and  should  be  replaced  by  other  methods;  see  [E5],  Sec.  5.7,  listed  in 
App.  1. 

The  least  squares  method  also  plays  a role  in  statistics  (see  Sec.  25.9). 


PR  OBL  EM  S ET20^5 


1-6 


FITTING  A STRAIGHT  LINE 


Fit  a straight  line  to  the  given  points  ( x , y)  by  least  squares. 
Show  the  details.  Check  your  result  by  sketching  the  points 
and  the  line.  Judge  the  goodness  of  fit. 


1.  (0,  2),  (2,  0),  (3,  -2),  (5,  -3) 


2.  How  does  the  line  in  Prob.  1 change  if  you  add  a point 
far  above  it,  say,  (1,  3)?  Guess  first. 

3.  (0,1.8),  (1,1.6),  (2,1.1),  (3,1.5),  (4,2.3) 


4.  Hooke’s  law  F = ks.  Estimate  the  spring  modulus  k 
from  the  force  F [lb]  and  the  elongation  s [cm],  where 
(F,  s ) = (1,  0.3),  (2,  0.7),  (4,  1.3),  (6,  1.9),  (10,  3.2), 
(20,  6.3). 


5.  Average  speed.  Estimate  the  average  speed  <;av  of  a 
car  traveling  according  to  s = v • t [km]  ( s = distance 
traveled,  t [hr]  = time)  from  ( t , s)  = (9, 140),  (10,  220), 
(11,310),  (12,410). 


6.  Ohm’s  law  U = Ri.  Estimate  R from  (/,  U)  — (2, 104), 
(4,  206),  (6,314),  (10,  530). 


8-11 


FITTING  A QUADRATIC  PARABOLA 


Fit  a parabola  (7)  to  the  points  (x,  y).  Check  by  sketching. 


8.  (-1,5),  (1,3),  (2,4),  (3,8) 

9.  (2,  -3),  (3, 0),  (5,  1),  (6,  0)  (7,  -2) 

10.  t [hr]  = Worker’s  time  on  duty,  y [sec]  = His/her 
reaction  time,  (t,y)  = ( 1,2.0),  (2,1.78),  (3,1.90), 
(4,  2.35),  (5,  2.70) 

11.  The  data  in  Prob.  3.  Plot  the  points,  the  line,  and  the 
parabola  jointly.  Compare  and  comment. 

12.  Cubic  parabola.  Derive  the  formula  for  the  normal 
equations  of  a cubic  least  squares  parabola. 

13.  Fit  curves  (2)  and  (7)  and  a cubic  parabola  by  least  squares 
to  (x,  y)  = (-2,  -30),  (-1,  -4),  (0,4),  (1,4),  (2,22), 
(3,  68).  Graph  these  curves  and  the  points  on  common 
axes.  Comment  on  the  goodness  of  fit. 

14.  TEAM  PROJECT.  The  least  squares  approximation 
of  a function  f(x)  on  an  interval  a = x = b by  a 
function 


7.  Derive  the  normal  equations  (8). 


F m(x)  = aoJ’oM  + a\}'\(x)  + • • • + amym(x) 
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(b)  Polynomial.  What  form  does  (10)  take  if 
Fm(x)  = a o + ape  + ■ ■ ■ + amxm?  What  is  the 
coefficient  matrix  of  (10)  in  this  case  when  the  interval 
is  0 £ x £ 1? 

(c)  Orthogonal  functions.  What  are  the  solutions  of 
(10)  if  yo(x),  ■ ■ ■ , ym(x)  are  orthogonal  on  the  interval 
flSxSW  (For  the  definition,  see  Sec.  1 1.5.  See  also 
Sec.  11.6.) 

15.  CAS  EXPERIMENT.  Least  Squares  versus  Inter- 
polation. For  the  given  data  and  for  data  of  your 
choice  find  the  interpolation  polynomial  and  the  least 
squares  approximations  (linear,  quadratic,  etc.). 
Compare  and  comment. 

(a)  (-2,0),  (-1,0),  (0,1),  (1,0),  (2,0) 

(b)  (-4,0),  (-3,0),  (-2,0),  (-1,0),  (0,1), 

(1,0),  (2,0),  (3,0),  (4,0) 

(c)  Choose  five  points  on  a straight  line,  e.g.,  (0,  0), 
(1,  1),  ■ ■ ■ , (4,  4).  Move  one  point  1 unit  upward  and 
find  the  quadratic  least  squares  polynomial.  Do  this  for 
each  point.  Graph  the  five  polynomials  on  common 
axes.  Which  of  the  five  motions  has  the  greatest  effect? 


20.6  Matrix  Eigenvalue  Problems:  Introduction 

We  now  come  to  the  second  part  of  our  chapter  on  numeric  linear  algebra.  In  the  first 
part  of  this  chapter  we  discussed  methods  of  solving  systems  of  linear  equations,  which 
included  Gauss  elimination  with  backward  substitution.  This  method  is  known  as  a direct 
method  since  it  gives  solutions  after  a prescribed  amount  of  computation.  The  Gauss 
method  was  modified  by  Doolittle’s  method,  Crout’s  method,  and  Cholesky’s  method, 
each  requiring  fewer  arithmetic  operations  than  Gauss.  Finally  we  presented  indirect 
methods  of  solving  systems  of  linear  equations,  that  is,  the  Gauss-Seidel  method  and  the 
Jacobi  iteration.  The  indirect  methods  require  an  undetermined  number  of  iterations.  That 
number  depends  on  how  far  we  start  from  the  true  solution  and  what  degree  of  accuracy 
we  require.  Moreover,  depending  on  the  problem,  convergence  may  be  fast  or  slow  or  our 
computation  cycle  might  not  even  converge.  This  led  to  the  concepts  of  ill-conditioned 
problems  and  condition  numbers  that  help  us  gain  some  control  over  difficulties  inherent 
in  numerics. 

The  second  part  of  this  chapter  deals  with  some  of  the  most  important  ideas  and  numeric 
methods  for  matrix  eigenvalue  problems.  This  very  extensive  part  of  numeric  linear  algebra 
is  of  great  practical  importance,  with  much  research  going  on,  and  hundreds,  if  not 
thousands,  of  papers  published  in  various  mathematical  journals  (see  the  references  in 
[E8],  [E9],  [Ell],  [E29]).  We  begin  with  the  concepts  and  general  results  we  shall  need 
in  explaining  and  applying  numeric  methods  for  eigenvalue  problems.  (For  typical  models 
of  eigenvalue  problems  see  Chap.  8.) 


where  y0(A),  ■ ■ ■ , ym(x)  are  given  functions,  requires  the 
determination  of  the  coefficients  a o,  ■ ■ • , am  such  that 

(9)  | [f(x)  - Fm(x)f  dx 

a 

becomes  minimum.  This  integral  is  denoted  by 
||/-  Fj2,  and  ||/-  Fm\\  is  called  the  L2-norm  of 
/ — Fm  ( L suggesting  Lebesgue5).  A necessary  condition 
for  that  minimum  is  given  by  D||  / — Fm\\2/daj  = 0, 
j = 0,  ■ ■ • , m [the  analog  of  (6)].  (a)  Show  that  this 
leads  to  m + 1 normal  equations  {j  — 0,  • ■ ■ , m) 

m 

2 hjkak  = bj  where 
k = 0 

fb 

(10)  hjk=  yj(x)yk(x)  dx, 

a 

bj  = f(x)yj(x ) dx. 


5HENRI  LEBESGUE  (1875-1941),  great  French  mathematician,  creator  of  a modem  theory  of  measure  and 
integration  in  his  famous  doctoral  thesis  of  1902. 
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THEOREM  1 


An  eigenvalue  or  characteristic  value  (or  latent  root)  of  a given  n X n matrix  A = [a?/J 
is  a real  or  complex  number  A such  that  the  vector  equation 

(1)  Ax  = Ax 

has  a nontrivial  solution,  that  is,  a solution  x A 0,  which  is  then  called  an  eigenvector  or 
characteristic  vector  of  A corresponding  to  that  eigenvalue  A.  The  set  of  all  eigenvalues 
of  A is  called  the  spectrum  of  A.  Equation  (1)  can  be  written 

(2)  (A  - AI)x  = 0 

where  I is  the  n X n unit  matrix.  This  homogeneous  system  has  a nontrivial  solution  if 
and  only  if  the  characteristic  determinant  det  (A  — AI)  is  0 (see  Theorem  2 in  Sec.  7.5). 
This  gives  (see  Sec.  8.1) 


Eigenvalues 

The  eigenvalues  of  X are  the  solutions  A of  the  characteristic  equation 

an  A 

aVZ 

aln 

(3)  det  (A  - AI)  = 

fl21 

a22  ~ A 

a2n 

= 0. 

&nl 

an2 

Clnn  ^ 

Developing  the  characteristic  determinant,  we  obtain  the  characteristic  polynomial  of  A, 
which  is  of  degree  n in  A.  Hence  A has  at  least  one  and  at  most  n numerically  different 
eigenvalues.  If  A is  real,  so  are  the  coefficients  of  the  characteristic  polynomial.  By  familiar 
algebra  it  follows  that  then  the  roots  (the  eigenvalues  of  A)  are  real  or  complex  conjugates 
in  pairs. 

To  give  you  some  orientation  of  the  underlying  approaches  of  numerics  for  eigenvalue 
problems,  note  the  following.  For  large  or  very  large  matrices  it  may  be  very  difficult  to 
determine  the  eigenvalues,  since,  in  general,  it  is  difficult  to  find  the  roots  of  characteristic 
polynomials  of  higher  degrees.  We  will  discuss  different  numeric  methods  for  finding 
eigenvalues  that  achieve  different  results.  Some  methods,  such  as  in  Sec.  20.7,  will  give 
us  only  regions  in  which  complex  eigenvalues  lie  (Geschgorin’s  method)  or  the  intervals 
in  which  the  largest  and  smallest  real  eigenvalue  lie  (Collatz  method).  Other  methods 
compute  all  eigenvalues,  such  as  the  Householder  tridiagonalization  method  and  the 
QR-method  in  Sec.  20.9. 

To  continue  our  discussion,  we  shall  usually  denote  the  eigenvalues  of  A by 

Ai,  A2,  ■ • • , An 

with  the  understanding  that  some  (or  all)  of  them  may  be  equal. 

The  sum  of  these  n eigenvalues  equals  the  sum  of  the  entries  on  the  main  diagonal  of 
A,  called  the  trace  of  A;  thus 

n n 

trace  A = ^ ajj  = 2 Ak- 

3=  1 fc=l 


(4) 
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Also,  the  product  of  the  eigenvalues  equals  the  determinant  of  A, 

(5)  det  A = AiA2  • ■ ■ An. 

Both  formulas  follow  from  the  product  representation  of  the  characteristic  polynomial, 
which  we  denote  by  /(A), 

/(A)  = (— 1)W(A  - AiXA  - A2)  - • • (A  - AJ. 

If  we  take  equal  factors  together  and  denote  the  numerically  distinct  eigenvalues  of  A by 
Ai,  • • ■ , Ar  (r  = n),  then  the  product  becomes 

(6)  /(A)  = (- l)n(A  - Ai)mi(A  - A2)m2 • • • (A  - Ar)mC 

The  exponent  nij  is  called  the  algebraic  multiplicity  of  A,-.  The  maximum  number  of 
linearly  independent  eigenvectors  corresponding  to  A,  is  called  the  geometric  multiplicity 
of  A j.  It  is  equal  to  or  smaller  than  mr 

A subspace  S of  Rn  or  Cn  (if  A is  complex)  is  called  an  invariant  subspace  of  A if 
for  every  v in  S the  vector  Av  is  also  in  S.  Eigenspaces  of  A (spaces  of  eigenvectors; 
Sec.  8.1)  are  important  invariant  subspaces  of  A. 

An  n X n matrix  B is  called  similar  to  A if  there  is  a nonsingular  n X n matrix  T such  that 

(7)  B = T_1AT. 

Similarity  is  important  for  the  following  reason. 


THEOREM  2 


Similar  Matrices 


Similar  matrices  have  the  same  eigenvalues.  If  x is  an  eigenvector  of  A,  then 
y = T - 1 x is  an  eigenvector  of  B in  (7)  corresponding  to  the  same  eigenvalue.  (Proof 
in  Sec.  8.4.) 


Another  theorem  that  has  various  applications  in  numerics  is  as  follows. 


THEOREM  3 


Spectral  Shift 

If  A has  the  eigenvalues  Ai,  ■ ■ • , An,  then  A — kl  with  arbitrary  k has  the  eigenvalues 
Ai  /c,  * * ■ , \n  k. 


This  theorem  is  a special  case  of  the  following  spectral  mapping  theorem. 


Polynomial  Matrices 

If  A is  an  eigenvalue  of  A,  then 

q{X)  = as  As  + as_iAs  1 + ■ 

■ • + ctqA  + ao 

is  an  eigenvalue  of  the  polynomial  matrix 

q(  A)  = «SAS  + as_1As_1  + • 

■ + oqA  + a0 1. 
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PROOF  Ax  = Ax  implies  A2x  = AAx  = AAx  = A2x,  A3x  = A3x,  etc.  Thus 

q( A)x  = (as As  + as_iAs_1  + ■ ■ • ) x 
= crsAsx  + as_1As-1x  + • • • 

= asAsx  + as_1As_1x  + • • ■ = q{ A)  x. 


The  eigenvalues  of  important  special  matrices  can  be  characterized  as  follows. 


THEOREM  5 


Special  Matrices 

The  eigenvalues  ofHermitian  matrices  ( i.e.,  A T = A),  hence  of  real  symmetric  matrices 
(i.e..  AT  = A),  are  real.  The  eigenvalues  of  skew-Hermitian  matrices  (i.e.,  A = —A), 
hence  of  real  skew-symmetric  matrices  (i.e.,  AT  = —A),  are  pure  imaginary  or  0.  The 
eigenvalues  of  unitary  matrices  (i.e.,  A = A-1),  hence  of  orthogonal  matrices  (i.e., 
At  = A-1),  have  absolute  value  1.  (Proofs  in  Secs.  8.3  and  8.5.) 


The  choice  of  a numeric  method  for  matrix  eigenvalue  problems  depends  essentially  on 
two  circumstances,  on  the  kind  of  matrix  (real  symmetric,  real  general,  complex,  sparse, 
or  full)  and  on  the  kind  of  information  to  be  obtained,  that  is,  whether  one  wants  to  know 
all  eigenvalues  or  merely  specific  ones,  for  instance,  the  largest  eigenvalue,  whether 
eigenvalues  and  eigenvectors  are  wanted,  and  so  on.  It  is  clear  that  we  cannot  enter  into 
a systematic  discussion  of  all  these  and  further  possibilities  that  arise  in  practice,  but  we 
shall  concentrate  on  some  basic  aspects  and  methods  that  will  give  us  a general 
understanding  of  this  fascinating  field. 


20.i  Inclusion  of  Matrix  Eigenvalues 

The  whole  of  numerics  for  matrix  eigenvalues  is  motivated  by  the  fact  that,  except  for  a 
few  trivial  cases,  we  cannot  determine  eigenvalues  exactly  by  a finite  process  because  these 
values  are  the  roots  of  a polynomial  of  nth  degree.  Hence  we  must  mainly  use  iteration. 

In  this  section  we  state  a few  general  theorems  that  give  approximations  and  error 
bounds  for  eigenvalues.  Our  matrices  will  continue  to  be  real  (except  in  formula  (5)  below), 
but  since  (nonsymmetric)  matrices  may  have  complex  eigenvalues,  complex  numbers  will 
play  a (very  modest)  role  in  this  section. 

The  important  theorem  by  Gerschgorin  gives  a region  consisting  of  closed  circular  disks 
in  the  complex  plane  and  including  all  the  eigenvalues  of  a given  matrix.  Indeed,  for  each 
j = 1 ,•••,«  the  inequality  (1)  in  the  theorem  determines  a closed  circular  disk  in  the 
complex  A-plane  with  center  ajj  and  radius  given  by  the  right  side  of  (1);  and  Theorem  1 
states  that  each  of  the  eigenvalues  of  A lies  in  one  of  these  n disks. 


THEOREM  1 


Gerschgorin’s  Theorem6 

Let  A be  an  eigenvalue  of  an  arbitrary  n X n matrix  A = [aji J.  Then  for  some 
integer  j (l  Si  j Si  n)  we  have 

(1)  | Ojj  — A|  Si  + \aj2\  + ■■■  + + IfljJ  + ll  + ■■•  + \djn\- 


'SEMYON  ARANOVICH  GERSCHGORIN  (1901-1933),  Russian  mathematician. 
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PROOF 


EXAMPLE  1 


Let  x be  an  eigenvector  corresponding  to  an  eigenvalue  A of  A.  Then 
(2)  Ax  = Ax  or  (A  — AI)x  = 0. 

Let  xj  be  a component  of  x that  is  largest  in  absolute  value.  Then  we  have  \xmJx:j\  g 1 
for  m = 1,  • • • , n.  The  vector  equation  (2)  is  equivalent  to  a system  of  n equations  for  the 
n components  of  the  vectors  on  both  sides.  The  jth  of  these  n equations  with  j as  just 
indicated  is 


cij-[X  i + • ■ • + cij ' j — \Xj  — i + ( djj  — A )xj  + aj  j + iXj  + 1 + • ■ • + (ijnxn  — 0. 


Division  by  Xj  (which  cannot  be  zero;  why?)  and  reshuffling  terms  gives 

%1  Xj—i  Xj+i  xn 

ajj  — ^ — ■■■  — CljJ-  1 . — aj,j+ 1 r.  — . 

Aj  Aj  Jx  A J 

By  taking  absolute  values  on  both  sides  of  this  equation,  applying  the  triangle  inequality 
\a  + b\  g \a\  + \b\  (where  a and  b are  any  complex  numbers),  and  observing  that 
because  of  the  choice  of  j (which  is  crucial!),  |xi/xJ  ^ 1,  ■ ■ • , |xn/x,-|  ^ 1,  we  obtain  (1), 
and  the  theorem  is  proved. 


Gerschgorin’s  Theorem 

For  the  eigenvalues  of  the  matrix 

1 i“ 

2 2 

5 1 

1 1 


we  get  the  Gerschgorin  disks  (Fig.  449) 

D Center  0,  radius  1,  D2'.  Center  5,  radius  1.5,  D3:  Center  1,  radius  1.5. 

The  centers  are  the  main  diagonal  entries  of  A.  These  would  be  the  eigenvalues  of  A if  A were  diagonal.  We 
can  take  these  values  as  crude  approximations  of  the  unknown  eigenvalues  (3D- values)  Ai  = —0.209, 
A2  = 5.305,  A3  = 0.904  (verify  this);  then  the  radii  of  the  disks  are  corresponding  error  bounds. 

Since  A is  symmetric,  it  follows  from  Theorem  5,  Sec.  20.6,  that  the  spectrum  of  A must  actually  lie  in  the 
intervals  [—1,  2.5]  and  [3.5,  6.5]. 

It  is  interesting  that  here  the  Gerschgorin  disks  form  two  disjoint  sets,  namely,  D\  U D3,  which  contains  two 
eigenvalues,  and  £>2,  which  contains  one  eigenvalue.  This  is  typical,  as  the  following  theorem  shows. 


Fig.  449<  Gerschgorin  disks  in  Example  1 
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THEOREM  2 


EXAMPLE  2 


Extension  of  Gerschgorin’s  Theorem 

If  p Gerschgorin  disks  form  a set  S that  is  disjoint  from  the  n — p other  disks  of  a 
given  matrix  A,  then  S contains  precisely  p eigenvalues  of  A ( each  counted  with  its 
algebraic  multiplicity,  as  defined  in  Sec.  20.6). 


Idea  of  Proof.  Set  A = B + C,  where  B is  the  diagonal  matrix  with  entries  ajj,  and 
apply  Theorem  1 to  At  = B + tC  with  real  t growing  from  0 to  1. 


Another  Application  of  Gerschgorin’s  Theorem.  Similarity 

Suppose  that  we  have  diagonalized  a matrix  by  some  numeric  method  that  left  us  with  some  off-diagonal  entries 
of  size  10-5,  say, 


A = 


2 

10“5 

to-5 


to-5 

2 

to-5 


to-5 

10“5 

4 


What  can  we  conclude  about  deviations  of  the  eigenvalues  from  the  main  diagonal  entries? 

Solution.  By  Theorem  2,  one  eigenvalue  must  lie  in  the  disk  of  radius  2 • 10-5  centered  at  4 and  two 
eigenvalues  (or  an  eigenvalue  of  algebraic  multiplicity  2)  in  the  disk  of  radius  2 • 10-5  centered  at  2.  Actually, 
since  the  matrix  is  symmetric,  these  eigenvalues  must  lie  in  the  intersections  of  these  disks  and  the  real  axis, 
by  Theorem  5 in  Sec.  20.6. 

We  show  how  an  isolated  disk  can  always  be  reduced  in  size  by  a similarity  transformation.  The  matrix 


B = T_1AT  = 


1 
0 
0 

2 

10“5 

io-10 


0 

0 

10 

10 

2 

10 


-5 

-5 


2 

10“5 
10“5 
1 
1 

4 


10“ 

2 

10“ 


10“J 

10“5 

4 


0 

0 

105 


is  similar  to  A.  Hence  by  Theorem  2,  Sec.  20.6,  it  has  the  same  eigenvalues  as  A.  From  Row  3 we  get  the 
smaller  disk  of  radius  2 • 10-  . Note  that  the  other  disks  got  bigger,  approximately  by  a factor  of  10.  And  in 
choosing  T we  have  to  watch  that  the  new  disks  do  not  overlap  with  the  disk  whose  size  we  want  to  decrease. 
For  further  interesting  facts,  see  the  book  [E28]. 


By  definition,  a diagonally  dominant  matrix  A = [ajj J is  an  n X n matrix  such  that 
(3)  \ajj\  = ^ \cijk\  j =l,"-,n 

k*j 

where  we  sum  over  all  off-diagonal  entries  in  Row  j.  The  matrix  is  said  to  be  strictly 
diagonally  dominant  if  > in  (3)  for  all  j.  Use  Theorem  1 to  prove  the  following  basic 
property. 


Strict  Diagonal  Dominance 

Strictly  diagonally  dominant  matrices  are  nonsingular. 


THEOREM  3 
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THEOREM  4 


EXAMPLE  3 


THEOREM  5 


Further  Inclusion  Theorems 

An  inclusion  theorem  is  a theorem  that  specifies  a set  which  contains  at  least  one 
eigenvalue  of  a given  matrix.  Thus,  Theorems  1 and  2 are  inclusion  theorems;  they  even 
include  the  whole  spectrum.  We  now  discuss  some  famous  theorems  that  yield  further 
inclusions  of  eigenvalues.  We  state  the  first  two  of  them  without  proofs  (which  would 
exceed  the  level  of  this  book). 


Schur’s  Theorem7 

Let  A = be  a n X n matrix.  Then  for  each  of  its  eigenvalues  A1;  • • ■ , \n, 

n n n 

(4)  |Aj2  ^ 2 lA*l2  = 2 2 l«7fcl2  (Schur’s  inequality). 

i= 1 j=lfc=l 

In  (4)  the  second  equality  sign  holds  if  and  only  if  A is  such  that 

(5)  AtA  = AAt. 


Matrices  that  satisfy  (5)  are  called  normal  matrices.  It  is  not  difficult  to  see  that  Hermitian, 
skew-Hermitian,  and  unitary  matrices  are  normal,  and  so  are  real  symmetric,  skew- symmetric, 
and  orthogonal  matrices. 

Bounds  for  Eigenvalues  Obtained  from  Schur’s  Inequality 

For  the  matrix 


26  -2  2 


A = 2 


21 


4 


4 2 28 


we  obtain  from  Schur’s  inequality  |A|  § V 1949  = 44.1475.  You  may  verify  that  the  eigenvalues  are  30,  25, 
and  20.  Thus  302  + 252  + 202  = 1925  < 1949;  in  fact,  A is  not  normal. 


The  preceding  theorems  are  valid  for  every  real  or  complex  square  matrix.  Other  theorems 
hold  for  special  classes  of  matrices  only.  Famous  is  the  following  one,  which  has  various 
applications,  for  instance,  in  economics. 


Perron’s  Theorem8 

Let  A be  a real  n X n matrix  whose  entries  are  all  positive.  Then  A has  a positive 
real  eigenvalue  A = p of  multiplicity  1 . The  corresponding  eigenvector  can  be 
chosen  with  all  components  positive.  ( The  other  eigenvalues  are  less  than  p in 
absolute  value.) 


7ISSAI  SCHUR  (1875-1941),  German  mathematician,  also  known  by  his  important  work  in  group  theory. 

80SKAR  PERRON  (1880-1975)  and  GEORG  FROBENIUS  (1849-1917),  German  mathematicians,  known 
for  their  work  in  potential  theory,  ODEs  (Sec.  5.4),  and  group  theory. 
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THEOREM  6 


PROOF 


For  a proof  see  Ref.  [B3],  vol.  II,  pp.  53-62.  The  theorem  also  holds  for  matrices  with 
nonnegative  real  entries  (“Perron-Frobenius  Theorem”8)  provided  A is  irreducible,  that 
is,  it  cannot  be  brought  to  the  following  form  by  interchanging  rows  and  columns;  here 
B and  F are  square  and  0 is  a zero  matrix. 

B C 

0 F 

Perron’s  theorem  has  various  applications,  for  instance,  in  economics.  It  is  interesting 
that  one  can  obtain  from  it  a theorem  that  gives  a numeric  algorithm: 


Collatz  Inclusion  Theorem9 

Let  A = [ajfj  be  a real  n X n matrix  whose  elements  are  all  positive.  Let  x be  any 
real  vector  whose  components  • ■ • , xn  are  positive,  and  let  Vi,  • • • , yn  be  the 
components  of  the  vector  y = Ax.  Then  the  closed  interval  on  the  real  axis  bounded 
by  the  smallest  and  the  largest  of  the  n quotients  qj  = yJxj  contains  at  least  one 
eigenvalue  of  A. 


We  have  Ax  = y or 

(6)  y - Ax  = 0. 

The  transpose  AT  satisfies  the  conditions  of  Theorem  5.  Hence  AT  has  a positive  eigenvalue 
A and,  corresponding  to  this  eigenvalue,  an  eigenvector  u whose  components  w,  are  all 
positive.  Thus  ATu  = Au  and  by  taking  the  transpose  we  obtain  uTA  = AuT.  From  this 
and  (6)  we  have 

uT(y  — Ax)  = uTy  — uTAx  = uTy  — AuTx  = uT(y  — Ax)  = 0 
or  written  out 

n 

2 “/At  _ Ax?)  = °- 

3 = 1 

Since  all  the  components  Uj  are  positive,  it  follows  that 

y,  — Aa,  § 0,  that  is,  § A for  at  least  one  /, 

(7)  3 3 3 and 

y.j  — A Xj  = 0,  that  is,  qj  ==  A for  at  least  one  j. 

Since  A and  AT  have  the  same  eigenvalues,  A is  an  eigenvalue  of  A,  and  from  (7)  the 
statement  of  the  theorem  follows. 


'LOTHAR  COLLATZ  (1910-1990),  German  mathematician  known  for  his  work  in  numerics. 
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EXAMPLE  4 Bounds  for  Eigenvalues  from  Collatz’s  Theorem.  Iteration 

For  a given  matrix  A with  positive  entries  we  choose  an  x = Xq  and  iterate,  that  is,  we  compute  Xi  = Axq, 
X2  = Axi,  • • • , X20  — Axig.  In  each  step,  taking  x = Xj  and  y = A Xj  = Xj  + i we  compute  an  inclusion  interval 
by  Collatz’s  theorem.  This  gives  (6S) 


0.49 

0.02 

0.22_ 

l" 

~0.73~ 

0.548 1” 

A = 

0.02 

0.28 

0.20 

,X0  = 

1 

Al  = 

0.50 

• *2  = 

0.3186 

0.22 

0.20 

0.40 

1 

0.82 

0.5886 

_ 

r 

0.002 16309~ 

0.00155743 

x19  “ 

0.00108155 

,*20  = 

0.000778713 

0.00216309 

0.00155743 

and  the  intervals  0.5  ^ A ^ 0.82,  0.3186/0.50  = 0.6372  ^ A ^ 0.5481/0.73  = 0.750822,  etc.  These  intervals 
have  length 


j 

1 

2 

3 

10 

15 

20 

Length 

0.32 

0.113622 

0.0539835 

0.0004217 

0.0000132 

0.0000004 

Using  the  characteristic  polynomial,  you  may  verify  that  the  eigenvalues  of  A are  0.72,  0.36,  0.09,  so  that  those 
intervals  include  the  largest  eigenvalue,  0.72.  Their  lengths  decreased  with  j,  so  that  the  iteration  was  worthwhile. 
The  reason  will  appear  in  the  next  section,  where  we  discuss  an  iteration  method  for  eigenvalues. 


PRQBLE~W=SET~-2QF7 


1-6 


GERSCHGORIN  DISKS 


Find  and  sketch  disks  or  intervals  that  contain  the 
eigenvalues.  If  you  have  a CAS,  find  the  spectrum  and 
compare. 


5 

2 4 

" 5 

10-2 

10-2 

1. 

-2 

0 2 

2. 

10-2 

8 

10-2 

2 

4 7 

10-2 

10~2 

9 

0 

0.4 

-0.1 

1 

0 1 

3. 

-0.4 

0 

0.3 

4. 

0 

4 3 

0.1 

-0.3 

0 

1 

3 12 

2 

i 1 + i 

10 

0.1  - 

0.2~ 

5. 

— i 

3 

0 

6. 

0.1 

6 

0 

1 - i 

0 

8 

-0.2 

0 

3 

7.  Similarity.  In  Prob.  2,  find  T_tAT  such  that  the  radius 
of  the  Gerschgorin  circle  with  center  5 is  reduced  by  a 
factor  1/100. 


8.  By  what  integer  factor  can  you  at  most  reduce  the 
Gerschgorin  circle  with  center  3 in  Prob.  6? 


9.  If  a symmetric  n X n matrix  A = [o^]  has  been 
diagonalized  except  for  small  off-diagonal  entries  of 
size  10-5.  what  can  you  say  about  the  eigenvalues? 

10.  Optimality  of  Gerschgorin  disks.  Illustrate  with  a 
2X2  matrix  that  an  eigenvalue  may  very  well  lie  on 
a Gerschgorin  circle,  so  that  Gerschgorin  disks  can 
generally  not  be  replaced  with  smaller  disks  without 
losing  the  inclusion  property. 

11.  Spectral  radius  p( A).  Using  Theorem  1,  show  that 
p( A)  cannot  be  greater  than  the  row  sum  norm  of  A. 


12-16 


SPECTRAL  RADIUS 


Use  (4)  to  obtain  an  upper  bound  for  the  spectral  radius: 
12.  In  Prob.  4 13.  In  Prob.  1 

14.  In  Prob.  6 15.  In  Prob.  3 

16.  In  Prob.  5 

17.  Verify  that  the  matrix  in  Prob.  5 is  normal. 

18.  Normal  matrices.  Show  that  Hermitian,  skew- 
Hermitian,  and  unitary  matrices  (hence  real  symmetric, 
skew-symmetric,  and  orthogonal  matrices)  are  normal. 
Why  is  this  of  practical  interest? 

19.  Prove  Theorem  3 by  using  Theorem  1 . 

20.  Extended  Gerschgorin  theorem.  Prove  Theorem  2. 
Hint.  Let  A = B + C,  B = diag  (ajj),  At  = B + tC, 
and  let  t increase  continuously  from  0 to  1 . 
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20.8  Power  Method  for  Eigenvalues 

A simple  standard  procedure  for  computing  approximate  values  of  the  eigenvalues  of  an 
n X n matrix  A = [ojk]  is  the  power  method.  In  this  method  we  start  from  any  vector 
x0  (A  0)  with  n components  and  compute  successively 


X!  = Ax0,  x2  = Ax1;  • • • , xs  — Axs_i. 

For  simplifying  notation,  we  denote  xs_i  by  x and  xs  by  y,  so  that  y = Ax. 

The  method  applies  to  any  n X n matrix  A that  has  a dominant  eigenvalue  (a  A such 
that  | A | is  greater  than  the  absolute  values  of  the  other  eigenvalues).  If  A is  symmetric,  it 
also  gives  the  error  bound  (2),  in  addition  to  the  approximation  (1). 


THEOREM  1 


Power  Method,  Error  Bounds 

Let  A be  an  n X n real  symmetric  matrix.  Let  x (=£  0)  be  any  real  vector  with  n 
components.  Furthermore,  let 

y = Ax,  ;n0  = xTx,  m1  = xTy,  m2  = yTy. 

Then  the  quotient 


(1) 


m \ 
m0 


(Rayleigh10  quotient) 


is  an  approximation  for  an  eigenvalue  A of  A (usually  that  which  is  greatest  in 
absolute  value,  but  no  general  statements  are  possible). 

Furthermore,  if  we  set  q = A — e,  so  that  e is  the  error  of  q,  then 


(2) 


|e|  = 8 = 


PROOF  S2  denotes  the  radicand  in  (2).  Since  m\  = qm0  by  (1),  we  have 

(3)  (y  - qx)T(y  - qx)  = m2  ~ 2qm1  + q2m0  = m2  - q2m0  = S2m0. 

Since  A is  real  symmetric,  it  has  an  orthogonal  set  of  n real  unit  eigenvectors  z1;  • • ■ , zn 
corresponding  to  the  eigenvalues  A1;  • • ■ , Ar(,  respectively  (some  of  which  may  be  equal). 
(Proof  in  Ref.  [B3],  vol.  1,  pp.  270-272,  listed  in  App.  1.)  Then  x has  a representation  of 
the  form 


x — apL\  + ■ ■ • + anzn. 


10LORD  RAYLEIGH  (JOHN  WILLIAM  STRUTT)  (1842-1919),  great  English  physicist  and  mathematician, 
professor  at  Cambridge  and  London,  known  for  his  important  contributions  to  various  branches  of  applied 
mathematics  and  theoretical  physics,  in  particular,  the  theory  of  waves,  elasticity,  and  hydrodynamics.  In  1904 
he  received  a Nobel  Prize  in  physics. 
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EXAMPLE  1 


Now  Az  | = A | z | . etc.,  and  we  obtain 

y Ax  ajAiZ!  ~b  ■ ~b  an\nzn 
and,  since  the  z ,■  are  orthogonal  unit  vectors, 

(4)  m0  = xTx  = af  + • ■ ■ + an- 

il follows  that  in  (3), 


y qx  <?)zi  ~b  ■ ~b  r/n(An  q)zn. 

Since  the  z j are  orthogonal  unit  vectors,  we  thus  obtain  from  (3) 

(5)  S2m0  = (y  - <?x)T(y  - qx)  = a^Ay  - q)2  + • ■ • + a2{ \n  - qf . 

Now  let  Ac  be  an  eigenvalue  of  A to  which  q is  closest,  where  c suggests  “closest.”  Then 
(Ac  — q)2  A (Aj  — q)2  for  j = 1,  ■ ■ ■ , n.  From  this  and  (5)  we  obtain  the  inequality 

S2m0  g (Ac  - qf(a\  + ■ ■ • + = (Ac  - q)2m0. 

Dividing  by  m0,  taking  square  roots,  and  recalling  the  meaning  of  82  gives 


8 = 


This  shows  that  8 is  a bound  for  the  error  e of  the  approximation  q of  an  eigenvalue  of 
A and  completes  the  proof.  ■ 

The  main  advantage  of  the  method  is  its  simplicity.  And  it  can  handle  sparse  matrices 
too  large  to  store  as  a full  square  array.  Its  disadvantage  is  its  possibly  slow  convergence. 
From  the  proof  of  Theorem  1 we  see  that  the  speed  of  convergence  depends  on  the  ratio 
of  the  dominant  eigenvalue  to  the  next  in  absolute  value  (2:1  in  Example  1,  below). 

If  we  want  a convergent  sequence  of  eigenvectors,  then  at  the  beginning  of  each  step 
we  scale  the  vector,  say,  by  dividing  its  components  by  an  absolutely  largest  one,  as  in 
Example  1,  as  follows. 


Application  of  Theorem  1.  Scaling 

For  the  symmetric  matrix  A in  Example  4,  Sec.  20.7,  and  x0  = [1  1 1]T  we  obtain  from  (1)  and  (2)  and  the 

indicated  scaling 


0.49 

0.02 

0.22 

1 

0.890244 

0.931193 

A = 

0.02 

0.28 

0.20 

. xo  = 

1 

■ Xj  = 

0.609756 

■ x2  = 

0.541284 

0.22 

0.20 

0.40 

1 

1 

1 

~0.990663~ 

0.999707 

~0.99999 1" 

*5  = 

0.504682 

> xio  — 

0.500146 

. x15  ~ 

0.500005 

1 

1 

1 
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Here  Axq  = [0.73  0.5  0.82]T,  scaled  to  Xi  = [0.73/0.82  0.5/0.82  1]T,  etc.  The  dominant  eigenvalue  is 

0.72,  an  eigenvector  [1  0.5  1]T.  The  corresponding  q and  8 are  computed  each  time  before  the  next  scaling. 

Thus  in  the  first  step, 


m l = XqAxq 
m0  XqX0 


2.05 

= 0.683333 

3 


S = 


/»>2 

V"o 


(Ax0)tAx0 


xjx0 


0.134743. 


This  gives  the  following  values  of  q,  8,  and  the  error  e = 0.72  — q (calculations  with  10D,  rounded  to  6D): 


j 

i 

2 

5 

10 

q 

0.683333 

0.716048 

0.719944 

0.720000 

s 

0.134743 

0.038887 

0.004499 

0.000141 

e 

0.036667 

0.003952 

0.000056 

00 

1 

O 

m 

The  eiTor  bounds  are  much  larger  than  the  actual  errors.  This  is  typical,  although  the  bounds  cannot  be  improved; 
that  is,  for  special  symmetric  matrices  they  agree  with  the  errors. 

Our  present  results  are  somewhat  better  than  those  of  Collatz’s  method  in  Example  4 of  Sec.  20.7,  at  the 
expense  of  more  operations. 


Spectral  shift,  the  transition  from  A to  A — kl,  shifts  every  eigenvalue  by  —k.  Although 
finding  a good  k can  hardly  be  made  automatic,  it  may  be  helped  by  some  other  method 
or  small  preliminary  computational  experiments.  In  Example  1,  Gerschgorin’s  theorem 
gives  —0.02  ^ A ^ 0.82  for  the  whole  spectrum  (verify!).  Shifting  by  —0.4  might  be  too 
much  (then  —0.42  ^ A ^ 0.42),  so  let  us  try  —0.2. 

EXAMPLE  2 Power  Method  with  Spectral  Shift 

For  A — 0.21  with  A as  in  Example  1 we  obtain  the  following  substantial  improvements  (where  the  index  1 
refers  to  Example  1 and  the  index  2 to  the  present  example). 


1 2 5 10 


0.134743 

0.038887 

0.004499 

0.000141 

0.134743 

0.034474 

0.000693 

1.8  ■ 10~6 

0.036667 

0.003952 

0.000056 

5 • 10“8 

0.036667 

0.002477 

o 

1 

05 

VO 

O 

1 

to 

■ 

PROBLEMSET20 . 8 
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POWER  METHOD  WITHOUT  SCALING 


Apply  the  power  method  without  scaling  (3  steps),  using 
x0  = [1,  l]Tor[l  1 1]T.  Give  Rayleigh  quotients  and 

error  bounds.  Show  the  details  of  your  work. 


9 4 

7 -3" 

2. 

_4  3 

-3  -1 

2 

-1 

1 

3.6 

-1.8 

1.8~ 

-1 

3 

2 

4. 

-1.8 

2.8 

—2.6 

1 

2 

3 

1.8 

-2.6 

2.8 

5-8 


POWER  METHOD  WITH  SCALING 


1. 


Apply  the  power  method  (3  steps)  with  scaling,  using 
x0  = [1  1 1]T  or  [1  1 1 1]T,  as  applicable.  Give 
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Rayleigh  quotients  and  error  bounds.  Show  the  details  of 
your  work. 

5.  The  matrix  in  Prob.  3 


6. 


12.  CAS  EXPERIMENT.  Power  Method  with 
Scaling.  Shifting,  (a)  Write  a program  for  n X n 
matrices  that  prints  every  step.  Apply  it  to  the 


7. 


(nonsymmetric!) 

matrix 

(20 

steps), 

starting  from 

4 

2 

3" 

[1  1 1]T. 

2 

7 

6 

15 

12 

3’ 

3 

6 

4 

A = 

18 

44 

18 

~5 

1 

0 

0’ 

-19 

-36 

-7 

1 

3 

1 

0 

(b)  Experiment  in 

(a)  with  shifting.  Which  shift  do  you 

0 

1 

3 

1 

find  optimal? 

8. 


(c)  Write  a program  as  in  (a)  but  for  symmetric  matrices 
that  prints  vectors,  scaled  vectors,  q,  and  8.  Apply  it  to 
the  matrix  in  Prob.  8. 


(d).  Optimality  of  8.  Consider  A = 


0.6 

0.8 


0.8 

~0.6 


and 


take  x0  = 


3 

-1 


. Show  that  q = 0,  8 = 1 for  all  steps 


9.  Prove  that  if  x is  an  eigenvector,  then  8 = 0 in  (2). 
Give  two  examples. 

10.  Rayleigh  quotient.  Why  does  q generally  approximate 
the  eigenvalue  of  greatest  absolute  value?  When  will 
q be  a good  approximation? 

11.  Spectral  shift,  smallest  eigenvalue.  In  Prob.  3 set 

B = A — 31  (as  perhaps  suggested  by  the  diagonal 
entries)  and  see  whether  you  may  get  a sequence  of  q’s 
converging  to  an  eigenvalue  of  A that  is  smallest  (not 
largest)  in  absolute  value.  Use  x0  = [1  1 1]T.  Do 

8 steps.  Verify  that  A has  the  spectrum  {0,  3,  5). 


and  the  eigenvalues  are  ±1,  so  that  the  interval 
[q  — 8,  q + 5]  cannot  be  shortened  (by  omitting  ±1) 
without  losing  the  inclusion  property.  Experiment  with 
other  x0’s. 

(e)  Find  a (nonsymmetric)  matrix  for  which  8 in  (2)  is 
no  longer  an  error  bound. 

(f)  Experiment  systematically  with  speed  of  conver- 
gence by  choosing  matrices  with  the  second  greatest 
eigenvalue  (i)  almost  equal  to  the  greatest,  (ii)  some- 
what different,  (iii)  much  different. 


20.«  Tridiagonalization  and  QR-Factorization 

We  consider  the  problem  of  computing  all  the  eigenvalues  of  a real  symmetric  matrix 
A = [fljfc],  discussing  a method  widely  used  in  practice.  In  the  first  stage  we  reduce  the 
given  matrix  stepwise  to  a tridiagonal  matrix,  that  is,  a matrix  having  all  its  nonzero 
entries  on  the  main  diagonal  and  in  the  positions  immediately  adjacent  to  the  main  diagonal 
(such  as  A3  in  Fig.  450,  Third  Step).  This  reduction  was  invented  by  A.  S.  Householder11 
(J.  Assn.  Comput.  Machinery  5 (1958),  335-342).  See  also  Ref.  [E29]  in  App.  1. 

This  Householder  tridiagonalization  will  simplify  the  matrix  without  changing  its 
eigenvalues.  The  latter  will  then  be  determined  (approximately)  by  factoring  the  tridiago- 
nalized  matrix,  as  discussed  later  in  this  section. 


"ALSTON  SCOTT  HOUSEHOLDER  (1904-1993),  American  mathematician,  known  for  his  work  in 
numerical  analysis  and  mathematical  biology.  He  was  head  of  the  mathematics  division  at  Oakridge  National 
Laboratory  and  later  professor  at  the  University  of  Tennessee.  He  was  both  president  of  ACM  (Association  for 
Computing  Machinery)  1954-1956  and  SIAM  (Society  for  Industrial  and  Applied  Mathematics)  1963-1964. 
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Householder’s  Tridiagonalization  Method11 

An  n X n real  symmetric  matrix  A = \a,jp-\  being  given,  we  reduce  it  by  n — 2 successive 
similarity  transformations  (see  Sec.  20.6)  involving  matrices  P1;  • • • , Pn_2  to  tridiagonal 
form.  These  matrices  are  orthogonal  and  symmetric.  Thus  P{" 1 = P J = P,  and  similarly 
for  the  others.  These  transformations  produce,  from  the  given  A o = A = [ajk],  the  matrices 
Ai  = [djkl  A2  = [a™  ],  • ' ‘ , An_2  = [ajfc_2)]  in  the  form 

Ai  = PiA0Pi 
A2  = I*2AiP2 

(1) 


® A 77, — 2 P/l. —2  A n— l-jP/i. — 2- 

The  transformations  (1)  create  the  necessary  zeros,  in  the  first  step  in  Row  1 and  Column  1, 
in  the  second  step  in  Row  2 and  Column  2,  etc.,  as  Fig.  450  illustrates  for  a 5 X 5 matrix. 
B is  tridiagonal. 


* * 

* * 

* * * * * 

* * * 

* * * * 

* * * * 

* * * 

* * * * 

* * * 

* * * 

First  Step 


Second  Step 


Third  Step 

^3  = 


Fig.  450.  Householder’s  method  for  a 5 X 5 matrix. 
Positions  left  blank  are  zeros  created  by  the  method. 


How  do  we  determine  P1;  P2,  • • • , Pn_2?  Now,  all  these  P,.  are  of  the  form 
(2)  Pr  = I — 2vrvJ  (r  =!,■■•,  7i-  2) 


where  I is  the  n X n unit  matrix  and  vr  = [ vjr  ] is  a unit  vector  with  its  first  r components 
0;  thus 


0 

0 

0 

* 

0 

0 

* 

II 

i? 

* 

II 

<N 

1 

* 

* 

* 

* 

where  the  asterisks  denote  the  other  components  (which  will  be  nonzero  in  general). 
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EXAMPLE  1 


Step  1.  Vi  has  the  components 


(4) 


un  = 0 


if 

a21  \ 

(a) 

^21  — 

\M‘+' 

sj 

flji  sgn  a2i 

(b) 

Vjl  = 

2^21^1 

j = 3,4,-- 

• , n 

where 

(c)  5!  - Va|i  + all  + ~ + a-ni 


where  Si  > 0,  and  sgn  a2 1 = + 1 if  ei2i  = 0 and  sgn  a2 1 = — 1 if  a21  < 0-  With  this  we 
compute  Pi  by  (2)  and  then  A | by  (1).  This  was  the  first  step. 

Step  2.  We  compute  v2  by  (4)  with  all  subscripts  increased  by  1 and  the  a.jp  replaced 
by  ajk,  the  entries  of  Ai  just  computed.  Thus  [see  also  (3)] 


(4*) 


^12  v22  ~ 0 


V32  — 


Vj2 


cij2  sgn  a 32 
2v32S2 


j = 4,  5,  ■ ■ ■ , n 


where 


S2  - + a22  + ■ ■ • + aCn2  ■ 

With  this  we  compute  P2  by  (2)  and  then  A2  by  (1). 

Step  3.  We  compute  V3  by  (4*)  with  all  subscripts  increased  by  1 and  the  ajjj  replaced 
by  the  entries  of  A2,  and  so  on. 


Householder  Tridiagonalization 

Tridiagonalize  the  real  symmetric  matrix 


6 4 11 


A = A0 


4 6 11 

115  2 


112  5 


Solution.  Step  1.  We  compute  S2  = 42  + l2  + l2  = 1 8 from  (4c).  Since  a2i  = 4 > 0,  we  have  sgn  a2i  = +1 
in  (4b)  and  get  from  (4)  by  straightforward  computation 
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0 

"0 

^21 

0.98559856 

^31 

0.11957316 

U4I 

0.11957316 

From  this  and  (2), 


0 

-0.94280904 

-0.23570227 

-0.23570227 


0 

-0.23570227 

0.97140452 

-0.02859548 


0 

-0.23570227 

-0.02859548 

0.97140452 


From  the  first  line  in  (1)  we  now  get 


6 -Vl8  0 0 


Ai  — PiAnPi  — 


Step  2.  From  (4*)  we  compute  S2  — 2 and 


-Vl8  7 

0 -1 

0 -1 


-1  -1 


9 3 

2 2 

3 9 

2 2 


From  this  and  (2), 


0 

0 

0 

0 

v2  = 

= 

^32 

0.92387953 

^42 

0.38268343 

’l 

0 

0 

0 

0 

1 

0 

0 

0 

0 

- 

1/V2  -1/V2 

0 

0 

- 

1/V2  -1/V2 

The  second  line  in  (1)  now  gives 


O2  — A2  — P2A  1 1*2  — 


6 

-VI8 

0 

0 


-Vl8 

7 

V2 

0 


0 

V2 

6 

0 


This  matrix  B is  tridiagonal.  Since  our  given  matrix  has  order  n = 4.  we  needed  n — 2 = 2 steps  to  accomplish 
this  reduction,  as  claimed.  (Do  you  see  that  we  got  more  zeros  than  we  can  expect  in  general?) 

B is  similar  to  A,  as  we  now  show  in  general.  This  is  essential  because  B thus  has  the  same  spectrum  as  A, 
by  Theorem  2 in  Sec.  20.6.  I 


B Similar  to  A.  We  assert  that  B in  ( 1)  is  similar  to  A = Ao-  The  matrix  Pr  is  symmetric; 
indeed, 

PrT  = (I  - 2vrv,T)T  = IT  - 2(vrv/)T  = I - 2yr\r  = Pr 
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Also,  Pr  is  orthogonal  because  \r  is  a unit  vector,  so  that  v,Tvr  = 1 and  thus 

PrPrT  = Pr2  = (I  - 2vrvrT)2  = I - 4vrvrT  + 4vrvrTvrvrT 

= I — 4vrvrT  + 4vr(vrTvr)vrT  = I. 

Hence  1*7 1 = PrT  = Pr  and  from  (1)  we  now  obtain 

® = ^n-2^-n-3Pn-2  = 

‘ ‘ = ®n-2Pn-3  ' ’ ' PlAPi  ■ ' ' Pn_3Pn_2 

= Pni2Pn-3'”Pl  1AP1  • • • Pm_3Pn_2 
= P_1AP 

where  P = Pj  P2  • ■ • Pn_2.  This  proves  our  assertion. 

QR-Factorization  Method 

In  1958  H.  Rutishauser12  of  Switzerland  proposed  the  idea  of  using  the  LU-factorization 
(Sec.  20.2;  he  called  it  LR-factorization)  in  solving  eigenvalue  problems.  An  improved 
version  of  Rutishauser’ s method  (avoiding  breakdown  if  certain  submatrices  become 
singular,  etc.;  see  Ref.  [E29])  is  the  QR-method,  independently  proposed  by  the  American 
J.  G.  F.  Francis  ( Computer  J.  4 (1961-62),  265-271,  332-345)  and  the  Russian  V.  N. 
Kublanovskaya  ( Zhurnal  Vych.  Mat.  i Mat.  Fiz.  1 (1961),  555-570).  The  QR-method  uses 
the  factorization  QR  with  orthogonal  Q and  upper  triangular  R.  We  discuss  the  QR  -method 
for  a real  symmetric  matrix.  (For  extensions  to  general  matrices  see  Ref.  [E29]  in  App.  1.) 

In  this  method  we  first  transform  a given  real  symmetric  n X n matrix  A into  a 
tridiagonal  matrix  B0  = B by  Householder’s  method.  This  creates  many  zeros  and  thus 
reduces  the  amount  of  further  work.  Then  we  compute  B^,  B2,  • • ■ stepwise  according  to 
the  following  iteration  method. 

Step  1.  Factor  Bo  = QqRo  with  orthogonal  Qo  and  upper  triangular  Ro-  Then  compute 

®i  = RoQo- 

Step  2.  Factor  Bj  = QiRi-  Then  compute  B2  = RiQi- 
General  Step  s + 1. 


(5) 


(a)  Factor  Bs  = QSRS. 

(b)  Compute  Bs+i  = RSQS. 


Here  Qs  is  orthogonal  and  Rs  upper  triangular.  The  factorization  (5a)  will  be  explained 
below. 

Bs+1  Similar  to  B.  Convergence  to  a Diagonal  Matrix.  From  (5a)  we  have  Rs  = Q “ 1 Bs. 

Substitution  into  (5b)  gives 

(6)  Bs+1  = RSQS  = QJ1BSQS. 


12HEINZ  RUTISHAUSER  (1918-1970).  Swiss  mathematician,  professor  at  ETH  Zurich.  Known  for  his 
pioneering  work  in  numerics  and  computer  science. 
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Thus  Bs+1  is  similar  to  Bs.  Hence  Bs+1  is  similar  to  B0  = B for  all  ,v.  By  Theorem  2,  Sec. 
20.6,  this  implies  that  Bs+1  has  the  same  eigenvalues  as  B. 

Also,  Bs+1  is  symmetric.  This  follows  by  induction.  Indeed,  B0  = B is  symmetric. 
Assuming  Bs  to  be  symmetric,  that  is,  BST  = Bs,  and  using  QJ1  = QST  (since  Qs  is 
orthogonal),  we  get  from  (6)  the  symmetry, 

Bs+iT  = (QstbsQs)t  = QsVQs  = QsTBsQs  = Bs+1. 

If  the  eigenvalues  of  B are  different  in  absolute  value,  say,  lAj  > | A2|  > ■ • ■ > | A,,  | , 
then 


lim  Bs  = D 

.S' — > cc 

where  D is  diagonal,  with  main  diagonal  entries  Ai,  A2,  • • ■ , An.  (Proof  in  Ref.  [E29]  listed 
in  App.  1.) 

How  to  Get  the  QR-Factorization,  say,  B = B0  = [/;);£]  = Q0Ro- The  tri diagonal  matrix 
B has  n — 1 generally  nonzero  entries  below  the  main  diagonal.  These  are 
/?2i,  /?32,  • ■ ■ , bn  n_  | . We  multiply  B from  the  left  by  a matrix  C2  such  that  C2B  = [bffl] 
has  b(2\  = 0.  We  multiply  this  by  a matrix  C3  such  that  C3C2B  = [bjf:’]  has  b 32  = 0, 
etc.  After  n — 1 such  multiplications  we  are  left  with  an  upper  triangular  matrix  R0, 
namely, 

(7)  CnCn_i  ■ ■ • C3C2B0  = Ro 


These  n X n matrices  C j are  very  simple.  C j has  the  2X2  submatrix 


cos  0j  sin  0j 

— sin  0j  cos  0j 


(6j  suitable) 


in  Rows  j — 1 and  j and  Columns  j — 1 and  /;  everywhere  else  on  the  main  diagonal  the 
matrix  C j has  entries  1 ; and  all  its  other  entries  are  0.  (This  submatrix  is  the  matrix  of  a 
plane  rotation  through  the  angle  Of,  see  Team  Project  30,  Sec.  7.2.)  For  instance,  if  n = 4, 
writing  Cj  = cos  0j,  Sj  = sin  0j,  we  have 


C2 

s2 

0 

()" 

"1 

0 

0 

o” 

"1 

0 

0 

0" 

C2 

0 

0 

, C3  = 

0 

C3 

S3 

0 

, c4  = 

0 

1 

0 

0 

0 

0 

1 

0 

0 

-S3 

C3 

0 

0 

0 

c4 

S4 

0 

0 

0 

1 

0 

0 

0 

1 

0 

0 

-s4 

c4_ 

These  Cj  are  orthogonal.  Hence  their  product  in  (7)  is  orthogonal,  and  so  is  the  inverse 
of  this  product.  We  call  this  inverse  Q0.  Then  from  (7), 


(8)  B0  — Q0R0 

where,  with  Cj-1  = C jT, 

Qo  = (CnCn_!  • • ■ C3C2)-1  = C2tC3t  ■ • • Cn^TCnT. 


(9) 
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EXAMPLE  2 


This  is  our  QR-factorization  of  B0.  From  it  we  have  by  (5b)  with  s = 0 
(10)  B!  = R0Qo  = R0C2TC3t  ■ ■ ■ Cn_  1TCnT. 

We  do  not  need  Q0  explicitly,  but  to  get  Bj  from  (10),  we  first  compute  R0C2T,then 
(RoCj)Cj,  etc.  Similarly  in  the  further  steps  that  produce  B2,  B3, 

Determination  of  cos  0j  and  sin  0j.  We  finally  show  how  to  find  the  angles  of  rotation, 
cos  02  and  sin  02  in  C2  must  be  such  that  fe22)  = 0 in  the  product 


c2 

s2 

0 

fen 

b\2 

fel3 

-S2 

C2 

0 

b2\ 

b22 

b23 

Now  b ‘21  is  obtained  by  multiplying  the  second  row  of  C2  by  the  first  column  of  B, 
b 22i  = -s2*ii  + c2b2i  = — (sin  02)bn  + (cos  02)b21  = 0. 


Hence  tan  02  = s2/c2  = h2\jb\ and 


(ID 


cos  02  = 


sin  02  = 


1 


Vl  + tan2  02  Vl  + (W^n)2 


tan  02 


b2\!  fen 


vT  + tan2  e2  Vi  + (fe2i/fen)2' 
Similarly  for  03,  04,  ■ • • . The  next  example  illustrates  all  this. 


QR-Factorization  Method 


Compute  all  the  eigenvalues  of  the  matrix 

_6  4 1 l" 

4 6 11 

A = 

115  2 

112  5 


Solution.  We  first  reduce  A to  tridiagonal  form.  Applying  Householder’s  method,  we  obtain  (see  Example  1) 


6 — Vl8  0 0 


A2  — 


-VT8 

0 


7 

V2 


0 0 


V2  0 
6 0 
0 3 
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From  the  characteristic  determinant  we  see  that  A2,  hence  A,  has  the  eigenvalue  3.  (Can  you  see  this  directly 
from  A2?)  Hence  it  suffices  to  apply  the  QR-method  to  the  tridiagonal  3X3  matrix 


6 — Vl8  0 


B„  = B = 


-Vl8 

0 


7 V2 
V2  6 


Step  1.  We  multiply  B from  the  left  by 


cos  02 

sin  62 

o- 

1 

0 

0 

—sin  02 

cos  62 

0 

and  then  C2B  by 

C3  = 

0 

cos  03 

sin  02 

0 

0 

1 

0 

—sin  03 

COS  02 

Here  (—sin  02)  • 6 + (cos  02)(  — V I 8)  = 0 gives  (11)  cos  02  = 0.81649658  and  sin  02  = —0.57735027.  With 
these  values  we  compute 


C2B  = 


7.34846923 

0 

0 


-7.50555350 

3.26598632 

1.41421356 


-0.81649658 

1.15470054 

6.00000000 


In  C3  we  get  from  (—sin  03)  ■ 3.26598632  + (cos  03)  ■ 1.41421356  = 0 the  values  cos  03  = 0.91766294  and 
sin  03  = 0.39735971.  This  gives 


7.34846923 


R0  = C3C2B  = 0 


0 


-7.50555350 

3.55902608 

0 


-0.81649658 

3.44378413 

5.04714615 


From  this  we  compute 


Bi 


RoC2tC3t  = 


10.33333333 

-2.05480467 

0 


-2.05480467 

4.03508772 

2.00553251 


0 

2.00553251 

4.63157895 


which  is  symmetric  and  tridiagonal.  The  off-diagonal  entries  in  Bi  are  still  large  in  absolute  value.  Hence  we 
have  to  go  on. 

Step  2.  We  do  the  same  computations  as  in  the  first  step,  with  B0  = B replaced  by  Bi  and  C2  and  C3  changed 
accordingly,  the  new  angles  being  02  = —0.196291533  and  03  = 0.513415589.  We  obtain 


and  from  this 


10.53565375 

0 

_ 0 

10.87987988 

-0.79637918 

0 


-2.80232241 

4.08329584 

0 

-0.79637918 

5.44738664 

1.50702500 


-0.39114588 

3.98824028 

3.06832668 

0 

1.50702500 

2.67273348 


We  see  that  the  off-diagonal  entries  are  somewhat  smaller  in  absolute  value  than  those  of  B] . but  still  much  too 
large  for  the  diagonal  entries  to  be  good  approximations  of  the  eigenvalues  of  B. 
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Further  Steps.  We  list  the  main  diagonal  entries  and  the  absolutely  largest  off-diagonal  entry,  which  is 
\b(i'2\  = b 21'  in  all  steps.  You  may  show  that  the  given  matrix  A has  the  spectrum  1 1,  6,  3,  2. 


Stepy 

b<g 

b% 

maxjV)c  \b)k\ 

3 

10.9668929 

5.94589856 

2.08720851 

0.58523582 

5 

10.9970872 

6.00181541 

2.00109738 

0.12065334 

7 

10.9997421 

6.00024439 

2.00001355 

0.03591107 

9 

10.9999772 

6.00002267 

2.00000017 

0.01068477 

Looking  back  at  our  discussion,  we  recognize  that  the  purpose  of  applying  Householder’s 
tridiagonalization  before  the  QR-factorization  method  is  a substantial  reduction  of  cost  in 
each  QR-factorization,  in  particular  if  A is  large. 

Convergence  acceleration  and  thus  further  reduction  of  cost  can  be  achieved  by  a 
spectral  shift,  that  is,  by  taking  Bs  — ksI  instead  of  Bs  with  a suitable  ks.  Possible  choices 
of  ks  are  discussed  in  Ref.  [E29],  p.  510. 


PROBLEM  SET  2 P V9 


1-5 


HOUSEHOLDER  TRIDIAGONALIZATION 


Tridiagonalize.  Show  the  details. 


0.98 

0.04 

0.44 

0.04 

0.56 

0.40 

0.44 

0.40 

0.80 

0 

1 

1 

1 

0 

1 

1 

1 

0 

3 

52 

10 

42 

52 

59 

44 

80 

10 

44 

39 

42 

42 

80 

42 

35 

6-9 


QR-FACTORIZATION 


Do  three  QR-steps  to  find  approximations  of  the  eigen- 
values of: 

6.  The  matrix  in  the  answer  to  Prob.  1 

7.  The  matrix  in  the  answer  to  Prob.  3 


3. 


7 

2 

3 


2 3 

10  6 

6 7 


4. 


5 

4 

1 

1 


4 

5 

1 

1 


1 

1 

4 

2 


1 

1 

2 

4 


14.2 

-0.1 

0 " 

140 

10 

o' 

8. 

-0.1 

-6.3 

0.2 

9. 

10 

70 

2 

0 

0.2 

2.1 

0 

2 

-30 

10.  CAS  EXPERIMENT.  QR-Method.  Try  to  find  out 
experimentally  on  what  properties  of  a matrix  the  speed 
of  decrease  of  off-diagonal  entries  in  the  QR-method 
depends.  For  this  purpose  write  a program  that  first 
tridiagonalizes  and  then  does  QR-steps.  Try  the 
program  out  on  the  matrices  in  Probs.  1,  3,  and  4. 

Summarize  your  findings  in  a short  report. 


CHA  FT  ER2  0 REVJ  EW~QtJ  E S T I O N S AND  PROBLEMS 


1.  What  are  the  main  problem  areas  in  numeric  linear 
algebra? 

2.  When  would  you  apply  Gauss  elimination  and  when 
Gauss-Seidel  iteration? 


3.  What  is  pivoting?  Why  and  how  is  it  done? 

4.  What  happens  if  you  apply  Gauss  elimination  to  a 
system  that  has  no  solutions? 

5.  What  is  Cholesky’s  method?  When  would  you  apply  it? 
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6.  What  do  you  know  about  the  convergence  of  the 
Gauss-Seidel  iteration? 

7.  What  is  ill-conditioning?  What  is  the  condition  number 
and  its  significance? 

8.  Explain  the  idea  of  least  squares  approximation. 

9.  What  are  eigenvalues  of  a matrix?  Why  are  they 
important?  Give  typical  examples. 

10.  How  did  we  use  similarity  transformations  of  matrices 
in  designing  numeric  methods? 

11.  What  is  the  power  method  for  eigenvalues?  What  are 
its  advantages  and  disadvantages? 

12.  State  Gerschgorin's  theorem  from  memory.  Give  typical 
applications. 

13.  What  is  tridiagonalization  and  QR?  When  would  you 
apply  it? 


20. 


21-23 


GAUSS-SEIDEL  ITERATION 


Do  3 steps  without  scaling,  starting  from  [1 
21.  4xi  — x2  = 22.0 
4*2  - *3  = 13.4 


— *i  + 4*3  = —2.4 


22.  0.2*i  + 4.0*2  — O.4.X3  = 32.0 

0.5*i  “ 0.2*2  + 2.5*3  = “5.1 


14-17 

Solve 


GAUSS  ELIMINATION 


14.  3*2  — 6*3  = 0 

4*i  — *2  + 2*3  = 16 


7.5*i  + 0.1*2  — 1-5*3  = —12.7 
23.  10*i  + *2  — *3  = 17 

2*1  + 20*2+  *3=  28 

3*i  — *2  + 25*3  = 105 


If. 


— 5*i  + 2*2  — 4*3  = —20 

15.  8*2  — 6*3  = 23.6 
10*i  + 6*2  + 2*3  = 68.4 
12*i  “ 14*2  + 4*3  = —6.2 

16.  5*i  + *2  — 3*3  = 17 

— 5*2  + 15*3  = — 10 
2*i  — 3*2  + 9*3  = 0 

17.  42*i  + 74*2  + 36*3  = 96 
—46*i  — 12*2  — 2*3  = 82 

3*i  + 25*2  + 5*3  =19 


18-20 


INVERSE  MATRIX 


Compute  the  inverse  of: 


2.0 

0.1 

3.3 

18. 

1.6 

4.4 

0.5 

0.3 

-4.3 

2.8 

15 

20 

10~ 

19. 

20 

35 

15 

10 

15 

90 

24-26 


VECTOR  NORMS 


Compute  the  £\-,  €2--  and  £ ^-norms  of  the  vectors. 

24.  [0.2  -8.1  0.4  0 0 -1.3  2]T 

25.  [8  -21  13  Of 

26.  [0  0 0 -1  Of 


27-30 


MATRIX  NORM 


Compute  the  matrix  norm  corresponding  to  the  €0O-vector 
norm  for  the  coefficient  matrix: 

27.  In  Prob.  15 

28.  In  Prob.  17 

29.  In  Prob.  21 

30.  In  Prob.  22 


31-33 


CONDITION  NUMBER 


Compute  the  condition  number  (corresponding  to  the 
f^-vector  norm)  of  the  coefficient  matrix: 

31.  In  Prob.  19 

32.  In  Prob.  18 

33.  In  Prob.  21 


34-35 


FITTING  BY  LEAST  SQUARES 


Fit  and  graph: 

34.  A straight  line  to  (-1,0),  (0,2),  (1,2), 
(3,3) 

35.  A quadratic  parabola  to  the  data  in  Prob.  34. 


(2,  3), 
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EIGENVALUES 

Find  and  graph  three  circular  disks  that  must  contain  all  the 
eigenvalues  of  the  matrix: 

36.  In  Prob.  18 

37.  In  Prob.  19 


38.  In  Prob.  20 

39.  Of  the  coefficients  in  Prob.  14 

40.  Power  method.  Do  4 steps  with  scaling  for  the  matrix 

in  Prob.  19,  starting  for  [1  1 1]  and  computing  the 

Rayliegh  quotients  and  error  bounds. 


5 UMMARYOr  CH  APTER-2  0 

Numeric  Linear  Algebra 


Main  tasks  are  the  numeric  solution  of  linear  systems  (Secs.  20.1-20.4),  curve  fitting 
(Sec.  20.5),  and  eigenvalue  problems  (Secs.  20.6-20.9). 

Linear  systems  Ax  = b with  A = , written  out 

Ef  011*1  + ' • • + a\nXn  = bi 

E2:  a2i*i  + • • • + a2nxn  = b2 

(1) 


E n . flnl*  1 4"  * Arm*  n bn 

can  be  solved  by  a direct  method  (one  in  which  the  number  of  numeric  operations 
can  be  specified  in  advance,  e.g..  Gauss’s  elimination)  or  by  an  indirect  or  iterative 
method  (in  which  an  initial  approximation  is  improved  stepwise). 

The  Gauss  elimination  (Sec.  20.1)  is  direct,  namely,  a systematic  elimination 
process  that  reduces  (1)  stepwise  to  triangular  form.  In  Step  1 we  eliminate  x\  from 
equations  E2  to  En  by  subtracting  (fl2i/fln)Ei  from  E2,  then  (a3i/fln)Ei  from 
E3,  etc.  Equation  Ei  is  called  the  pivot  equation  in  this  step  and  «n  the  pivot.  In 
Step  2 we  take  the  new  second  equation  as  pivot  equation  and  eliminate  ;t2,  etc.  If 
the  triangular  form  is  reached,  we  get  xn  from  the  last  equation,  then  *K_i  from 
the  second  last,  etc.  Partial  pivoting  (=  interchange  of  equations)  is  necessary  if 
candidates  for  pivots  are  zero,  and  advisable  if  they  are  small  in  absolute  value. 

Doolittle’s,  Crout’s,  and  Cholesky’s  methods  in  Sec.  20.2  are  variants  of  the 
Gauss  elimination.  They  factor  A = LU  (L  lower  triangular,  U upper  triangular) 
and  solve  Ax  = LUx  = b by  solving  Ly  = b for  y and  then  Ux  = y for  x. 

In  the  Gauss-Seidel  iteration  (Sec.  20.3)  we  make  an  = a22  = • • • = ann  = 1 
(by  division)  and  write  Ax  = (I  + L + U)x  = b;  thus  x = b — (L  + U)x,  which 
suggests  the  iteration  formula 

(2)  x(m+1)  = b - Lx(m+1)  - Ux(m) 

in  which  we  always  take  the  most  recent  approximate  xfs  on  the  right.  If  ||C||  < 1, 
where  C = —(I  + L)-1U,  then  this  process  converges.  Here,  ||C||  denotes  any 
matrix  norm  (Sec.  20.3). 
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If  the  condition  number  k(A)  = ||A||  ||A_1[|  of  A is  large,  then  the  system  Ax  = b 
is  ill-conditioned  (Sec.  20.4),  and  a small  residual  r = b — Ax  does  not  imply 
that  x is  close  to  the  exact  solution. 

The  fitting  of  a polynomial  p(x)  = b0  + b\X  + ■ ■ ■ + bmxm  through  given  data 
(points  in  the  xy-plane)  (xq,  v i ) , • ■ ■ , (xn,  yn)  by  the  method  of  least  squares  is 
discussed  in  Sec.  20.5  (and  in  statistics  in  Sec.  25.9).  If  m = n,  the  least  squares 
polynomial  will  be  the  same  as  an  interpolating  polynomial  (uniqueness). 

Eigenvalues  A (values  A for  which  Ax  = Ax  has  a solution  x ¥=  0,  called  an 
eigenvector)  can  be  characterized  by  inequalities  (Sec.  20.7),  e.g.  in  Gerschgorin’s 
theorem,  which  gives  n circular  disks  which  contain  the  whole  spectrum  (all 
eigenvalues)  of  A,  of  centers  u:a  and  radii  X | (sum  over  k from  1 to  n,  k ¥=  j). 

Approximations  of  eigenvalues  can  be  obtained  by  iteration,  starting  from  an 
x0  ¥=  0 and  computing  Xj  = Ax0,  x2  = Axj,  ■ ■ • , xn  = Axn_j . In  this  power 
method  (Sec.  20.8)  the  Rayleigh  quotient 

(Ax)T)x 

(3)  q = (x  = xn) 

X X 

gives  an  approximation  of  an  eigenvalue  (usually  that  of  the  greatest  absolute  value) 
and,  if  A is  symmetric,  an  error  bound  is 


(4)  |e|  =§  /(Ax)TAx  - q2. 

V xTx 

Convergence  may  be  slow  but  can  be  improved  by  a spectral  shift. 

For  determining  all  the  eigenvalues  of  a symmetric  matrix  A it  is  best  to  first 
tridiagonalize  A and  then  to  apply  the  QR-method  (Sec.  20.9),  which  is  based  on  a 
factorization  A = QR  with  orthogonal  Q and  upper  triangular  R and  uses  similarity 
transformations. 


CHAPTER 


Numerics  for  ODEs  and  PDEs 


Ordinary  differential  equations  (ODEs)  and  partial  differential  equations  (PDEs)  play  a 
central  role  in  modeling  problems  of  engineering,  mathematics,  physics,  aeronautics, 
astronomy,  dynamics,  elasticity,  biology,  medicine,  chemistry,  environmental  science, 
economics,  and  many  other  areas.  Chapters  1-6  and  12  explained  the  major  approaches 
to  solving  ODEs  and  PDEs  analytically.  However,  in  your  career  as  an  engineer,  applied 
mathematicians,  or  physicist  you  will  encounter  ODEs  and  PDEs  that  cannot  be  solved 
by  those  analytic  methods  or  whose  solutions  are  so  difficult  that  other  approaches  are 
needed.  It  is  precisely  in  these  real-world  projects  that  numeric  methods  for  ODEs  and 
PDEs  are  used,  often  as  part  of  a software  package.  Indeed,  numeric  software  has  become 
an  indispensable  tool  for  the  engineer. 

This  chapter  is  evenly  divided  between  numerics  for  ODEs  and  numerics  for  PDEs. 
We  start  with  ODEs  and  discuss,  in  Sec.  21.1,  methods  for  first-order  ODEs.  The  main 
initial  idea  is  that  we  can  obtain  approximations  to  the  solution  of  such  an  ODE  at  points 
that  are  a distance  h apart  by  using  the  first  two  terms  of  Taylor’s  formula  from  calculus. 
We  use  these  approximations  to  construct  the  iteration  formula  for  a method  known  as 
Euler’s  method.  While  this  method  is  rather  unstable  and  of  little  practical  use,  it  serves 
as  a pedagogical  tool  and  a starting  point  toward  understanding  more  sophisticated  methods 
such  as  the  Runge-Kutta  method  and  its  variant  the  Runga-Kutta-Fehlberg  (RKF)  method, 
which  are  popular  and  useful  in  practice.  As  is  usual  in  mathematics,  one  tends  to 
generalize  mathematical  ideas.  The  methods  of  Sec.  21.1  are  one-step  methods,  that  is, 
the  current  approximation  uses  only  the  approximation  from  the  previous  step.  Multistep 
methods,  such  as  the  Adams-Bashforth  methods  and  Adams-Moulton  methods,  use  values 
computed  from  several  previous  steps.  We  conclude  numerics  for  ODEs  with  applying 
Runge-Kutta-Nystrom  methods  and  other  methods  to  higher  order  ODEs  and  systems  of 
ODEs. 

Numerics  for  PDEs  are  perhaps  even  more  exciting  and  ingenious  than  those  for  ODEs. 
We  first  consider  PDEs  of  the  elliptic  type  (Laplace,  Poisson).  Again,  Taylor’s  formula 
serves  as  a starting  point  and  lets  us  replace  partial  derivatives  by  difference  quotients. 
The  end  result  leads  to  a mesh  and  an  evaluation  scheme  that  uses  the  Gauss-Seidel 
method  (here  also  know  as  Liebmann’s  method).  We  continue  with  methods  that  use  grids 
to  solve  Neuman  and  mixed  problems  (Sec.  21.5)  and  conclude  with  the  important 
Crank-Nicholson  method  for  parabolic  PDEs  in  Sec.  21.6. 

Sections  21.1  and  21.2  may  be  studied  immediately  after  Chap.  1 and  Sec.  21.3 
immediately  after  Chaps.  2-4,  because  these  sections  are  independent  of  Chaps.  19  and  20. 

Sections  21.4-21.7  on  PDEs  may  be  studied  immediately  after  Chap.  12  if  students 
have  some  knowledge  of  linear  systems  of  algebraic  equations. 
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Prerequisite:  Secs.  1.1-1. 5 for  ODEs,  Secs.  12.1-12.3,  12.5,  12.10  for  PDEs. 
References  and  Answers  to  Problems:  App.  1 Part  E (see  also  Parts  A and  C),  App.  2. 
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21.1  Methods  for  First-Order  ODEs 

Take  a look  at  Sec.  1.2,  where  we  briefly  introduced  Euler’s  method  with  an  example. 
We  shall  develop  Euler’s  method  more  rigorously.  Pay  close  attention  to  the  derivation 
that  uses  Taylor’s  formula  from  calculus  to  approximate  the  solution  to  a first-order  ODE 
at  points  that  are  a distance  h apart.  If  you  understand  this  approach,  which  is  typical  for 
numerics  for  ODEs,  then  you  will  understand  other  methods  more  easily. 

From  Chap.  1 we  know  that  an  ODE  of  the  first  order  is  of  the  form  F(x,  y,y  ) = 0 
and  can  often  be  written  in  the  explicit  form  y = fix,  y).  An  initial  value  problem  for 
this  equation  is  of  the  form 

(1)  y' =f(x,y),  y(x0)  = y0 

where  x0  and  yo  are  given  and  we  assume  that  the  problem  has  a unique  solution  on  some 
open  interval  a < x < b containing  xo- 

In  this  section  we  shall  discuss  methods  of  computing  approximate  numeric  values  of 
the  solution  y(x)  of  (1)  at  the  equidistant  points  on  the  x-axis 


xi  = xo  + h,  x2  = xo  + 2 h,  X3  = xq  + 3 h. 


where  the  step  size  h is  a fixed  number,  for  instance,  0.2  or  0.1  or  0.01,  whose  choice  we 
discuss  later  in  this  section.  Those  methods  are  step-by-step  methods,  using  the  same 
formula  in  each  step.  Such  formulas  are  suggested  by  the  Taylor  series 

h2 

(2)  y(x  + h)  = y(x)  + hy'(x)  + — y"(x)  + • • • . 

Formula  (2)  is  the  key  idea  that  lets  us  develop  Euler’ s method  and  its  variant  called — 
you  guessed  it — improved  Euler  method,  also  known  as  Heim ’s  method.  Let  us  start  by 
deriving  Euler’s  method. 

For  small  h the  higher  powers  h2,  li’\  ■ ■ ■ in  (2)  are  very  small.  Dropping  all  of  them 
gives  the  crude  approximation 


y(x  + /?)  ~ y(x)  + hy\x) 

= y{x)  + hf(x,  y) 

and  the  corresponding  Euler  method  (or  Euler-Cauchy  method) 

(3)  yn+i  = yn  + hf(xn,  yn ) («  = 0, 1,  ■ ■ • ) 

discussed  in  Sec.  1.2.  Geometrically,  this  is  an  approximation  of  the  curve  of  y(x)  by  a 
polygon  whose  first  side  is  tangent  to  this  curve  at  xo  (see  Fig.  8 in  Sec.  1.2). 

Error  of  the  Euler  Method.  Recall  from  calculus  that  Taylor’s  formula  with 
remainder  has  the  form 


y(x  + h)  = y(x)  + hy'(x)  + \ h2y"(g) 
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(where  x £ ( i r + h).  It  shows  that,  in  the  Euler  method,  the  truncation  error  in  each 


(see  also  Sec.  20.1).  Now,  over  a fixed  x-interval  in  which  we  want  to  solve  an  ODE,  the 
number  of  steps  is  proportional  to  1 //?.  Hence  the  total  error  or  global  error  is  proportional 


addition,  there  are  roundoff  errors  in  this  and  other  methods,  which  may  affect  the 
accuracy  of  the  values  ylt  y2,  • • ■ more  and  more  as  n increases. 

Automatic  Variable  Step  Size  Selection  in  Modern  Software.  The  idea  of 
adaptive  integration,  as  motivated  and  explained  in  Sec.  19.5,  applies  equally  well  to  the 
numeric  solution  of  ODEs.  It  now  concerns  automatically  changing  the  step  size  h depending 
on  the  variability  of  y = f determined  by 


Accordingly,  modern  software  automatically  selects  variable  step  sizes  hn  so  that  the  error 
of  the  solution  will  not  exceed  a given  maximum  size  TOL  (suggesting  tolerance).  Now  for 


corresponds  to  maximum  h = H = V2  TOL/ K by  (4).  Thus,  V2  TOL  = H\rK.  We  can 
insert  this  into  (4b),  obtaining  by  straightforward  algebra 


For  other  methods,  automatic  step  size  selection  is  based  on  the  same  principle. 

Improved  Euler  Method.  Predictor,  Corrector.  Euler’s  method  is  generally  much 
too  inaccurate.  For  a large  h (0.2)  this  is  illustrated  in  Sec.  1.2  by  the  computation  for 


And  for  small  h the  computation  becomes  prohibitive;  also,  roundoff  in  so  many  steps 
may  result  in  meaningless  results.  Clearly,  methods  of  higher  order  and  precision  are 
obtained  by  taking  more  terms  in  (2)  into  account.  But  this  involves  an  important  practical 
problem.  Namely,  if  we  substitute  y = f(x,  y(x ))  into  (2),  we  have 


Now  y in/depends  on  x,  so  that  we  have/7  as  shown  in  (4*)  and  f",  f"  even  much  more 
cumbersome.  The  general  strategy  now  is  to  avoid  the  computation  of  these  derivatives 
and  to  replace  it  by  computing  / for  one  or  several  suitably  chosen  auxiliary  values  of 
(x,  y).  “Suitably”  means  that  these  values  are  chosen  to  make  the  order  of  the  method  as 


step  or  local  truncation  error  is  proportional  to  h2,  written  0(h2),  where  O suggests  order 


to  h2{\/h)  = h\  For  this  reason,  the  Euler  method  is  called  a first-order  method.  In 


(4*) 


n 


f =fx+  fyy'  =fx+  fyf 


the  Euler  method,  when  the  step  size  is  h = hn,  the  local  error  at  xn  is  about  \y"(in)\ ■ 
We  require  that  this  be  equal  to  a given  tolerance  TOL, 


(4)  (a)  \h2n \y"{U)\  = TOL,  thus  (b) 


y"(x)  must  not  be  zero  on  the  interval  /:  x0  = x = x N on  which  the  solution  is  wanted. 
Let  K be  the  minimum  of  y "(x)  on  J and  assume  that  K > 0.  Minimum  y”(x) 


(5) 


(6) 


y = y + x,  v(0)  = 0. 


(2*) 


y(x  + h)  = y(x)  + hf+  \h2f  + \li3f"  + ■ ■ • . 
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EXAMPLE  1 


high  as  possible  (to  have  high  accuracy).  Let  us  discuss  two  such  methods  that  are  of 
practical  importance,  namely,  the  improved  Euler  method  and  the  (classical)  Runge-Kutta 
method. 

In  each  step  of  the  improved  Euler  method  we  compute  two  values,  first  the  predictor 
(7a)  y*  + 1 = yn  + hf(xn,  yn), 

which  is  an  auxiliary  value,  and  then  the  new  y-value,  the  corrector 

(7b)  yn+i  = yn  + \h  [f(xn,  yn)  + f(xn+1,  y*t+i)]. 

Hence  the  improved  Euler  method  is  a predictor-corrector  method:  In  each  step  we  predict 
a value  (7a)  and  then  we  correct  it  by  (7b). 

In  algorithmic  form,  using  the  notations  k\  = hf(xn,  yn)  in  (7a)  and  k 2 = hf(xn+ 1, 
yXi+ 1)  in  (7b),  we  can  write  this  method  as  shown  in  Table  21.1. 

Table  21.1  Improved  Euler  Method  (Heun’s  Method) 


ALGORITHM  EULER  (/,  x0,  y„,  h,  N ) 

This  algorithm  computes  the  solution  of  the  initial  value  problem  y'  = /( x,  y),  yGo)  = Jo 
at  equidistant  points  xi  = xo  + h,  x2  = x0  + 2h,  ■ ■ • , xN  = x0  + Nh\  here  f is  such 
that  this  problem  has  a unique  solution  on  the  interval  [.r0, ;%]  (see  Sec.  1.6). 

INPUT:  Initial  values  x0,  y0,  step  size  h,  number  of  steps  N 

OUTPUT:  Approximation  yn+1  to  the  solution  y(xn  , U atxn+1  = x0  + (n  + I )/;, 
where  n = 0,  • • • , N — 1 

For  n = 0,  1,  ■ ■ • , N — 1 do: 

Xn+l  tcn  T h 
k\  = hf(xn,yn) 
k2  = hf(xn+ 1,  yn  + k 1) 
y-n+l  — yn  + 2(^1  + k2) 

OUTPUT  Xn+1,yn+i 

End 

Stop 

End  EULER 


Improved  Euler  Method.  Comparison  with  Euler  Method. 

Apply  the  improved  Euler  method  to  the  initial  value  problem  (6),  choosing  h = 0.2  as  in  Sec.  1.2. 
Solution.  For  the  present  problem  we  have  in  Table  21.1 

k\  0. 2(xn  + yn) 

k2  = 0.2(xn  + 0.2  + yn  + 0.2(xn  + yn )) 
yn+ 1 = yn  + ^-  ( 2.2xn  + 2.2yn  + 0.2)  = yn  + 0.22(xn  + yn)  + 0.02. 
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Table  21.2  shows  that  our  present  results  are  much  more  accurate  than  those  for  Euler’s  method  in  Table  21.1  but 
at  the  cost  of  more  computations. 


Table  21.2  Improved  Euler  Method  for  (6).  Errors 


n 

yn 

Exact  Values 
(4D) 

Error  of 
Improved  Euler 

Error  of 
Euler 

0 

0.0 

0.0000 

0.0000 

0.0000 

0.000 

1 

0.2 

0.0200 

0.0214 

0.0014 

0.021 

2 

0.4 

0.0884 

0.0918 

0.0034 

0.052 

3 

0.6 

0.2158 

0.2221 

0.0063 

0.094 

4 

0.8 

0.4153 

0.4255 

0.0102 

0.152 

5 

1.0 

0.7027 

0.7183 

0.0156 

0.230 

Error  of  the  Improved  Euler  Method.  The  local  error  is  of  order  h3  and  the  global 
error  of  order  h , so  that  the  method  is  a second-order  method. 

PROOF  Setting  fn  = f(xn,  y(xn))  and  using  (2*)  (after  (6)),  we  have 

(8a)  y{xn  + li)  - y(xn)  = hfn  + lh2f'n  + g/?3/^  + • ■ ■ . 

Approximating  the  expression  in  the  brackets  in  (7b)  by  fn  + fn+i  and  again  using  the 
Taylor  expansion,  we  obtain  from  (7b) 


(8b) 


yn+ 1 yn  2.h  [fn  fn+ 1] 

= Z h [fn  + (fn  + hfn  + kh2fn  + ’ ’ ’ )] 

= hfn  + \h2f'n  + \h3fn  + • ■ ■ 


(where  ' = d/dxn,  etc.).  Subtraction  of  (8b)  from  (8a)  gives  the  local  error 


/t3 

6 


f 

J r, 


IS 

12 


fn  + 


Since  the  number  of  steps  over  a fixed  x-interval  is  proportional  to  1 /h,  the  global  error 
is  of  order  h3/h  = h2,  so  that  the  method  is  of  second  order. 


Since  the  Euler  method  was  an  attractive  pedagogical  tool  to  teach  the  beginning  of 
solving  first-order  ODEs  numerically  but  had  its  drawbacks  in  terms  of  accuracy  and  could 
even  produce  wrong  answers,  we  studied  the  improved  Euler  method  and  thereby 
introduced  the  idea  of  a predictor-corrector  method.  Although  improved  Euler  is  better 
than  Euler,  there  are  better  methods  that  are  used  in  industrial  settings.  Thus  the  practicing 
engineer  has  to  know  about  the  Runga-Kutta  methods  and  its  variants. 


Runge-Kutta  Methods  (RK  Methods) 

A method  of  great  practical  importance  and  much  greater  accuracy  than  that  of  the 
improved  Euler  method  is  the  classical  Runge-Kutta  method  of  fourth  order , which  we 
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call  briefly  the  Runge-Kutta  method.1  It  is  shown  in  Table  21.3.  We  see  that  in  each 
step  we  first  compute  four  auxiliary  quantities  k\,  k2,  k3,  and  then  the  new  value  yn+ j . 
The  method  is  well  suited  to  the  computer  because  it  needs  no  special  starting  procedure, 
makes  light  demand  on  storage,  and  repeatedly  uses  the  same  straightforward  compu- 
tational procedure.  It  is  numerically  stable. 

Note  that,  if/ depends  only  on  x,  this  method  reduces  to  Simpson’s  rule  of  integration 
(Sec.  19.5).  Note  further  that  k\,  ■ ■ ■ , k$  depend  on  n and  generally  change  from  step 
to  step. 


Table  21.3  Classical  Runge-Kutta  Method  of  Fourth  Order 

ALGORITHM  RUNGE-KUTTA  (/,  x0,  y„,  h,  N). 

This  algorithm  computes  the  solution  of  the  initial  value  problem  y = f (x,  y),  y(x0)  = y0 
at  equidistant  points 

(9)  Xi  = Xo  + h,  X2  = xq  + 2h,  ■ ■ ■ , xN  = Xq  + Nh\ 

here  f is  such  that  this  problem  has  a unique  solution  on  the  interval  [r0, .%]  (see  Sec.  1.7). 

INPUT:  Function  /,  initial  values  x0,  y0,  step  size  h,  number  of  steps  N 

OUTPUT:  Approximation  yn+1  to  the  solution  y(xn+1)  at  xn+\  = Xo  + (n  + 1)/j, 
where  n = 0,  1,  • ■ ■ , N — l 

For  n = 0,  1,  • ■ • , N — 1 do: 
k i = hf(xn,yn) 
k2  = hf(xn  + \h,yn  + \k-f) 
k3  = hf(xn  + 2 h,  yn  + \k2) 
k4  = hf(xn  + h,yn  + k3 ) 
xn+i  xn  -f  h 

yn+ 1 = yn  + l + 2k2  + 2k  3 + k4) 

OUTPUT  xn+1,  v„+ 1 

End 

Stop 

End  RUNGE-KUTTA 


1Named  after  the  German  mathematicians  KARL  RUNGE  (Sec.  19.4)  and  WILHELM  KUTTA  (1867-1944). 
Runge  [Math.  Annalen  46  (1895),  167-178],  the  German  mathematician  KARL  HEUN  (1859-1929)  [Zeitschr. 
Math.  Phys.  45  (1900),  23-38],  and  Kutta  [Zeitschr.  Math.  Phys.  46  (1901),  435-453]  developed  various  similar 
methods.  Theoretically,  there  are  infinitely  many  fourth-order  methods  using  four  function  values  per  step.  The 
method  in  Table  21.3  is  most  popular  from  a practical  viewpoint  because  of  its  “symmetrical”  form  and  its 
simple  coefficients.  It  was  given  by  Kutta. 
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EXAMPLE  2 


Classical  Runge-Kutta  Method 

Apply  the  Runge-Kutta  method  to  the  initial  value  problem  in  Example  1,  choosing  h = 0.2,  as  before,  and 
computing  five  steps. 

Solution.  For  the  present  problem  we  have  f(x,  y)  = x + y.  Hence 

ki  = 0.2(xn  + yn),  k2  = 0.2(xn  + 0.1  + yn  + 0.5k  i), 

k3  = 0.2  (xn  + 0.1  + yn  + 0.5  k2),  k±  = 0.2(xn  + 0.2  + yn  + k3). 

Table  21.4  shows  the  results  and  their  errors,  which  are  smaller  by  factors  103  and  104  than  those  for  the  two 
Euler  methods.  See  also  Table  21.5.  We  mention  in  passing  that  since  the  present  &i,  • • • , are  simple, 
operations  were  saved  by  substituting  k i into  k2,  then  k2  into  k3,  etc.;  the  resulting  formula  is  shown  in 
Column  4 of  Table  21.4.  Keep  in  mind  that  we  have  four  function  evaluations  at  each  step. 


Table  21.4  Runge-Kutta  Method  Applied  to  (4) 


n 

xn 

yn 

0.2214(xn  + yn) 
+ 0.0214 

Exact  Values  (6D) 
y = ex  — x — 1 

106  X Error 

of  y-n 

0 

0.0 

0 

0.021400 

0.000000 

0 

1 

0.2 

0.021400 

0.070418 

0.021403 

3 

2 

0.4 

0.091818 

0.130289 

0.091825 

7 

3 

0.6 

0.222107 

0.203414 

0.222119 

12 

4 

0.8 

0.425521 

0.292730 

0.425541 

20 

5 

1.0 

0.718251 

0.718282 

31 

Table  21.5  Comparison  of  the  Accuracy  of  the  Three  Methods  under  Consideration 
in  the  Case  of  the  Initial  Value  Problem  (4),  with  h = 0.2 


Error 

X 

1 

* 

1 

II 

Euler 

(Table  21.1) 

Improved  Euler 
(Table  21.3) 

Runge-Kutta 
(Table  21.5) 

0.2 

0.021403 

0.021 

0.0014 

0.000003 

0.4 

0.091825 

0.052 

0.0034 

0.000007 

0.6 

0.222119 

0.094 

0.0063 

0.000011 

0.8 

0.425541 

0.152 

0.0102 

0.000020 

1.0 

0.718282 

0.230 

0.0156 

0.000031 

Error  and  Step  Size  Control. 

RKF  (Runge-Kutta-Fehlberg) 

The  idea  of  adaptive  integration  (Sec.  19.5)  has  analogs  for  Runge-Kutta  (and  other) 
methods.  In  Table  21.3  for  RK  (Runge-Kutta),  if  we  compute  in  each  step  approximations 
y and  y with  step  sizes  h and  2 h,  respectively,  the  latter  has  error  per  step  equal  to  25  = 32 
times  that  of  the  former;  however,  since  we  have  only  half  as  many  steps  for  2 h,  the  actual 
factor  is  25/2  = 16,  so  that,  say, 

e(2h>  « 16e(W  and  thus  y(h)  - y(2h)  = e(2W  - e(h}  - (16  - l)e(,l). 
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Hence  the  error  e = e'h)  for  step  size  h is  about 

(10)  6 = Uy  - y) 

where  y — y = y(h>  — y(Zh),  as  said  before.  Table  21.6  illustrates  (10)  for  the  initial  value 
problem 

(11)  y'  =(y-x-  if + 2,  y(0)  = 1, 

the  step  size  h = 0.1  and  0 § 0.4.  We  see  that  the  estimate  is  close  to  the  actual 

error.  This  method  of  error  estimation  is  simple  but  may  be  unstable. 


Table  21.6  Runge-Kutta  Method  Applied  to  the  Initial  Value  Problem  (11) 
and  Error  Estimate  (10).  Exact  Solution  y = tan  x + x + 1 


X 

y 

(Step  size  h) 

y 

(Step  size  2h) 

Error 

Estimate  (10) 

Actual 

Error 

Exact 

Solution  (9D) 

0.0 

1.000000000 

1.000000000 

0.000000000 

0.000000000 

1.000000000 

0.1 

1.200334589 

0.000000083 

1.200334672 

0.2 

1.402709878 

1.402707408 

0.000000165 

0.000000157 

1.402710036 

0.3 

1.609336039 

0.000000210 

1.609336250 

0.4 

1.822792993 

1.822788993 

0.000000267 

0.000000226 

1.822793219 

RKF.  E.  Fehlberg  [ Computing  6 (1970),  61-71]  proposed  and  developed  error  control 
by  using  two  RK  methods  of  different  orders  to  go  from  (xn,  yn)  to  (xn+\.  yn+ 1 ).  The 
difference  of  the  computed  y-valucs  at  xn+\  gives  an  error  estimate  to  be  used  for  step 
size  control.  Fehlberg  discovered  two  RK  formulas  that  together  need  only  six  function 
evaluations  per  step.  We  present  these  formulas  here  because  RKF  has  become  quite 
popular.  For  instance,  Maple  uses  it  (also  for  systems  of  ODEs). 

Fehlberg’s  fifth-order  RK  method  is 


(12a) 


yn+ 1 = yn  + 7iA  i + • • • + y6^6 


with  coefficient  vector  y = [71  • • • ye], 


(12b) 


_ r 16  n 6656  28,561 

7 ~ Ll35  u 12,825  56,430 


9^  jti 

50  55l  • 


His  fourth-order  RK  method  is 


(13a) 


7rn+ 1 = yn  + y*ki  + • ■ ■ + ytk5 


with  coefficient  vector 


1408 

2565 


2197 

4104 


(13b) 


7 * = [iw  0 
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In  both  formulas  we  use  only  six  different  function  evaluations  altogether,  namely. 


ky  = hf(xn,yn) 


(14) 


k-i 

* 

§ 

II 

+ 

\h. 

yn 

+ 

Iky) 

k3 

= hf(xn 

+ 

3 7 
8"> 

yn 

+ 

32^1  + 

32  ^2) 

= hf{xn 

4- 

12  r 

13 

yn 

+ 

1932  , 
2197  K 1 

7200 , , 

2197  *2  7" 

7296  7 x 
2197  *31 

^5 

= hf(xn 

+ h, 

yn 

+ 

439  7 
216  ^1 

8^2  + 

3680  7 
513  *3 

845  7 x 
4104  *41 

ke 

= hf(xn 

+ 

lh. 

yn 

- 

27*1  + 

2k2  - 

3544  7 , 

2565  *3 

1859  7 
4104  *4 

The  difference  of  (12)  and  (13)  gives  the  error  estimate 


(15) 


= 71+1 


}'n  - 1 


Vn+1  — 360^1 


128  , 2197  , 

4275*3  75,240*4 


+ 


50 


k 5 + 


2 

55 


k 6- 


EXAMPLE  3 Runge-Kutta-Fehlberg 

For  the  initial  value  problem  (11)  we  obtain  from  (12) — (14)  with  h = 0.1  in  the  first  step  the  12S-values 


k\  = 0.200000000000 
k3  = 0.200140756867 
k5  = 0.201006676700 


k2  = 0.200062500000 
*4  = 0.200856926154 
k6  = 0.200250418651 


y*  = 1.20033466949 
yy  = 1.20033467253 


and  the  error  estimate 


ei^yi~y*  = 0.00000000304. 

The  exact  12S-value  is  v(0.1)  = 1.20033467209.  Hence  the  actual  error  of  Vi  is  —4.4  • 10-10,  smaller  than  that 
in  Table  21.6  by  a factor  of  200.  fl 


Table  21.7  summarizes  essential  features  of  the  methods  in  this  section.  It  can  be  shown 
that  these  methods  are  numerically  stable  (definition  in  Sec.  19.1).  They  are  one-step 
methods  because  in  each  step  we  use  the  data  of  just  one  preceding  step,  in  contrast  to 
multistep  methods  where  in  each  step  we  use  data  from  several  preceding  steps,  as  we 
shall  see  in  the  next  section. 


Table  21.7  Methods  Considered  and  Their  Order  (=  Their  Global  Error) 


Method 

Function  Evaluation 
per  Step 

Global  Error 

Local  Error 

Euler 

1 

0(h) 

0(h2) 

Improved  Euler 

2 

o(h2) 

0(h3) 

RK  (fourth  order) 

4 

0(h 4) 

o(h5) 

RKF 

6 

o(h5) 

o(h6) 
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EXAMPLE  4 


Backward  Euler  Method.  Stiff  ODEs 

The  backward  Euler  formula  for  numerically  solving  (1)  is 

(16)  yn+ 1 = yn  + hf(xn+1,  yn+i)  (n  = 0, 1,  • • ■ ). 

This  formula  is  obtained  by  evaluating  the  right  side  at  the  new  location  (xn+1,  yTC+1); 
this  is  called  the  backward  Euler  scheme.  For  known  yn  it  gives  yn+\  implicitly,  so  it 
defines  an  implicit  method,  in  contrast  to  the  Euler  method  (3),  which  gives  yn+\ 
explicitly.  Hence  (16)  must  be  solved  foryn+1.  How  difficult  this  is  depends  on/in  (1). 
For  a linear  ODE  this  provides  no  problem,  as  Example  4 (below)  illustrates.  The  method 
is  particularly  useful  for  “stiff’  ODEs,  as  they  occur  quite  frequently  in  the  study  of 
vibrations,  electric  circuits,  chemical  reactions,  etc.  The  situation  of  stiffness  is  roughly 
as  follows;  for  details,  see,  for  example,  [E5],  [E25],  [E26]  in  App.  1. 

Error  terms  of  the  methods  considered  so  far  involve  a higher  derivative.  And  we  ask 
what  happens  if  we  let  h increase.  Now  if  the  error  (the  derivative)  grows  fast  but  the  desired 
solution  also  grows  fast,  nothing  will  happen.  However,  if  that  solution  does  not  grow  fast, 
then  with  growing  h the  error  term  can  take  over  to  an  extent  that  the  numeric  result  becomes 
completely  nonsensical,  as  in  Fig.  451.  Such  an  ODE  for  which  h must  thus  be  restricted 
to  small  values,  and  the  physical  system  the  ODE  models,  are  called  stiff.  This  term  is 
suggested  by  a mass-spring  system  with  a stiff  spring  (spring  with  a large  k;  see  Sec.  2.4). 
Example  4 illustrates  that  implicit  methods  remove  the  difficulty  of  increasing  h in  the  case 
of  stiffness:  It  can  be  shown  that  in  the  application  of  an  implicit  method  the  solution  remains 
stable  under  any  increase  of  h,  although  the  accuracy  decreases  with  increasing  h. 

Backward  Euler  Method.  Stiff  ODE 

The  initial  value  problem 

/ =flx,y) 

has  the  solution  (verify!) 

The  backward  Euler  formula  (16)  is 


= -20hy  + 20xz  + 2x,  y(0)  = 1 


y = e 20x  + x2. 


Jn+l  y-n  hf&n+l’ yn+l)  yn  T /z(  20yn+i  + 2§Xn+\  + 2xn+\). 
Noting  that  xn+1  = xn  + h,  taking  the  term  — 20v„+i  to  the  left,  and  dividing,  we  obtain 


(16*) 


yn  + h[20(xn  + A)2  + 2(xn  + A)] 


yn+i  = 


1 + 20A 


The  numeric  results  in  Table  21.8  show  the  following. 

Stability  of  the  backward  Euler  method  for  A = 0.05  and  also  for  A = 0.2  with  an  error  increase  by  about  a 
factor  4 for  A = 0.2, 

Stability  of  the  Euler  method  for  A = 0.05  but  instability  for  h = 0.1  (Fig.  451), 

Stability  of  RK  for  h = 0.1  but  instability  for  h = 0.2. 

This  illustrates  that  the  ODE  is  stiff.  Note  that  even  in  the  case  of  stability  the  approximation  of  the  solution 
near*  = 0 is  poor. 


Stiffness  will  be  considered  further  in  Sec.  21.3  in  connection  with  systems  of  ODEs. 
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Fig.  451.  Euler  method  with  h = 0.1  for  the  stiff 
ODE  in  Example  4 and  exact  solution 


Table  21.8  Backward  Euler  Method  (BEM)  for  Example  6.  Comparison  with  Euler  and  RK 


X 

BEM 
h = 0.05 

BEM 
h = 0.2 

Euler 
h = 0.05 

Euler 
h = 0.1 

RK 

h = 0.1 

RK 

h = 0.2 

Exact 

0.0 

1.00000 

1.00000 

1.00000 

1.00000 

1.00000 

1.000 

1.00000 

0.1 

0.26188 

0.00750 

-1.00000 

0.34500 

0.14534 

0.2 

0.10484 

0.24800 

0.03750 

1.04000 

0.15333 

5.093 

0.05832 

0.3 

0.10809 

0.08750 

-0.92000 

0.12944 

0.09248 

0.4 

0.16640 

0.20960 

0.15750 

1.16000 

0.17482 

25.48 

0.16034 

0.5 

0.25347 

0.24750 

-0.76000 

0.25660 

0.25004 

0.6 

0.36274 

0.37792 

0.35750 

1.36000 

0.36387 

127.0 

0.36001 

0.7 

0.49256 

0.48750 

-0.52000 

0.49296 

0.49001 

0.8 

0.64252 

0.65158 

0.63750 

1.64000 

0.64265 

634.0 

0.64000 

0.9 

0.81250 

0.80750 

-0.20000 

0.81255 

0.81000 

1.0 

1.00250 

1.01032 

0.99750 

2.00000 

1.00252 

3168 

1.00000 

1-4 


EULER  METHOD 


Do  10  steps.  Solve  exactly.  Compute  the  error.  Show 
details. 

1.  y + 0.2y  = 0,  y(0)  = 5,  h = 0.2 

2.  y = gTrVl  - y2,  y(0)  = 0,  h = 0.1 

3 . y'  = (y  ~ xf,  y(0)  = 0.  h = 0.1 

4.  y'  = (y  + xf,  y(0)  = 0.  h = 0.1 


5-10 


IMPROVED  EULER  METHOD 


Do  10  steps.  Solve  exactly.  Compute  the  error.  Show 
details. 

5.  y = v,  y(0)  = 1,  /t  = 0.1 

6.  y = 2(1  + y2),  y(0)  = 0,  h = 0.05 

7.  y'  - xy2  = 0,  y(0)  =1,  h = 0.1 

8.  Logistic  population  model,  y = y — y2,  y(0)  = 0.2, 

h = 0.1 


9.  Do  Prob.  7 using  Euler’s  method  with  h = 0.1  and  com- 
pare the  accuracy. 

10.  Do  Prob.  7 using  the  improved  Euler  method,  20  steps 
with  h — 0.05.  Compare. 
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CLASSICAL  RUNGE-KUTTA  METHOD 
OF  FOURTH  ORDER 


Do  10  steps.  Compare  as  indicated.  Show  details. 

11.  y — xy2  = 0,  y(0)  =1,  h = 0.1.  Compare  with 
Prob.  7.  Apply  the  error  estimate  (10)  to  y10. 

12.  y'  = y — yz,  y(0)  = 0.2,  h = 0.1.  Compare  with 
Prob.  8. 

13.  y = 1 + y2,  y( 0)  = 0,  h = 0.1 

14.  y = (1  - jt-1)y,  y(l)  =1,  h = 0.1 

15.  y + vtanv  = sin  lx,  y(0)  =1,  h = 0.1 

16.  Do  Prob.  15  with  h = 0.2,  5 steps,  and  compare  the 
errors  with  those  in  Prob.  15. 
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17.  y = 4 x3y2,  >'(0)  = 0.5,  h = 0.1 

18.  Kutta’s  third-order  method  is  defined  by  yn+\  = 
yn  + h(ki  + 4^2  + £3)  with  k\  and  £2  as  in  RK 
(Table  21.3)  and  £3  = hf(xn+  i,yn  — ki  + 2 k2). 
Apply  this  method  to  (4)  in  (6).  Choose  h = 0.2  and 
do  5 steps.  Compare  with  Table  21.5. 

19.  CAS  EXPERIMENT.  Euler-Cauchy  vs.  RK.  Con- 
sider the  initial  value  problem 

(17)  y = (y  - O.Ol.r2)2  sin  ( x 2)  + 0.02*, 

y(0)  = 0.4 

(solution:  y = l/[2.5  — S(*)]  + 0.01*2  where  S(x)  is 
the  Fresnel  integral  (38)  in  App.  3.1). 

(a)  Solve  (17)  by  Euler,  improved  Euler,  and  RK 
methods  for  0 S x = 5 with  step  h = 0.2.  Compare  the 
errors  for  * = 1,  3,  5 and  comment. 


21.2  Multistep  Methods 

In  a one-step  method  we  compute  yn+\  using  only  a single  step,  namely,  the  previous 
value  yn.  One-step  methods  are  “self-starting,”  they  need  no  help  to  get  going  because 
they  obtain  y\  from  the  initial  value  y0,  etc.  All  methods  in  Sec.  21.1  are  one-step. 

In  contrast,  a multistep  method  uses,  in  each  step,  values  from  two  or  more  previous 
steps.  These  methods  are  motivated  by  the  expectation  that  the  additional  information  will 
increase  accuracy  and  stability.  But  to  get  started,  one  needs  values,  say,  vo,  Vi,  >’2,  >3  in 
a 4-step  method,  obtained  by  Runge-Kutta  or  another  accurate  method.  Thus,  multistep 
methods  are  not  self-starting.  Such  methods  are  obtained  as  follows. 

Adams-Bashforth  Methods 

We  consider  an  initial  value  problem 

(1)  y = fix,  y),  y(xQ)  = y0 

as  before,  with  / such  that  the  problem  has  a unique  solution  on  some  open  interval 
containing  x0.  We  integrate  y = fix,  y)  from  xn  to  xn+1  = xn  + h.  This  gives 


(b)  Graph  solution  curves  of  the  ODE  in  (17)  for 
various  positive  and  negative  initial  values. 

(c)  Do  a similar  experiment  as  in  (a)  for  an  initial 
value  problem  that  has  a monotone  increasing  or 
monotone  decreasing  solution.  Compare  the  behavior 
of  the  error  with  that  in  (a).  Comment. 

20.  CAS  EXPERIMENT.  RKF.  (a)  Write  a program  for 
RKF  that  gives  xn,  yn,  the  estimate  (10),  and,  if  the 
solution  is  known,  the  actual  error  en. 

(b)  Apply  the  program  to  Example  3 in  the  text 
(10  steps,  h = 0.1). 

(c)  en  in  (b)  gives  a relatively  good  idea  of  the  size 
of  the  actual  error.  Is  this  typical  or  accidental?  Find 
out,  by  experimentation  with  other  problems,  on 
what  properties  of  the  ODE  or  solution  this  might 
depend. 


' Xn+ 1 r%n+ 1 

y\x)  dx  = y(xn+1)  - y(xn)  = f(x,  y(*))  dx. 

J 

Xn  Xn 

Now  comes  the  main  idea.  We  replace  fix,  y(x) ) by  an  interpolation  polynomial  p(x)  (see 
Sec.  19.3),  so  that  we  can  later  integrate.  This  gives  approximations  yn+\  of  y{xn+ 1)  and 
yn  of  y(*n), 


x. 


(2) 


yn+l  yn  "f" 


p{x)  dx. 
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Different  choices  of  p(x ) will  now  produce  different  methods.  We  explain  the  principle 
by  taking  a cubic  polynomial,  namely,  the  polynomial  p3(x)  that  at  (equidistant) 


xn>  xn—l’  xn—2>  xn— 3 

has  the  respective  values 

fn  ~ f(.xn>  yn ) 
fn—1  — f(xn  — 1’  yn  — l) 

(3) 

fn  — 2 — fixn— 2>  Jn  — 2) 
fn- 3 = f(xn-3>  yn-3)- 


This  will  lead  to  a practically  useful  formula.  We  can  obtain  p3(x)  from  Newton’s 
backward  difference  formula  (18),  Sec.  19.3: 

P3O)  =fn  + rVfn  + \r(r  + 1 )V2fn  + \r(r  + l)(r  + 2 )V3/n 


where 


r = 


h 


We  integrate  p3(x)  over  x from  xn  to  xn+\  = xn  + h,  thus  over  r from  0 to  1.  Since 


x = xn  + hr,  we  have  dx  = h dr. 

The  integral  of  \r(r  + 1)  is  ^ and  that  of \r{r  + l)(r  + 2)  is  §.  We  thus  obtain 

It  is  practical  to  replace  these  differences  by  their  expressions  in  terms  of  /: 

V/„  = fn  fn  — 1 

V2fn  = fn  ~ 2/„-l  + fn- 2 

y3fn  = fn  ~ 3fn-l  + 3/n-2  ~ fn- 3- 

We  substitute  this  into  (4)  and  collect  terms.  This  gives  the  multistep  formula  of  the 
Adams-Bashforth  method2  of  fourth  order 


(4) 


p3dx  = h p3  dr  = h\fn  + | V/n  + jr  V2/n  + f V3/, 


(5) 


yn+ 1 yn  3“ 


h 

24 


(55/n  - 59/n_, 


+ 37/n_2  - 9fns). 


2Named  after  JOHN  COUCH  ADAMS  (1819-1892),  English  astronomer  and  mathematician,  one  of  the 

predictors  of  the  existence  of  the  planet  Neptune  (using  mathematical  calculations),  director  of  the  Cambridge 
Observatory;  and  FRANCIS  BASHFORTH  (1819-1912),  English  mathematician. 
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It  expresses  the  new  value  yn+\  [approximation  of  the  solution  y of  (1)  at  xn+i\  in  terms 
of  4 values  of  / computed  from  the  y- values  obtained  in  the  preceding  4 steps.  The  local 
truncation  error  is  of  order  li5,  as  can  be  shown,  so  that  the  global  error  is  of  order  /?4; 
hence  (5)  does  define  a fourth-order  method. 

Adams-Moulton  Methods 

Adams-Moulton  methods  are  obtained  if  for  p(x ) in  (2)  we  choose  a polynomial  that 
interpolates  fix,  y(x))  at  xn+i,  xn,  xm_i,  • ■ ■ (as  opposed  to  xn,  in_i,  • • • used  before;  this 
is  the  main  point).  We  explain  the  principle  for  the  cubic  polynomial p3  (x)  that  interpolates 
at  xn+i,  xn,  xn-i,  xn-2-  (Before  we  had  xn,  xn-i,  xm_2,  xn-3.)  Again  using  (18)  in 
Sec.  19.3  but  now  setting  r = (x  — xn+i )/h,  we  have 

Ps(x)  =fn+ 1 + rVfn+ 1 + \r(r  + l)V2/n+1  + \r(r  + l)(r  + 2)V3/n+1. 

We  now  integrate  over  x from  xn  to  xn+1  as  before.  This  corresponds  to  integrating  over 
r from  — 1 to  0.  We  obtain 


* %n+ 1 


P:i(x)  dx  = h[fn+ 1 - ^ V/n+i  - ^V2/n+i  - ^V3fn+ i 


Replacing  the  differences  as  before  gives 


(6) 


: v„  + 


This  is  usually  called  an  Adams-Moulton  formula.3  It  is  an  implicit  formula  because 
fn+ 1 = f(xn+i,  yn+i)  appears  on  the  right,  so  that  it  defines  yn+\  only  implicitly,  in 
contrast  to  (5),  which  is  an  explicit  formula,  not  involving  yn+i  on  the  right.  To  use  (6) 
we  must  predict  a value  y*L+  \ , for  instance,  by  using  (5),  that  is, 


(7a) 


yn+ 1 = yn  + 


h 

24 


(55 fn  - 59/n_i  + 37/n_2  - 9/„_3). 


The  corrected  new  value  yn+i  is  then  obtained  from  (6)  with  fn+\  replaced  by 
fn+i  = f(xn+i,  Vn+i)  and  the  other /’s  as  in  (6);  thus, 


(7b) 


yn+ 1 Jn  5" 


h 

24 


(9/n+t  + 19 'fn  - 5/n_!  +/n_2). 


This  predictor-corrector  method  (7a),  (7b)  is  usually  called  the  Adams-Moulton 

method  of  fourth  order.  It  has  the  advantage  over  RK  that  (7)  gives  the  error  estimate 

€n+ 1 ~ l^CVm+1  ~ Tre+lX 

as  can  be  shown.  This  is  the  analog  of  (10)  in  Sec.  21.1. 


3FOREST  RAY  MOULTON  (1872-1952),  American  astronomer  at  the  University  of  Chicago.  For  ADAMS 
see  footnote  2. 
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Sometimes  the  name  Adams-Moulton  method  is  reserved  for  the  method  with  several 
corrections  per  step  by  (7b)  until  a specific  accuracy  is  reached.  Popular  codes  exist  for 
both  versions  of  the  method. 

Getting  Started.  In  (5)  we  need/0,/1,/2,/3.  Hence  from  (3)  we  see  that  we  must  first 
compute  >’j,  y2,  >3  by  some  other  method  of  comparable  accuracy,  for  instance,  by  RK  or 
by  RKF.  For  other  choices  see  Ref.  [E26]  listed  in  App.  1. 

Adams-Bashforth  Prediction  (7a),  Adams-Moulton  Correction  (7b) 

Solve  the  initial  value  problem 

(8)  y = x + y,  y(0)  = 0 

by  (7a),  (7b)  on  the  interval  0 S r S 2,  choosing  h = 0.2. 

Solution.  The  problem  is  the  same  as  in  Examples  1 and  2,  Sec.  2 1 . 1 , so  that  we  can  compare  the  results. 
We  compute  starting  values  Vi,  V2,  y3  by  the  classical  Runge-Kutta  method.  Then  in  each  step  we  predict 
by  (7a)  and  make  one  correction  by  (7b)  before  we  execute  the  next  step.  The  results  are  shown  and  compared 
with  the  exact  values  in  Table  21.9.  We  see  that  the  corrections  improve  the  accuracy  considerably.  This  is 
typical. 


Table  21.9  Adams-Moulton  Method  Applied  to  the  Initial  Value  Problem  (8); 
Predicted  Values  Computed  by  (7a)  and  Corrected  Values  by  (7b) 


n 

*n 

Starting 

yn 

Predicted 

yl 

Corrected 

yn 

Exact 

Values 

106  • Error 

of  y-n 

0 

0.0 

0.000000 

0.000000 

0 

1 

0.2 

0.021400 

0.021403 

3 

2 

0.4 

0.091818 

0.091825 

7 

3 

0.6 

0.222107 

0.222119 

12 

4 

0.8 

0.425361 

0.425529 

0.425541 

12 

5 

1.0 

0.718066 

0.718270 

0.718282 

12 

6 

1.2 

1.119855 

1.120106 

1.120117 

11 

7 

1.4 

1.654885 

1.655191 

1.655200 

9 

8 

1.6 

2.352653 

2.353026 

2.353032 

6 

9 

1.8 

3.249190 

3.249646 

3.249647 

1 

10 

2.0 

4.388505 

4.389062 

4.389056 

-6 

Comments  on  Comparison  of  Methods.  An  Adams-Moulton  formula  is  generally 
much  more  accurate  than  an  Adams-Bashforth  formula  of  the  same  order.  This  justifies 
the  greater  complication  and  expense  in  using  the  former.  The  method  (7a),  (7b)  is 
numerically  stable,  whereas  the  exclusive  use  of  (7a)  might  cause  instability.  Step  size 
control  is  relatively  simple.  If  | Corrector  — Predictor]  > TOL,  use  interpolation  to 
generate  “old”  results  at  half  the  current  step  size  and  then  try  h/2  as  the  new  step. 

Whereas  the  Adams-Moulton  formula  (7a),  (7b)  needs  only  2 evaluations  per  step, 
Runge-Kutta  needs  4;  however,  with  Runge-Kutta  one  may  be  able  to  take  a step  size 
more  than  twice  as  large,  so  that  a comparison  of  this  kind  (widespread  in  the  literature) 
is  meaningless. 

For  more  details,  see  Refs.  [E25],  [E26]  listed  in  App.  1. 
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1-10 


ADAMS-MOULTON  METHOD 


Solve  the  initial  value  problem  by  Adams-Moulton  (7a),  (7b), 
10  steps  with  1 correction  per  step.  Solve  exactly  and  compute 
the  error.  Use  RK  where  no  starting  values  are  given. 

1.  y = y,  y(0)  = 1,  h = 0.1,  (1.105171,1.221403, 
1.349858) 

2.  y = 2 xy,  y(0)  =1,  h = 0.1 

3.  y = 1 + y2,  y(0)  = 0,  h = 0.1,  (0.100335, 
0.202710,  0.309336) 

4.  Do  Prob.  2 by  RK,  5 steps,  h = 0.2.  Compare  the  errors. 

5.  Do  Prob.  3 by  RK,  5 steps,  h — 0.2.  Compare  the  errors. 

6.  y'  = (y  - x - l)2  + 2,  y(0)  =1,  h = 0.1, 

10  steps 

7.  y = 3y  - 12y2,  y(0)  = 0.2,  h = 0.1 

8.  y'  = I - 4y2,  y(0)  = 0,  h = 0.1 

9.  y'  = 3jc2(1  + y),  y(0)  = 0,  h = 0.05 

10.  y'  = jt/y,  y(l)  = 3,  h = 0.2 

11.  Do  and  show  the  calculations  leading  to  (4)— (7)  in  the 
text. 


12.  Quadratic  polynomial.  Apply  the  method  in  the  text 
to  a polynomial  of  second  degree.  Show  that  this  leads 
to  the  predictor  and  corrector  formulas 


v«  + l = yn  + Y ^(23/n  - 16/n-i  + 5fn-z), 
h 

Tn  + 1 yn  "f  , n (5/n+l  ~f  ^fn  Jn—U- 


13.  Using  Prob.  12,  solve  y = 2xy,  y(0)  = 1 (10  steps, 
h = 0.1,  RK  starting  values).  Compare  with  the  exact 
solution  and  comment. 

14.  How  much  can  you  reduce  the  error  in  Prob.  13  by 
halfing  h (20  steps,  h = 0.05)?  First  guess,  then 
compute. 

15.  CAS  PROJECT.  Adams-Moulton.  (a)  Accurate 
starting  is  important  in  (7a),  (7b).  Illustrate  this  in 
Example  1 of  the  text  by  using  starting  values  from 
the  improved  Euler-Cauchy  method  and  compare  the 
results  with  those  in  Table  21.8. 

(b)  How  much  does  the  error  in  Prob.  1 1 decrease 
if  you  use  exact  starting  values  (instead  of  RK 
values)? 

(c)  Experiment  to  find  out  for  what  ODEs  poor 
starting  is  very  damaging  and  for  what  ODEs  it 
is  not. 

(d)  The  classical  RK  method  often  gives  the  same 
accuracy  with  step  2 h as  Adams-Moulton  with  step 
h,  so  that  the  total  number  of  function  evaluations  is 
the  same  in  both  cases.  Illustrate  this  with  Prob.  8. 
(Hence  corresponding  comparisons  in  the  literature 
in  favor  of  Adams-Moulton  are  not  valid.  See  also 
Probs.  6 and  7.) 


21.]  Methods  for  Systems 
and  Higher  Order  ODEs 

Initial  value  problems  for  first-order  systems  of  ODEs  are  of  the  form 

(1)  y'=f(x,  y),  y(x0)  = y0- 

in  components 


y i = fi(x,  yi,  • • • , ym),  yi(x0)  = yio 

y2  = h (x,  yi,  • • • , ym)i  ^2(^0)  = ^20 


ym  fm(x9  yi?  * * ’ 5 ym .)• 


yraC^o)  JmO- 
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Here,  f is  assumed  to  be  such  that  the  problem  has  a unique  solution  y(jc)  on  some  open 
jc-interval  containing  x0.  Our  discussion  will  be  independent  of  Chap.  4 on  systems. 

Before  explaining  solution  methods  it  is  important  to  note  that  (1)  includes  initial  value 
problems  for  single  wth-order  ODEs, 

(2)  y(m)  = f{x,  y,  y\ y",  • • • , y(m_1)) 

and  initial  conditions  v(a'o)  = K\,  y (xq)  = K2,  ■ ■ • , y (xq)  = Km  as  special  cases. 
Indeed,  the  connection  is  achieved  by  setting 

/q\  tn  (tti—I) 

(3)  yi  = y,  y2  = y , y%  = y , ■ • ■ > ym  = y 

Then  we  obtain  the  system 

y\  = J2 

y2  = ^3 

(4) 

r 

ym—i  y in 

y'm  = fix,  yi,  ■ ■ ■ , ym) 


and  the  initial  conditions  yi(Ao)  = K^,  y^ix o)  = K2,  ••• , ym(xo)  ~ Km. 

Euler  Method  for  Systems 

Methods  for  single  first-order  ODEs  can  be  extended  to  systems  (1)  simply  by  writing  vector 
functions  y and  f instead  of  scalar  functions  y and  /,  whereas  x remains  a scalar  variable. 

We  begin  with  the  Euler  method.  Just  as  for  a single  ODE,  this  method  will  not  be 
accurate  enough  for  practical  purposes,  but  it  nicely  illustrates  the  extension  principle. 

Euler  Method  for  a Second-Order  ODE.  Mass-Spring  System 

Solve  the  initial  value  problem  for  a damped  mass-spring  system 

y"  + 2 y + 0.75y  = 0,  y(0)  = 3,  y'(0)  = -2.5 

by  the  Euler  method  for  systems  with  step  h = 0.2  for  x from  0 to  1 (where  x is  time). 

Solution.  The  Euler  method  (3),  Sec.  21.1,  generalizes  to  systems  in  the  form 

(5)  y»+i  = yn  + hf(xn,  yn), 

in  components 

yi,n+i  — yi,n  "f  y-\  n.  y’2,n) 

y2,n+l  = y2,n  + hf2(xn,y  l,„,  y2,n) 

and  similarly  for  systems  of  more  than  two  equations.  By  (4)  the  given  ODE  converts  to  the  system 

yl  = fi(x,  yi,  y2)  = V2 

y'2  = /2C,  yi,  y2)  = ~2y2  ~ 0.75yi. 
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EXAMPLE  2 


Hence  (5)  becomes 

Ti.n+i  = yi  ,n  + 0.2  y2j„ 

yz,n+i  = y2,n  + 0.2(-2y2,n  “ 0.75y1>n). 

The  initial  conditions  are  y( 0)  = yi(0)  = 3,  y (0)  = >>2(0)  = —2.5.  The  calculations  are  shown  in  Table  21.10. 
As  for  single  ODEs,  the  results  would  not  be  accurate  enough  for  practical  purposes.  The  example  merely  serves 
to  illustrate  the  method  because  the  problem  can  be  readily  solved  exactly, 

y = yi  = 2e~°'5x  + e-15x,  thus  y'  = y2  = ~e-°5x  - 1.5<r15x. 


Table  21.10  Euler  Method  for  Systems  in  Example  1 (Mass-Spring  System) 


n 

yi,n 

y | Exact 
(5D) 

Error 

= yi  - yi,« 

y2,n 

y2  Exact 
(5D) 

Error 

e2  = y2  — y2,n 

0 

0.0 

3.00000 

3.00000 

0.00000 

-2.50000 

-2.50000 

0.00000 

1 

0.2 

2.50000 

2.55049 

0.05049 

-1.95000 

-2.01606 

-0.06606 

2 

0.4 

2.11000 

2.18627 

0.76270 

-1.54500 

-1.64195 

-0.09695 

3 

0.6 

1.80100 

1.88821 

0.08721 

-1.24350 

-1.35067 

-0.10717 

4 

0.8 

1.55230 

1.64183 

0.08953 

-1.01625 

-1.12211 

-0.10586 

5 

1.0 

1.34905 

1.43619 

0.08714 

-0.84260 

-0.94123 

-0.09863 

Runge-Kutta  Methods  for  Systems 

As  for  Euler  methods,  we  obtain  RK  methods  for  an  initial  value  problem  (1)  simply  by 
writing  vector  formulas  for  vectors  with  m components,  which,  for  m = 1 , reduce  to  the 
previous  scalar  formulas. 

Thus,  for  the  classical  RK  method  of  fourth  order  in  Table  21.3,  we  obtain 
(6a)  y(x0)  = Yo  (Initial  values) 

and  for  each  step  n = 0,  1 , ■ ■ • , N — 1 we  obtain  the  4 auxiliary  quantities 

ki  h f I'.v yn) 

1^2  // 1" (.X T 2^’  yn  2^l) 

(6b) 

k3  = hf(xn  + 2 h,  yn  + 5k2) 
k4  = hf(xn  + h,  yn  + k3) 

and  the  new  value  [approximation  of  the  solution  y(x)  at  xr(  + 1 = jr0  + (n  + 1 )h] 

(6c)  yn+1  = yn  + s(kr  + 2k2  + 2k3  + k4). 


RK  Method  for  Systems.  Airy’s  Equation.  Airy  Function  Ai(x) 

Solve  the  initial  value  problem 


y = *y. 


y(0)  = 1/(32/s  • r (!))  = 0.35502S05,  y'(0)  = -1/(31/3  • T(J))  = -0.25881940 
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by  the  Runge-Kutta  method  for  systems  with  h = 0.2;  do  5 steps.  This  is  Airy’s  equation,4  which  arose  in 
optics  (see  Ref.  [A13],  p.  188,  listed  in  App.  1).  T is  the  gamma  function  (see  App.  A3.1).  The  initial  conditions 
are  such  that  we  obtain  a standard  solution,  the  Airy  function  Ai(;c),  a special  function  that  has  been  thoroughly 
investigated;  for  numeric  values,  see  Ref.  [GenRefl],  pp.  446,  475. 

Solution.  For  y"  = xy,  setting  yi  — y,y2  = yi  = y we  obtain  the  system  (4) 

yi  = y2 
y'z  = xyv 


Hence  f = [/i  in  (1)  has  the  components  fi(x,  y)  = yz,  fzix,  y)  = xyi.  We  now  write  (6)  in  components. 
The  initial  conditions  (6a)  are  y^o  = 0.35502805,  y2,o  = “0.25881940.  In  (6b)  we  have  fewer  subscripts  by 
simply  writing  ki  = a,  k2  — b,  k3  = c,  k4  = d,  so  that  a = [ai  , etc.  Then  (6b)  takes  the  form 


(6b*) 


a = h 


y2,n 


Xnyi,n 


b = h 


y2,n  y 2 a2 

(xn  “f  2^)(jl,n  ~f  2^l) 


C = h 


y2,n  + 2^2 

(xn  “f  2*)(yi,n  2^l) 


d = 


h 


y2,n  + C2 

(xn  + h)(yi'K  + cx) 


For  example,  the  second  component  of  b is  obtained  as  follows.  f(.r,  y)  has  the  second  component  f2(x,  y)  = xy1. 
Now  in  b (=  k2)  the  first  argument  is 

x = xn  + 2h. 


The  second  argument  in  b is 

y = yn  + |a- 


and  the  first  component  of  this  is 


yi  = yi,n 


2al- 


Together, 


xyi  = (.Xn  + i*)(yi,n  + 2«l)- 


Similarly  for  the  other  components  in  (6b*).  Finally, 

(6c*)  y„+i  = y„  + g(a  + 2b  + 2c  + d). 

Table  21.11  shows  the  values  y(x)  = yi(x)  of  the  Airy  function  Ai(;c)  and  of  its  derivative  y (*)  = y^ix)  as  well 
as  of  the  (rather  small!)  error  of  y(x). 


4Named  after  Sir  GEORGE  BIDELL  AIRY  (1801-1892),  English  mathematician,  who  is  known  for  his  work 
in  elasticity  and  in  PDEs. 
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EXAMPLE  3 


Table  2 RK  Method  for  Systems:  Values  y1-n(xn)  of  the  Airy  Function  Ai(x) 
in  Example  2 


n 

yi,nir^n) 

y-i(xn)  Exact  (8D) 

108  • Error  of  y1 

y2,n(^n) 

0 

0.0 

0.35502805 

0.35502805 

0 

-0.25881940 

1 

0.2 

0.30370303 

0.30370315 

12 

-0.25240464 

2 

0.4 

0.25474211 

0.25474235 

24 

-0.23583073 

3 

0.6 

0.20979973 

0.20980006 

33 

-0.21279185 

4 

0.8 

0.16984596 

0.16984632 

36 

-0.18641171 

5 

1.0 

0.13529207 

0.13529242 

35 

-0.15914687 

Runge-Kutta-Nystrom  Methods  (RKN  Methods) 

RKN  methods  are  direct  extensions  of  RK  methods  (Runge-Kutta  methods)  to  second-order 
ODEs  y = f{x , y,  y ),  as  given  by  the  Finnish  mathematician  E.  J.  Nystrom  [Acta  Soc.  Sci. 
fenn.,  1925,  L,  No.  13].  The  best  known  of  these  uses  the  following  formulas,  where 
n = 0,  l,  - ■ ■ , N — 1 (/V  the  number  of  steps): 

= \hf{xn,  yn,  yn) 

„ k2  = \hf{xn  + \h,  yn  + K,y'n  + kf)  where  K = \h(y'n  + 2k{) 

(7a)  i i 

k3  = 2hf(*n  + 2h,yn  + K,yn  + k2) 

*4  = \ hfixn  + h,  yn  + L,y'n  + 2 k3)  where  L = hiy^  + k3). 

From  this  we  compute  the  approximation  yn+i  of  y(xn+i)  at  xn+i  = xq  + in  + I )h, 

(7b)  vn+1  = yn  + h(y'n  + 3(k1  + k2  + k3 )), 

and  the  approximation  y'n+i  of  the  derivative  y'{xn+\)  needed  in  the  next  step, 

(7c)  y'n+ 1 = y'n  + 3(k1  + 2 k2  + 2 k3  + kf). 

RKN  for  ODEs  y"  = fix,  y)  Not  Containing  y' . Then  k2  = k3  in  (7),  which  makes 
the  method  particularly  advantageous  and  reduces  (7a)-(7c)  to 

k i = h hfixn , yJ 

k2  = khfixn  + 2h,  yn  + 2hiyn  + \k-ff)  = k3 
(7*)  kA  = 2hfixn  + h,yn  + hiy ^ + k2)) 

yn+\  = yn  + Kyh  + 3(^1  + 2£2)) 
y'n+ 1 = y'n  + 3(^1  + 4 A: 2 + kf). 


RKN  Method.  Airy’s  Equation.  Airy  Function  Ai(x) 

For  the  problem  in  Example  2 and  h — 0.2  as  before  we  obtain  from  (7*)  simply  k\  = 0.1  xnyn  and 

k2  = k3  = 0.1  (xn  + 0.1  )(yn  + 0.1)4  + 0.05^),  k4  = 0.1  (xn  + 0.2  )(yn  + 0.2 y'n  + 0.2*a). 
Table  21.12  shows  the  results.  The  accuracy  is  the  same  as  in  Example  2,  but  the  work  was  much  less. 
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Table  21.12  Runge-Kutta-Nystrom  Method  Applied  to  Airy’s  Equation, 
Computation  of  the  Airy  Function  y = Ai(x) 


Xn 

yn 

f 

yn 

y(x)  Exact  (8D) 

108  • Error 
of  yn 

0.0 

0.35502805 

-0.25881940 

0.35502805 

0 

0.2 

0.30370304 

-0.25240464 

0.30370315 

11 

0.4 

0.25474211 

-0.23583070 

0.25474235 

24 

0.6 

0.20979974 

-0.21279172 

0.20980006 

32 

0.8 

0.16984599 

-0.18641134 

0.16984632 

33 

1.0 

0.13529218 

-0.15914609 

0.13529242 

24 

Our  work  in  Examples  2 and  3 also  illustrates  that  usefulness  of  methods  for  ODEs  in  the 
computation  of  values  of  “higher  transcendental  functions.” 

Backward  Euler  Method  for  Systems.  Stiff  Systems 

The  backward  Euler  formula  (16)  in  Sec.  21.1  generalizes  to  systems  in  the  form 

(8)  yn+i  = yn  + h f(xra+i,  ym+i)  (n  = 0, 1,  • • • )• 

This  is  again  an  implicit  method,  giving  yn+]  implicitly  for  given  yn.  Hence  (8)  must  be 
solved  for  yn+1.  For  a linear  system  this  is  shown  in  the  next  example.  This  example  also 
illustrates  that,  similar  to  the  case  of  a single  ODE  in  Sec.  21.1,  the  method  is  very  useful 
for  stiff  systems.  These  are  systems  of  ODEs  whose  matrix  has  eigenvalues  A of  very 
different  magnitudes,  having  the  effect  that,  just  as  in  Sec.  21.1,  the  step  in  direct  methods, 
RK  for  example,  cannot  be  increased  beyond  a certain  threshold  without  losing  stability. 
(A  = — 1 and  —10  in  Example  4,  but  larger  differences  do  occur  in  applications.) 

Backward  Euler  Method  for  Systems  of  ODEs.  Stiff  Systems 

Compare  the  backward  Euler  method  (8)  with  the  Euler  and  the  RK  methods  for  numerically  solving  the  initial 
value  problem 

y"  + 11/  + lOy  = 10*  + 11,  y(0)  = 2,  y'(0)  = -10 

converted  to  a system  of  first-order  ODEs. 

Solution.  The  given  problem  can  easily  be  solved,  obtaining 

y = e~x  + e~Wx  + x 

so  that  we  can  compute  errors.  Conversion  to  a system  by  setting  y = y±,y  — y2  [see  (4)]  gives 

y'x  = y2  vi(0)  = 2 

y2  = — 10yi  — lly2  + 10jc  +11  ^(O)  — — 10. 

The  coefficient  matrix 


0 f 

-A  1 

A = 

has  the  characteristic  determinant 

-10  -11 

-10  -A  - 11 

whose  value  is  A2  + 1 1 A + 10  = (A  + 1)(A  + 10).  Hence  the  eigenvalues  are  —1  and  —10  as  claimed  above. 
The  backward  Euler  formula  is 
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yi,n+l 

V l.n 

+ h 

y2,n+ 1 

Yn+l 

y2,n+l_ 

y2,n. 

.~iQyi,»+i  ~ ii.v2,n+i  + i0xn+1  + ii 

Reordering  terms  gives  the  linear  system  in  the  unknowns  y\>n+i  andy2,n+i 

yi,n+i  — hy2,n+i  = yi,« 

I0hyltn+1  + (1  + ll%2,n+l  = y2,n  + 10  h(xn  + h)  + 11  h. 

The  coefficient  determinant  is  D = 1 + 11/z  4-  10 h2,  and  Cramer’s  rule  (in  Sec.  7.6)  gives  the  solution 

l I" (1  + ll%i,n  + hy2,n  + 10 h2xn  + 11/z2  + 10 h3' 
y-n+l  ~ ~ « ■ 

D — lQhyifn  + y2,n  T 10 hxn  + 11  h + 10 hr 


Table  21.13  Backward  Euler  Method  (BEM)  for  Example  4.  Comparison  with  Euler  and  RK 


X 

BEM 
h = 0.2 

BEM 
h = 0.4 

Euler 
h = 0.1 

Euler 
h = 0.2 

RK 

h = 0.2 

RK 

h = 0.3 

Exact 

0.0 

2.00000 

2.00000 

2.00000 

2.00000 

2.00000 

2.00000 

2.00000 

0.2 

1.36667 

1.01000 

0.00000 

1.35207 

1.15407 

0.4 

1.20556 

1.31429 

1.56100 

2.04000 

1.18144 

1.08864 

0.6 

1.21574 

1.13144 

0.11200 

1.18585 

3.03947 

1.15129 

0.8 

1.29460 

1.35020 

1.23047 

2.20960 

1.26168 

1.24966 

1.0 

1.40599 

1.34868 

0.32768 

1.37200 

1.36792 

1.2 

1.53627 

1.57243 

1.48243 

2.46214 

1.50257 

5.07569 

1.50120 

1.4 

1.67954 

1.62877 

0.60972 

1.64706 

1.64660 

1.6 

1.83272 

1.86191 

1.78530 

2.76777 

1.80205 

1.80190 

1.8 

1.99386 

1.95009 

0.93422 

1.96535 

8.72329 

1.96530 

2.0 

2.16152 

2.18625 

2.12158 

3.10737 

2.13536 

2.13534 

Table  21.13  shows  the  following. 

Stability  of  the  backward  Euler  method  for  h = 0.2  and  0.4  (and  in  fact  for  any  h\  try  h = 5.0)  with  decreasing 
accuracy  for  increasing  h 

Stability  of  the  Euler  method  for  h = 0.1  but  instability  for  h = 0.2 
Stability  of  RK  for  h = 0.2  but  instability  for  h = 0.3 

Figure  452  shows  the  Euler  method  for  h = 0.18,  an  interesting  case  with  initial  jumping  (for  about  x > 3)  but 
later  monotone  following  the  solution  curve  of  y = y\.  See  also  CAS  Experiment  15. 
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EULER  FOR  SYSTEMS  AND 
SECOND-ORDER  ODEs 


Solve  by  the  Euler's  method.  Graph  the  solution  in  the 
>’i.V’2-p1ane-  Calculate  the  errors. 

1-  >'i  = 2yi  - 4y2,  y2  = yy  - 3y2,  yi(0)  = 3, 

>*2(0)  = 0,  h = 0.1,  10  steps 

2.  Spiral.  y[  = -yx  + y2,  y2  = -yi  - y2,  yi(0)  = 0, 
y2(0)  = 4,  h = 0.2,  5 steps 

3.  y"  +\y  = 0,  y(0)  = 1,  y'(0)  = 0,  h = 0.2, 

5 steps 

4.  y[  = — 3yi  + y2,  y2  = >'i  - 3v2,  Vi(0)  = 2, 

y2(0)  = 0,  h = 0.1,  5 steps 

5.  y"  - y = x,  y(0)  = 1,  y'(0)  = -2,  h = 0.1, 

5 steps 


6-  y[  = yi,  y2  = -y2,  yi(0) 


h = 

7-10 


0.1,  10  steps 

RK  FOR  SYSTEMS 


2,  y2(0)  = 2. 


Solve  by  the  classical  RK. 

7.  The  ODE  in  Prob.  5.  By  what  factor  did  the  error 
decrease? 


8.  The  system  in  Prob.  2 

9.  The  system  in  Prob.  1 

10.  The  system  in  Prob.  4 

11.  Pendulum  equation  y"  + siny  = 0,  y(7r)  = 0, 

y (7r)  = 1,  as  a system,  h = 0.2,  20  steps.  How 

does  your  result  fit  into  Fig.  93  in  Sec.  4.5? 

12.  Bessel  Function  J0.  xy"  + y + xy  = 0,  y(l)  = 
0.765198,  y'(l)  = -0.440051,  h = 0.5,  5 steps. 
(This  gives  the  standard  solution  Jq(x)  in  Fig.  110  in 
Sec.  5.4.) 


13.  Verify  the  formulas  and  calculations  for  the  Airy 
equation  in  Example  2 of  the  text. 

14.  RKN.  The  classical  RK  for  a first-order  ODE  extends 
to  second-order  ODEs  (E.  J.  Nystrom,  Acta  fenn. 
No  13,  1925).  If  the  ODE  is  y"=/(jt,y),  not 
containing  y\  then 

k\  2 In) 

^2  = 2 hf(xn  + Tm  + \h(yn  + |^i))  = k3 
k*  = \hf(xn  + h,yn  + h{y'n  + k2 )) 

Tn+i  = yn  + Ky'n  + h(k  i + 2^2)) 
y'n+ 1 = yn  + g(l'!  + 4 k2  + fc4). 


Apply  this  RKN  (Runge-Kutta-Nystrdm)  method  to 
the  Airy  ODE  in  Example  2 with  h = 0.2  as  before,  to 
obtain  approximate  values  of  Ai(jc). 

15.  CAS  EXPERIMENT.  Backward  Euler  and 
Stiffness.  Extend  Example  3 as  follows. 

(a)  Verify  the  values  in  Table  21.13  and  show  them 
graphically  as  in  Fig.  452. 

(b)  Compute  and  graph  Euler  values  for  h near  the 
“critical”  h = 0.18  to  determine  more  exactly  when 
instability  starts. 

(c)  Compute  and  graph  RK  values  for  values  of  h 
between  0.2  and  0.3  to  find  h for  which  the  RK 
approximation  begins  to  increase  away  from  the  exact 
solution. 

(d)  Compute  and  graph  backward  Euler  values  for 
large  h;  confirm  stability  and  investigate  the  error 
increase  for  growing  h. 


21 A Methods  for  Elliptic  PDEs 

We  have  arrived  at  the  second  half  of  this  chapter,  which  is  devoted  to  numerics  for 
partial  differential  equations  (PDEs).  As  we  have  seen  in  Chap.  12,  there  are  many 
applications  to  PDEs,  such  as  in  dynamics,  elasticity,  heat  transfer,  electromagnetic 
theory,  quantum  mechanics,  and  others.  Selected  because  of  their  importance  in 
applications,  the  PDEs  covered  here  include  the  Laplace  equation,  the  Poisson  equation, 
the  heat  equation,  and  the  wave  equation.  By  covering  these  equations  based  on  their 
importance  in  applications  we  also  selected  equations  that  are  important  for  theoretical 
considerations.  Indeed,  these  equations  serve  as  models  for  elliptic,  parabolic,  and 
hyperbolic  PDEs.  For  example,  the  Laplace  equation  is  a representative  example  of  an 
elliptic  type  of  PDE,  and  so  forth. 
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Recall,  from  Sec.  12.4,  that  a PDE  is  called  quasilinear  if  it  is  linear  in  the  highest 
derivatives.  Hence  a second-order  quasilinear  PDE  in  two  independent  variables  x,  y is  of  the 
form 

(1)  QUXX  T 'Zbuxy  “h  CUyy  F(.X,  ^ , W,  U x,  Uy). 

u is  an  unknown  function  of  x and  y (a  solution  sought).  F is  a given  function  of  the 
indicated  variables. 

Depending  on  the  discriminant  ac  — b , the  PDE  (1)  is  said  to  be  of 

elliptic  type  if  ac  — b >0  (example:  Laplace  equation) 

parabolic  type  if  ac  — b2  = 0 (example:  heat  equation ) 

hyperbolic  type  if  ac  — b2  < 0 (example:  wave  equation). 

Here,  in  the  heat  and  wave  equations,  y is  time  t.  The  coefficients  a,  b,  c may  be  functions 
of  x,  y,  so  that  the  type  of  (1)  may  be  different  in  different  regions  of  the  xy-plane.  This 
classification  is  not  merely  a formal  matter  but  is  of  great  practical  importance  because 
the  general  behavior  of  solutions  differs  from  type  to  type  and  so  do  the  additional 
conditions  (boundary  and  initial  conditions)  that  must  be  taken  into  account. 

Applications  involving  elliptic  equations  usually  lead  to  boundary  value  problems  in  a 
region  R,  called  a first  boundary  value  problem  or  Dirichlet  problem  if  u is  prescribed 
on  the  boundary  curve  C of  R,  a second  boundary  value  problem  or  Neumann  problem 
if  un  = du/dn  (normal  derivative  of  u)  is  prescribed  on  C,  and  a third  or  mixed  problem 
if  u is  prescribed  on  a part  of  C and  un  on  the  remaining  part.  C usually  is  a closed  curve 
(or  sometimes  consists  of  two  or  more  such  curves). 


Difference  Equations 

for  the  Laplace  and  Poisson  Equations 

In  this  section  we  develop  numeric  methods  for  the  two  most  important  elliptic  PDEs  that 
appear  in  applications.  The  two  PDEs  are  the  Laplace  equation 

(2)  V U UXX  T Uyy  0 

and  the  Poisson  equation 

(3)  V2«  = uxx  + Uyy  = f(x , y). 

The  starting  point  for  developing  our  numeric  methods  is  the  idea  that  we  can  replace 
the  partial  derivatives  of  these  PDEs  by  corresponding  difference  quotients.  Details  are 
as  follows: 

To  develop  this  idea,  we  start  with  the  Taylor  formula  and  obtain 

(a)  u{x  + h,y)  = u(x,  y)  + hux(x,  y)  + \h2uxx{x,  y)  + \h3uxxx{x,  y)  + ■ • • 

(4) 

(b)  u(x  - h,  y)  = u(x,  y)  - hux(x,  y)  + \ hzuxx(x , y)  - $h3uxxx(x,  y)  + ■ ■ • . 
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We  subtract  (4b)  from  (4a),  neglect  terms  in  h'\  h4,  • • • , and  solve  for  ux.  Then 


(5a) 

Similarly, 

and 


ux(x,  y)~- 7 [u(x  + h,  y ) - u(x  - h,  y)]. 
In 


1 2 

u(x,  y + k)  = u( x,  y)  + kuy(x,  y)  + 2 k uyy(x,  y)  + • ■ ■ 


1 2 

u(x,  y — k)  = u(x,  y)  — kuy(x,  y)  + ^k  uyy(x,  y)  + 


3 4 

By  subtracting,  neglecting  terms  in  k , k , • ■ • , and  solving  for  uy  we  obtain 


(5b) 


Uy(x,  y)  ~ ^ [u(x,  y + k)  - u(x,  y - k)]. 


We  now  turn  to  second  derivatives.  Adding  (4a)  and  (4b)  and  neglecting  terms  in 
,h 5,--- 
we  have 


h4,  h5,  • • • , we  obtain  u(x  + h,y)  + u(x  — h,y)  ~ 2u(x,  y)  + h2uxx(x,  y).  Solving  for  11 


(6a) 

Similarly, 

(6b) 


(x,  y)  ~ — [u(x  + h,  y)  — 2 u(x,  y)  + u(x  — h,  y)]. 


uyy(x,  y)  ~ — [u(x,  y + k)  — 2 u{x,  y)  + u(x,  y — fc)]. 


We  shall  not  need  (see  Prob.  1) 


(6c)  uXy(x,  y)  ~ \u(x  + h,  y + k)  — u(x  — h,  y + k) 

4-hk 

— u(x  + h,y  — k)  + u(x  — h,y  — &)]. 

Figure  453a  shows  the  points  (x  + h,  y),  (x  — h,  y),  • ■ • in  (5)  and  (6). 

We  now  substitute  (6a)  and  (6b)  into  the  Poisson  equation  (3),  choosing  k = h to  obtain 
a simple  formula: 

2 

(7)  u(x  + h,  y)  + u(x,  y + h)  + u(x  — h,  y)  + u(x,  y — h)  — 4 u(x,  y)  = h f(x,  y). 

This  is  a difference  equation  corresponding  to  (3).  Hence  for  the  Laplace  equation  (2) 
the  corresponding  difference  equation  is 

(8)  u(x  + h,  y)  + u(x,  y + h)  + u(x  — h,y)  + u(x,  y — h)  — 4 u(x,  y)  = 0. 

h is  called  the  mesh  size.  Equation  (8)  relates  u at  (x,  y)  to  u at  the  four  neighboring  points 
shown  in  Fig.  453b.  It  has  a remarkable  interpretation:  u at  (x,  y)  equals  the  mean  of  the 
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(x  - h,  y) 


values  of  u at  the  four  neighboring  points.  This  is  an  analog  of  the  mean  value  property 
of  harmonic  functions  (Sec.  18.6). 

Those  neighbors  are  often  called  E (East),  N (North),  W (West),  S (South).  Then  Fig.  453b 
becomes  Fig.  453c  and  (7)  is 

(7*)  u(E)  + u(N)  + u(W)  + u(S)  — 4 u(x,  y ) = h2f{x,  y). 


( x , y + k) 
X 


-O 


(x,y) 


X 

(x,  y-k) 


■ X (x  + h,y) 


(x,  y + h) 

X 


X (x  + h,y) 


(a)  Points  in  (5)  and  (6) 


(x,  y-h ) 

(b)  Points  in  (7)  and  (8) 

Fig.  453.  Points  and  notation  in  (5)— (8)  and  (7* 


N 

X 


X E 


(c)  Notation  in  (7*) 


Our  approximation  of  h2VZu  in  (7)  and  (8)  is  a 5-point  approximation  with  the 
coefficient  scheme  or  stencil  (also  called  pattern,  molecule,  or  star) 


(9) 


f 1 

] 

f ' 

) 

1 -4 

■ 

7 . We  may  now  write  (7)  as  s 

■ 

l 1 

J 

1 i 

1 

Dirichlet  Problem 

In  numerics  for  the  Dirichlet  problem  in  a region  R we  choose  an  h and  introduce  a square 
grid  of  horizontal  and  vertical  straight  lines  of  distance  h.  Their  intersections  are  called 
mesh  points  (or  lattice  points  or  nodes).  See  Fig.  454. 

Then  we  approximate  the  given  PDE  by  a difference  equation  [(8)  for  the  Faplace 
equation],  which  relates  the  unknown  values  of  u at  the  mesh  points  in  R to  each  other 
and  to  the  given  boundary  values  (details  in  Example  1).  This  gives  a linear  system  of 
algebraic  equations.  By  solving  it  we  get  approximations  of  the  unknown  values  of  u at 
the  mesh  points  in  R. 

We  shall  see  that  the  number  of  equations  equals  the  number  of  unknowns.  Now  comes 
an  important  point.  If  the  number  of  internal  mesh  points,  call  it  p,  is  small,  say,  p < 100, 
then  a direct  solution  method  may  be  applied  to  that  linear  system  of  p < 100  equations 
in  p unknowns.  However,  if  p is  large,  a storage  problem  will  arise.  Now  since  each 
unknown  u is  related  to  only  4 of  its  neighbors,  the  coefficient  matrix  of  the  system  is  a 
sparse  matrix,  that  is,  a matrix  with  relatively  few  nonzero  entries  (for  instance,  500  of 
10,000  when  p = 100).  Hence  for  large  p we  may  avoid  storage  difficulties  by  using  an 
iteration  method,  notably  the  Gauss-Seidel  method  (Sec.  20.3),  which  in  PDEs  is  also 
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called  Liebmann’s  method  (note  the  strict  diagonal  dominance).  Remember  that  in  this 
method  we  have  the  storage  convenience  that  we  can  overwrite  any  solution  component 
(value  of  u)  as  soon  as  a “new”  value  is  available. 

Both  cases,  large  p and  small  p,  are  of  interest  to  the  engineer,  large  p if  a fine  grid  is 
used  to  achieve  high  accuracy,  and  small  p if  the  boundary  values  are  known  only  rather 
inaccurately,  so  that  a coarse  grid  will  do  it  because  in  this  case  it  would  be  meaningless 
to  try  for  great  accuracy  in  the  interior  of  the  region  R. 

We  illustrate  this  approach  with  an  example,  keeping  the  number  of  equations  small, 
for  simplicity.  As  convenient  notations  for  mesh  points  and  corresponding  values  of  the 
solution  (and  of  approximate  solutions)  we  use  (see  also  Fig.  454) 

(10)  Py  = ( ih,jh ),  Uy  = u{ih,jh). 


Fig.  454.  Region  in  the  xy-plane  covered  by  a grid  of  mesh  h, 
also  showing  mesh  points  = (h,  h),  ■ ■ ■ , P ,y  = ( ih,jh ),  • • • 

With  this  notation  we  can  write  (8)  for  any  mesh  point  Py  in  the  form 

(11)  "f  Wi,j  + 1 d"  Ui  — ij  "h  Uiy  — \ 4 Uy  0. 


Remark.  Our  current  discussion  and  the  example  that  follows  illustrate  what  we  may 
call  the  reuseability  of  mathematical  ideas  and  methods.  Recall  that  we  applied  the 
Gauss-Seidel  method  to  a system  of  ODEs  in  Sec.  20.3  and  that  we  can  now  apply  it 
again  to  elliptic  PDEs.  This  shows  that  engineering  mathematics  has  a structure  and 
important  mathematical  ideas  and  methods  will  appear  again  and  again  in  different 
situations.  The  student  should  find  this  attractive  in  that  previous  knowledge  can  be 
reapplied. 


Laplace  Equation.  Liebmann’s  Method 

The  four  sides  of  a square  plate  of  side  12  cm,  made  of  homogeneous  material,  are  kept  at  constant  temperature 
0°C  and  100°C  as  shown  in  Fig.  455a.  Using  a (very  wide)  grid  of  mesh  4 cm  and  applying  Liebmann’s  method 
(that  is,  Gauss-Seidel  iteration),  find  the  (steady-state)  temperature  at  the  mesh  points. 

Solution.  In  the  case  of  independence  of  time,  the  heat  equation  (see  Sec.  10.8) 

ut  ~ c i.uXX  4"  Uy-y) 

reduces  to  the  Laplace  equation.  Hence  our  problem  is  a Dirichlet  problem  for  the  latter.  We  choose  the  grid 
shown  in  Fig.  455b  and  consider  the  mesh  points  in  the  order  Pn,  P21,  Pl 2.  ^22-  We  use  (11)  and,  in  each  equation, 
take  to  the  right  all  the  terms  resulting  from  the  given  boundary  values.  Then  we  obtain  the  system 
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(12) 


— 4mh  + W21  + w12  = —200 

—Mu  — 4m2i  + m22  = —200 

Mu  — 4m i2  + m22  = —100 

m2i  + M12  — 4m22  = —100. 


In  practice,  one  would  solve  such  a small  system  by  the  Gauss  elimination,  finding  Mu  = m2i  = 87.5, 
W12  = w22  = 62.5. 

More  exact  values  (exact  to  3S)  of  the  solution  of  the  actual  problem  [as  opposed  to  its  model  (12)]  are  88.1 
and  61.9,  respectively.  (These  were  obtained  by  using  Fourier  series.)  Hence  the  error  is  about  1%,  which  is 
surprisingly  accurate  for  a grid  of  such  a large  mesh  size  h.  If  the  system  of  equations  were  large,  one  would 
solve  it  by  an  indirect  method,  such  as  Liebmann’s  method.  For  (12)  this  is  as  follows.  We  write  (12)  in  the 
form  (divide  by  —4  and  take  terms  to  the  right) 


— 0.25m21 

+ 0.25ui2 

+ 50 

M21  — 0.25mh 

+ 0.25m22  + 50 

U\  2 — 0.25mu 

+ 0.25m22  + 25 

u22  = 0.25m2i 

+ 0.25«i2 

+ 25. 

These  equations  are  now  used  for  the  Gauss-Seidel  iteration.  They  are  identical  with  (2)  in  Sec.  20.3,  where 
Mn  = x 1 , m2i  = x2,  mi2  = *3,  m22  = X4,  and  the  iteration  is  explained  there,  with  100,  100,  100,  100  chosen  as 
starting  values.  Some  work  can  be  saved  by  better  starting  values,  usually  by  taking  the  average  of  the  boundary 
values  that  enter  into  the  linear  system.  The  exact  solution  of  the  system  is  mu  = m2i  = 87.5,  mi2  = m22  = 62.5, 
as  you  may  verify. 


(a)  Given  problem  (b)  Grid  and  mesh  points 

Fig.  455.  Example  1 

Remark.  It  is  interesting  to  note  that,  if  we  choose  mesh  h = L/ n ( L = side  of  R)  and  consider  the  (n  — 1 )2 
internal  mesh  points  (i.e.,  mesh  points  not  on  the  boundary)  row  by  row  in  the  order 


^11>  ^21>  * ■ ■ » Pn-  1,1>  ^L2»  ^22>  ' " » Pfi- 2,2>  * “ » 
then  the  system  of  equations  has  the  ( n — l)2  X (n  — l)2  coefficient  matrix 


'B  I 

-4  1 

I B I 

1 -4  1 

Here  B = 

I B I 

1 -4  1 

I B. 

1 -4. 

(13)  A - 
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is  an  (//  — 1)  X in  — 1)  matrix.  (In  (12)  we  have  n = 3 ,(n  — l)2  = 4 internal  mesh  points,  two  submatrices 
B,  and  two  submatrices  I.)  The  matrix  A is  nonsingular.  This  follows  by  noting  that  the  off-diagonal  entries  in 
each  row  of  A have  the  sum  3 (or  2),  whereas  each  diagonal  entry  of  A equals  —4,  so  that  nonsingularity  is 
implied  by  Gerschgorin's  theorem  in  Sec.  20.7  because  no  Gerschgorin  disk  can  include  0. 


A matrix  is  called  a band  matrix  if  it  has  all  its  nonzero  entries  on  the  main  diagonal 
and  on  sloping  lines  parallel  to  it  (separated  by  sloping  lines  of  zeros  or  not).  For  example, 
A in  (13)  is  a band  matrix.  Although  the  Gauss  elimination  does  not  preserve  zeros  between 
bands,  it  does  not  introduce  nonzero  entries  outside  the  limits  defined  by  the  original 
bands.  Hence  a band  structure  is  advantageous.  In  (13)  it  has  been  achieved  by  carefully 
ordering  the  mesh  points. 


ADI  Method 

A matrix  is  called  a tridiagonal  matrix  if  it  has  all  its  nonzero  entries  on  the  main 
diagonal  and  on  the  two  sloping  parallels  immediately  above  or  below  the  diagonal.  (See 
also  Sec.  20.9.)  In  this  case  the  Gauss  elimination  is  particularly  simple. 

This  raises  the  question  of  whether,  in  the  solution  of  the  Dirichlet  problem  for  the 
Laplace  or  Poisson  equations,  one  could  obtain  a system  of  equations  whose  coefficient 
matrix  is  tridiagonal.  The  answer  is  yes,  and  a popular  method  of  that  kind,  called  the 
ADI  method  ( alternating  direction  implicit  method)  was  developed  by  Peaceman  and 
Rachford.  The  idea  is  as  follows.  The  stencil  in  (9)  shows  that  we  could  obtain  a tridiagonal 
matrix  if  there  were  only  the  three  points  in  a row  (or  only  the  three  points  in  a column). 
This  suggests  that  we  write  (11)  in  the  form 

(14a)  j 4 Ujj  T ■ l 

so  that  the  left  side  belongs  to  y-Row  j only  and  the  right  side  to  x-Column  i.  Of  course, 
we  can  also  write  (11)  in  the  form 

(14b)  u^j—\  4 uij  ~f  Uij+i  ^ /■  l , / 

so  that  the  left  side  belongs  to  Column  i and  the  right  side  to  Row  j.  In  the  ADI  method 
we  proceed  by  iteration.  At  every  mesh  point  we  choose  an  arbitrary  starting  value 
In  each  step  we  compute  new  values  at  all  mesh  points.  In  one  step  we  use  an  iteration 
formula  resulting  from  (14a)  and  in  the  next  step  an  iteration  formula  resulting  from  (14b), 
and  so  on  in  alternating  order. 

In  detail:  suppose  approximations  i4jn)  have  been  computed.  Then,  to  obtain  the  next 
approximations  ^4”^+1,,  we  substitute  the  u™’’  on  the  right  side  of  (14a)  and  solve  for  the 
u\j  on  the  left  side;  that  is,  we  use 


(15a) 


(m+1) 
ui— l,j 


4 U 


(m+1) 

ij 


, .(m+1)  _ 
' ui+l,j 


(m) 
H,j—  1 


..(m) 
ui,j+  !• 


We  use  (15a)  for  a fixed  j,  that  is,  for  a fixed  row  j,  and  for  all  internal  mesh  points  in 
this  row.  This  gives  a linear  system  of  N algebraic  equations  ( N = number  of  internal 
mesh  points  per  row)  in  N unknowns,  the  new  approximations  of  u at  these  mesh  points. 
Note  that  (15a)  involves  not  only  approximations  computed  in  the  previous  step  but  also 
given  boundary  values.  We  solve  the  system  (15a)  ( j fixed!)  by  Gauss  elimination.  Then 
we  go  to  the  next  row,  obtain  another  system  of  N equations  and  solve  it  by  Gauss,  and 
so  on,  until  all  rows  are  done.  In  the  next  step  we  alternate  direction,  that  is,  we  compute 
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EXAMPLE  2 


the  next  approximations  m|™+2)  column  by  column  from  the  t4™+1)  and  the  given  boundary 
values,  using  a formula  obtained  from  (14b)  by  substituting  the  Ujjl+ 1 ’ on  the  right. 

/icm  (m+ 2)  _ * (m+2)  , (m+2.)  _ (m+1)  _ (m+ 1) 


For  each  fixed  i,  that  is,  for  each  column,  this  is  a system  of  M equations  (M  = number 
of  internal  mesh  points  per  column)  in  M unknowns,  which  we  solve  by  Gauss  elimination. 
Then  we  go  to  the  next  column,  and  so  on,  until  all  columns  are  done. 

Let  us  consider  an  example  that  merely  serves  to  explain  the  entire  method. 


Dirichlet  Problem.  ADI  Method 

Explain  the  procedure  and  formulas  of  the  ADI  method  in  terms  of  the  problem  in  Example  1,  using  the  same 
grid  and  starting  values  100,  100,  100,  100. 

Solution.  While  working,  we  keep  an  eye  on  Fig.  455b  and  the  given  boundary  values.  We  obtain  first 
approximations  wiV,  U21,  M12,  W22  from  (15a)  with  m = 0.  We  write  boundary  values  contained  in  (15a)  without 
an  upper  index,  for  better  identification  and  to  indicate  that  these  given  values  remain  the  same  during  the 
iteration.  From  (15a)  with  m = 0 we  have  for  j = 1 (first  row)  the  system 

(i  = 1)  uq  1 — 4^11*  + U21  = — «io  — W12? 

(i  = 2)  «ii  — 4 U21  + «3i  = — M20  — M22 . 

The  solution  is  wiY  = U21  = 100.  For  j = 2 (second  row)  we  obtain  from  (15a)  the  system 

(i  = 1)  Uq  2 — 4m  12  + M22  = “Mu  — M13 

(i  = 2)  M12  — 4M22  + M32  = — M21  — M23. 

The  solution  is  u\2  — M22  = 66.667. 

Second  approximations  m^,  m^,  u^z  are  now  obtained  from  (15b)  with  m = 1 by  using  the  first 

approximations  just  computed  and  the  boundary  values.  For  i = 1 (first  column)  we  obtain  from  (15b)  the  system 

0=1)  uio  — 4u(ii  + m^2  — ~Uq  1 — M21 

( j = 2)  “S  — 4«S  + M13  = — M02  — M22- 

The  solution  is  = 91.11,  u^z  = 64.44,  For  i = 2 (second  column)  we  obtain  from  (15b)  the  system 

( j = 1)  M20  — 4m2^  + U22  = _«(n  — M31 

(7  = 2)  M^i  — 4M22)  + M23  = — M12  — M32. 

The  solution  is  u^i  — 91.11,  U22  — 64.44. 

In  this  example,  which  merely  serves  to  explain  the  practical  procedure  in  the  ADI  method,  the  accuracy  of 
the  second  approximations  is  about  the  same  as  that  of  two  Gauss-Seidel  steps  in  Sec.  20.3  (where 
Mii  = *i>  M21  — *2,  ui2  = x3i  u22  = X4),  as  the  following  table  shows. 


Method 

un 

m21 

m12 

m22 

ADI,  2nd  approximations 

91.11 

91.11 

64.44 

64.44 

Gauss-Seidel,  2nd  approximations 

93.75 

90.62 

65.62 

64.06 

Exact  solution  of  (12) 

87.50 

87.50 

62.50 

62.50 
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Improving  Convergence.  Additional  improvement  of  the  convergence  of  the  ADI 
method  results  from  the  following  interesting  idea.  Introducing  a parameter  p,  we  can  also 
write  (11)  in  the  form 


(16) 


(a)  Ui—\j 

(b) 


(2  + p)uy  + ui+1j  = -Uij-x  + (2  - p)Uy  - Uyj+i 

(2  4"  p)u jj  4“  Ui’j+i  Ui—i  j 4"  (2  p)Uy 


This  gives  the  more  general  ADI  iteration  formulas 


(17) 


(a)  4-u’  - (2  + P)uf+l)  + uZt?  = + (2  - P)u$*  ~ 


(b) 


(m+2) 
ui,j— 1 


(2  4-  p)u 


(m+2) 

ij 


U 


(m+2)  _ 

i,j+l 


= + (2 


p)4r+i) 


(m+l) 
^z+l,^  • 


For  p = 2,  this  is  (15).  The  parameter  p may  be  used  for  improving  convergence.  Indeed, 
one  can  show  that  the  ADI  method  converges  for  positive  p,  and  that  the  optimum  value 
for  maximum  rate  of  convergence  is 


(18) 


po  = 2 sin 


77 

K 


where  K is  the  larger  of  M + 1 and  N + 1 (see  above).  Even  better  results  can  be  achieved 
by  letting  p vary  from  step  to  step.  More  details  of  the  ADI  method  and  variants  are 
discussed  in  Ref.  [E25]  listed  in  App.  1. 


PROBLEM  SET  21.4 


1.  Derive  (5b),  (6b),  and  (6c). 

2.  Verify  the  calculations  in  Example  1 of  the  text.  Find 
out  experimentally  how  many  steps  you  need  to  obtain 
the  solution  of  the  linear  system  with  an  accuracy  of  3S. 

3.  Use  of  symmetry.  Conclude  from  the  boundary  values 
in  Example  1 that  w2 1 = u11  and  u2 2 = “12-  Show 
that  this  leads  to  a system  of  two  equations  and  solve  it. 

4.  Finer  grid  of  3 X 3 inner  points.  Solve  Example  1, 
choosing  h = ^ = 3 (instead  of  h = 32  = 4)  and  the 
same  starting  values. 

GAUSS  ELIMINATION,  GAUSS-SEIDEL 
ITERATION 


Fig.  456.  Problems  5-10 


For  the  grid  in  Fig.  456  compute  the  potential  at  the 
four  internal  points  by  Gauss  and  by  5 Gauss-Seidel 
steps  with  starting  values  100,  100,  100,  100  (showing 
the  details  of  your  work)  if  the  boundary  values  on  the 
edges  are: 

5.  w(l,  0)  = 60,  u( 2,  0)  = 300,  u = 100  on  the  other 
three  edges. 

6.  u = 0 on  the  left,  x3  on  the  lower  edge,  27  — 9v2  on 
the  right,  x3  — 21x  on  the  upper  edge. 

7.  (7o  on  the  upper  and  lower  edges,  — Uq  on  the  left  and 
right.  Sketch  the  equipotential  lines. 

8.  u = 220  on  the  upper  and  lower  edges,  1 10  on  the  left 
and  right. 

9.  u = sin  \ttx  on  the  upper  edge,  0 on  the  other  edges, 
10  steps. 

10.  u = x4  on  the  lower  edge,  81  — 54y2  4-  y4  on  the  right, 
x4  — 54x2  4-  81  on  the  upper  edge,  y4  on  the  left. 
Verify  the  exact  solution  x4  — 6x2y2  4-  y4  and 
determine  the  error. 
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11.  Find  the  potential  in  Fig.  457  using  (a)  the  coarse 
grid,  (b)  the  fine  grid  5X3,  and  Gauss  elimination. 
Hint.  In  (b),  use  symmetry;  take  u — 0 as  boundary 
value  at  the  two  points  at  which  the  potential  has  a 
jump. 


u = -1 10  V 

Fig.  457.  Region  and  grids  in  Problem  11 

12.  Influence  of  starting  values.  Do  Prob.  9 by  Gauss- 
Seidel,  starting  from  0.  Compare  and  comment. 

13.  For  the  square  0 S x S 4,  0 S y S 4 let  the  boundary 
temperatures  be  0°C  on  the  horizontal  and  50°C  on  the 
vertical  edges.  Find  the  temperatures  at  the  interior 
points  of  a square  grid  with  h — 1. 

14.  Using  the  answer  to  Prob.  13,  try  to  sketch  some 
isotherms. 


15.  Find  the  isotherms  for  the  square  and  grid  in  Prob.  13 
if  u = sin  5 ttx  on  the  horizontal  and  — sin  | Try  on  the 
vertical  edges.  Try  to  sketch  some  isotherms. 

16.  ADI.  Apply  the  ADI  method  to  the  Dirichlet  problem 
in  Prob.  9,  using  the  grid  in  Fig.  456,  as  before  and 
starting  values  zero. 

17.  What  po  in  (18)  should  we  choose  for  Prob.  16?  Apply 
the  ADI  formulas  (17)  with  that  value  of  p0  to  Prob.  16, 
performing  1 step.  Illustrate  the  improved  convergence 
by  comparing  with  the  corresponding  values  0.077, 
0.308  after  the  first  step  in  Prob.  16.  (Use  the  starting 
values  zero.) 

18.  CAS  PROJECT.  Laplace  Equation,  (a)  Write  a 
program  for  Gauss-Seidel  with  16  equations  in  16 
unknowns,  composing  the  matrix  (13)  from  the  indicated 
4X4  submatrices  and  including  a transformation  of 
the  vector  of  the  boundary  values  into  the  vector  b of 
Ax  = b. 

(b)  Apply  the  program  to  the  square  grid  in  0 S x & 5, 
0 S y S 5 with  h = 1 and  u = 220  on  the  upper  and 
lower  edges,  u = 1 10  on  the  left  edge  and  u = —10 
on  the  right  edge.  Solve  the  linear  system  also  by  Gauss 
elimination.  What  accuracy  is  reached  in  the  20th 
Gauss-Seidel  step? 


21.5  Neumann  and  Mixed  Problems. 

Irregular  Boundary 

We  continue  our  discussion  of  boundary  value  problems  for  elliptic  PDEs  in  a region  R 
in  the  xy-plane.  The  Dirichlet  problem  was  studied  in  the  last  section.  In  solving  Neumann 
and  mixed  problems  (defined  in  the  last  section)  we  are  confronted  with  a new  situation, 
because  there  are  boundary  points  at  which  the  (outer)  normal  derivative  un  = du/dn  of 
the  solution  is  given,  but  u itself  is  unknown  since  it  is  not  given.  To  handle  such  points 
we  need  a new  idea.  This  idea  is  the  same  for  Neumann  and  mixed  problems.  Hence  we 
may  explain  it  in  connection  with  one  of  these  two  types  of  problems.  We  shall  do  so  and 
consider  a typical  example  as  follows. 


EXAM  P L E Mixed  Boundary  Value  Problem  for  a Poisson  Equation 

Solve  the  mixed  boundary  value  problem  for  the  Poisson  equation 


V2w  = uxx  + Uyy  =f(x,y ) = 12xy 
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shown  in  Fig.  458a. 


(a)  Region  R and  boundary  values  (b)  Grid  ( h = 0.5) 

Fig.  458.  Mixed  boundary  value  problem  in  Example  1 


Solution.  We  use  the  grid  shown  in  Fig.  458b,  where  h = 0.5.  We  recall  that  (7)  in  Sec.  21.4  has  the  right 
side  h2f(x,  y)  = 0.52  • 12xy  = 3xy.  From  the  formulas  u = 3y3  and  un  = 6x  given  on  the  boundary  we  compute 
the  boundary  data 


(1) 


m3i  = 0.375, 


W32  - 3, 


dzil2  _ dzii2 
dn  dy 


6 • 0.5  = 3. 


dU22  _ dli22 
dn  dy 


6*1=6. 


^11  and  P21  are  internal  mesh  points  and  can  be  handled  as  in  the  last  section.  Indeed,  from  (7),  Sec.  21.4,  with 
n = 0.25  and  hj(x , y)  = 3xy  and  from  the  given  boundary  values  we  obtain  two  equations  corresponding  to 
Pu  and  P21,  as  follows  (with  —0  resulting  from  the  left  boundary). 


(2a) 


— 4»n  + «2i  + «i2  = 12(0.5  ■ 0.5)  • j - 0 = 0.75 

un  - 4h2i  + u22  = 12(1  • 0.5)  • j - 0.375  = 1.125. 


The  only  difficulty  with  these  equations  seems  to  be  that  they  involve  the  unknown  values  u 12  and  U22  of  u at 
P12  and  P22  on  the  boundary,  where  the  normal  derivative  un  = du/dn  — du/dy  is  given,  instead  of  m;  but  we 
shall  overcome  this  difficulty  as  follows. 

We  consider  P12  and  ^22-  The  idea  that  will  help  us  here  is  this.  We  imagine  the  region  R to  be  extended 
above  to  the  first  row  of  external  mesh  points  (corresponding  to  y = 1.5),  and  we  assume  that  the  Poisson 
equation  also  holds  in  the  extended  region.  Then  we  can  write  down  two  more  equations  as  before  (Fig.  458b) 


(2b) 


Mil  “ 4^12  + U22  + M13  = 1.5  - 0 = 1.5 

W21  T M12  — 4^22  + M23  = 3 — 3 = 0. 


On  the  right,  1.5  is  12 xyh2  at  (0.5,  1)  and  3 is  12 xyh2  at  (1,  1)  and  0 (at  P02)  and  3 (at  P32)  are  given  boundary 
values.  We  remember  that  we  have  not  yet  used  the  boundary  condition  on  the  upper  part  of  the  boundary  of 
R , and  we  also  notice  that  in  (2b)  we  have  introduced  two  more  unknowns  u 13,  «23-  But  we  can  now  use  that 
condition  and  get  rid  of  W13,  W23  by  applying  the  central  difference  formula  for  du/dy.  From  (1)  we  then  obtain 
(see  Fig.  458b) 


a»i2 

«13  Un 

3 = 

— U 13  U 11, 

hence 

“13  — “11 

dy 

2 h 

du  22 

u23  ~ U21 

6 = 

— 

~ — «23  u21> 

hence 

“23  = “21 

ay 

2h 

Substituting  these  results  into  (2b)  and  simplifying,  we  have 

2wn  4wi2  "F  U22  = 1.5  3 = 1.5 

2m2i  + M12  - 4m22  = 3 - 3 - 6 = -6. 
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Together  with  (2a)  this  yields,  written  in  matrix  form, 


~—4  1 1 ()’ 

»n 

0.75 

0.75  ’ 

1-401 

U21 

1.125 

1.125 

2 0-41 

11 12 

1.5  - 3 

-1.5 

0 2 1-4 

u22 

0-6 

-6 

(The  entries  2 come  from  W13  and  U23,  and  so  do  —3  and  —6  on  the  right).  The  solution  of  (3)  (obtained  by 
Gauss  elimination)  is  as  follows;  the  exact  values  of  the  problem  are  given  in  parentheses. 

ui2  = 0.866  (exact  1)  U22  — 1.812  (exact  2) 

u 11  = 0.077  (exact  0.125)  U21  — 0.191  (exact  0.25). 

Irregular  Boundary 

We  continue  our  discussion  of  boundary  value  problems  for  elliptic  PDEs  in  a region  R 
in  the  xy-plane.  If  R has  a simple  geometric  shape,  we  can  usually  arrange  for  certain 
mesh  points  to  lie  on  the  boundary  C of  R,  and  then  we  can  approximate  partial  derivatives 
as  explained  in  the  last  section.  However,  if  C intersects  the  grid  at  points  that  are  not 
mesh  points,  then  at  points  close  to  the  boundary  we  must  proceed  differently,  as  follows. 

The  mesh  point  O in  Fig.  459  is  of  that  kind.  For  O and  its  neighbors  A and  P we  obtain 
from  Taylor’s  theorem 


(4) 


duo  1 

(a)  uA  = uQ  + ah  — h - (ah) 

dx  1 


.,2 

2 ° uO 

dx1 2 


+ 


du( 


<0  1 2 ^Uo 

(b)  up  = u0  - h — — + - h -g-  + 
dx  2 dx 


We  disregard  the  terms  marked  by  dots  and  eliminate  duo/dx.  Equation  (4b)  times  a plus 
equation  (4a)  gives 


uA  + aup  ~ (1  + a)  uq  + 


1 , , n;2  d2u0 

— a (a  + 1 ) n «- 

2 dx 


Fig.  459.  Curved  boundary  C of  a region  R,  a mesh  point  O near  C, 
and  neighbors  A,  B,  P,  Q 

We  solve  this  last  equation  algebraically  for  the  derivative,  obtaining 


-.2 

d Uo 
dx2 


n , , «A  + y T Up 

a + a)  1 + a 


u0 
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EXAMPLE  2 


Similarly,  by  considering  the  points  O,  B,  and  Q, 


92uq  2 

dy2  ~1? 


1 

b(  1 + b)  U B 


+ 


1 

1 + b 


uq  ~ 


1 

-bu0 


By  addition, 

„ 2 r UB  UP  Uq  ( a + b)uQ 

h2  [a(l  + a ) b(  1 + b)  1 + a 1 + b ab 

For  example,  if  a = \,b  = \,  instead  of  the  stencil  (see  Sec.  21.4) 


1 1 

* 

<1-4  1 

> we  now  have  < 

1 -4  | 

■ J 

2 

l 3 

because  l/[a(l  + a)]  = §,  etc.  The  sum  of  all  five  terms  still  being  zero  (which  is  useful 
for  checking). 

Using  the  same  ideas,  you  may  show  that  in  the  case  of  Fig.  460. 


(6)  V u0  ~ 2 
n 


uA 


a formula  that  takes  care  of  all  conceivable  cases. 


Fig.  460  Neighboring  points  A,  B,  P,  Q of  a 
mesh  point  O and  notations  in  formula  (6) 


Dirichlet  Problem  for  the  Laplace  Equation.  Curved  Boundary 

Find  the  potential  u in  the  region  in  Fig.  461  that  has  the  boundary  values  given  in  that  figure;  here  the  curved 
portion  of  the  boundary  is  an  arc  of  the  circle  of  radius  10  about  (0,0).  Use  the  grid  in  the  figure. 

Solution,  u is  a solution  of  the  Laplace  equation.  From  the  given  formulas  for  the  boundary  values  u = x 3, 
u — 512  24y2,  • • • we  compute  the  values  at  the  points  where  we  need  them;  the  result  is  shown  in  the  figure. 

For  Pn  and  P12  we  have  the  usual  regular  stencil,  and  for  P21  and  P22  we  use  (6),  obtaining 
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Fig.  46'  Region,  boundary  values  of  the  potential,  and  grid  in  Example  2 

We  use  this  and  the  boundary  values  and  take  the  mesh  points  in  the  usual  order  Pll.  ^21.  Pl2-  ^22-  Then  we 
obtain  the  system 


-4«u 

+ W2i 

+ 

“12 

= 

0 - 

27 

= 

-27 

0.6i(n 

— 2.5m2i 

+ 

0.5m22  = 

-0.9 

■ 296 

- 0.5  ■ 

■ 216  = 

-374.4 

“n 

- 

4“  12 

+ 

u22  = 

702 

+ 0 

= 

702 

0.6m2i 

+ 

0.6  »12 

- 

3m22  = 

0.9 

■ 352 

+ 0.9  ■ 

■ 936  = 

1159.2 

In  matrix  form, 


~— 4 

1 

1 

0 

“n 

-27  ’ 

0.6 

-2.5 

0 

0.5 

“21 

-374.4 

1 

0 

-4 

1 

“12 

702 

0 

0.6 

0.6 

-3 

“22 

1159.2 

Gauss  elimination  yields  the  (rounded)  values 

1/11  = —55.6,  m2i  = 49.2,  Mi2  = —298.5,  m22  = —436.3. 

Clearly,  from  a grid  with  so  few  mesh  points  we  cannot  expect  great  accuracy.  The  exact  solution  of  the  PDE 
(not  of  the  difference  equation)  having  the  given  boundary  values  is  u = x3  — 3 xy2  and  yields  the  values 

Mu  — 54,  m2i  — 54,  M12  — 297,  m22  — 432. 

In  practice  one  would  use  a much  finer  grid  and  solve  the  resulting  large  system  by  an  indirect  method. 


PR  OB  L EMSET2 15 


1-7 


MIXED  BOUNDARY  VALUE  PROBLEMS 


1.  Check  the  values  for  the  Poisson  equation  at  the  end 
of  Example  1 by  solving  (3)  by  Gauss  elimination. 


2.  Solve  the  mixed  boundary  value  problem  for  the 

Poisson  equation  V2t(  = 2 ( x 2 + y2)  in  the  region  and 
for  the  boundary  conditions  shown  in  Fig.  462,  using 
the  indicated  grid. 


Fig.  462.  Problems  2 and  6 
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3.  CAS  EXPERIMENT.  Mixed  Problem.  Do  Example 
1 in  the  text  with  finer  and  finer  grids  of  your  choice 
and  study  the  accuracy  of  the  approximate  values  by 
comparing  with  the  exact  solution  u = 2 xy3.  Verify  the 
latter. 

4.  Solve  the  mixed  boundary  value  problem  for  the 
Laplace  equation  V2«  = 0 in  the  rectangle  in  Fig.  458a 
(using  the  grid  in  Fig.  458b)  and  the  boundary 
conditions  ux  = 0 on  the  left  edge,  ux  = 3 on  the  right 
edge,  u — xz  on  the  lower  edge,  and  u = x2  — 1 on 
the  upper  edge. 

5.  Do  Example  1 in  the  text  for  the  Laplace  equation 
(instead  of  the  Poisson  equation)  with  grid  and 
boundary  data  as  before. 

6.  Solve  V2i/  = —7 T2y  sing7rx  for  the  grid  in  Fig.  462 
and  uy(  1,  3)  = uy( 2,  3)  = | V243,  u — 0 on  the  other 
three  sides  of  the  square. 

7.  Solve  Prob.  4 when  un  = 110  on  the  upper  edge  and 
u = 110  on  the  other  edges. 


8-16 


IRREGULAR  BOUNDARY 


8.  Verify  the  stencil  shown  after  (5). 

9.  Derive  (5)  in  the  general  case. 

10.  Derive  the  general  formula  (6)  in  detail. 

11.  Derive  the  linear  system  in  Example  2 of  the  text. 

12.  Verify  the  solution  in  Example  2. 

13.  Solve  the  Laplace  equation  in  the  region  and  for  the 
boundary  values  shown  in  Fig.  463,  using  the 
indicated  grid.  (The  sloping  portion  of  the  boundary 
is  y = 4.5  — x.) 


Fig.  463.  Problem  13 


14.  If,  in  Prob.  13,  the  axes  are  grounded  ( u = 0),  what 
constant  potential  must  the  other  portion  of  the 
boundary  have  in  order  to  produce  220  V at  P\  i ? 

15.  What  potential  do  we  have  in  Prob.  13  if  u = 100  V 
on  the  axes  and  u = 0 on  the  other  portion  of  the 
boundary? 

16.  Solve  the  Poisson  equation  V2;;  = 2 in  the  region  and 
for  the  boundary  values  shown  in  Fig.  464,  using  the 
grid  also  shown  in  the  figure. 


Fig.  464  Problem  16 


21.6  Methods  for  Parabolic  PDEs 


The  last  two  sections  concerned  elliptic  PDEs,  and  we  now  turn  to  parabolic  PDEs.  Recall 
that  the  definitions  of  elliptic,  parabolic,  and  hyperbolic  PDEs  were  given  in  Sec.  21.4. 
There  it  was  also  mentioned  that  the  general  behavior  of  solutions  differs  from  type  to 
type,  and  so  do  the  problems  of  practical  interest.  This  reflects  on  numerics  as  follows. 

For  all  three  types,  one  replaces  the  PDE  by  a corresponding  difference  equation,  but 
for  parabolic  and  hyperbolic  PDEs  this  does  not  automatically  guarantee  the  convergence 
of  the  approximate  solution  to  the  exact  solution  as  the  mesh  h —*  0;  in  fact,  it  does  not 
even  guarantee  convergence  at  all.  For  these  two  types  of  PDEs  one  needs  additional 
conditions  (inequalities)  to  assure  convergence  and  stability,  the  latter  meaning  that  small 
perturbations  in  the  initial  data  (or  small  errors  at  any  time)  cause  only  small  changes  at 
later  times. 

In  this  section  we  explain  the  numeric  solution  of  the  prototype  of  parabolic  PDEs,  the 
one-dimensional  heat  equation 


_ 2 
t/t  ^ C l/C 


(c  constant). 
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This  PDE  is  usually  considered  for  x in  some  fixed  interval,  say,  0 =§  x L,  and  time 
t 0,  and  one  prescribes  the  initial  temperature  u(x,  0)  = fix)  (/  given)  and  boundary 
conditions  at  x = 0 and  x = L for  all  0,  for  instance,  u( 0,  t)  = 0,  w(L,  t)  = 0.  We  may 
assume  c = 1 and  L = 1 ; this  can  always  be  accomplished  by  a linear  transformation  of 
x and  t (Prob.  1).  Then  the  heat  equation  and  those  conditions  are 

(1)  ut  = uxx  Ogxg  1,  r § 0 

(2)  u(x,  0)  = /(x)  (Initial  condition) 

(3)  m(0,  t)  = m(1,  t)  = 0 (Boundary  conditions). 

A simple  finite  difference  approximation  of  (1)  is  [see  (6a)  in  Sec.  21.4;  j is  the  number 
of  the  time  step] 


(4) 


(ui,j 


3+1 


uij ) j2 


( U 


*+1,3 


2 tin  + 


1,3')’ 


Figure  465  shows  a corresponding  grid  and  mesh  points.  The  mesh  size  is  h in  the  x-direction 
and  k in  the  r-direction.  Formula  (4)  involves  the  four  points  shown  in  Fig.  466.  On  the  left 
in  (4)  we  have  used  a forward  difference  quotient  since  we  have  no  information  for  negative 
t at  the  start.  From  (4)  we  calculate  «j  J+ 1,  which  corresponds  to  time  row  j + 1,  in  terms 
of  the  three  other  u that  correspond  to  time  row  j.  Solving  (4)  for  iq  ,-+i,  we  have 


(5) 


^*,3+1  (1  2r) iij.j  + ] j T Ui—\tj), 


r = 


k_ 

h2' 


Computations  by  this  explicit  method  based  on  (5)  are  simple.  However,  it  can  be  shown 
that  crucial  to  the  convergence  of  this  method  is  the  condition 


(6) 


r = 


k 

h2 


1 

2 ' 


Fig.  465.  Grid  and  mesh  points  corresponding  to  (4),  (5) 

UJ+  1) 

X 

I* 

(i  — 1,3)  X — — X (i  + 1,3) 

h r \ h 

6,3) 

Fig.  466.  The  four  points  in  (4)  and  (5) 
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That  is,  Uy  should  have  a positive  coefficient  in  (5)  or  (for  r = |)  be  absent  from  (5). 
Intuitively,  (6)  means  that  we  should  not  move  too  fast  in  the  /-direction.  An  example  is 
given  below. 


Crank-Nicolson  Method 

Condition  (6)  is  a handicap  in  practice.  Indeed,  to  attain  sufficient  accuracy,  we  have 
to  choose  h small,  which  makes  k very  small  by  (6).  For  example,  if  h = 0.1,  then 
k 7=  0.005.  Accordingly,  we  should  look  for  a more  satisfactory  discretization  of  the 
heat  equation. 

A method  that  imposes  no  restriction  on  r = k/h2  is  the  Crank-Nicolson  (CN) 
method,5  which  uses  values  of  u at  the  six  points  in  Fig.  467.  The  idea  of  the  method 
is  the  replacement  of  the  difference  quotient  on  the  right  side  of  (4)  by  \ times  the 
sum  of  two  such  difference  quotients  at  two  time  rows  (see  Fig.  467).  Instead  of  (4) 
we  then  have 


(7) 


(ui  j+ 1 Uij ) o (wi-t-l,j  2ujj  + 

k An 


T n (Mi+  i,j  + i T Ui—\ 7 + 1). 

2 h2 


Multiplying  by  2k  and  writing  r = k/h2  as  before,  we  collect  the  terms  corresponding  to 
time  row  j + 1 on  the  left  and  the  terms  corresponding  to  time  row  j on  the  right: 


(8)  (2  T 2 i)R(,7+i  -f-  W7— 1,7+1  (2  2r)uij  T p(tq+i,7  T tU— 1,7)- 


How  do  we  use  (8)?  In  general,  the  three  values  on  the  left  are  unknown,  whereas  the 
three  values  on  the  right  are  known.  If  we  divide  the  x-interval  1 in  ( 1 ) into  n 

equal  intervals,  we  have  n — 1 internal  mesh  points  per  time  row  (see  Fig.  465,  where 
n = 4).  Then  for  j = 0 and  i = 1,  • • • , n — 1,  formula  (8)  gives  a linear  system  of  n — 1 
equations  for  the  n — 1 unknown  values  #n,  1*21,  ■ • • , un- 14  in  the  first  time  row  in  terms 
of  the  initial  values  uqq,  um,  • ■ • , uno  and  the  boundary  values  //01(=  0),  un\  (=  0). 
Similarly  for  j = 1,7  = 2,  and  so  on;  that  is,  for  each  time  row  we  have  to  solve  such  a 
linear  system  of  n — 1 equations  resulting  from  (8). 

Although  r = k/h2  is  no  longer  restricted,  smaller  r will  still  give  better  results.  In 
practice,  one  chooses  a k by  which  one  can  save  a considerable  amount  of  work,  without 


5JOHN  CRANK  (1916-2006),  English  mathematician  and  physicist  at  Courtaulds  Fundamental  Research 
Laboratory,  professor  at  Brunei  University,  England.  Student  of  Sir  WILLIAM  LAWRENCE  BRAGG 
(1890-1971),  Australian  British  physicist,  who  with  his  father.  Sir  WILLIAM  HENRY  BRAGG  (1862-1942) 
won  the  Nobel  Prize  in  physics  in  1915  for  their  fundamental  work  in  X-ray  crystallography.  (This  is  the  only 
case  where  a father  and  a son  shared  the  Nobel  Prize  for  the  same  research.  Furthermore,  W.  L.  Bragg  is  the 
youngest  Nobel  laureate  ever.)  PHYLLIS  NICOLSON  (1917-1968),  English  mathematician,  professor  at  the 
University  of  Leeds,  England. 
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EXAMPLE  1 


making  r too  large.  For  instance,  often  a good  choice  is  r = 1 (which  would  be  impossible 
in  the  previous  method).  Then  (8)  becomes  simply 

(9)  . ] 1^2  + l,j  + l + T Mi—  l,j- 


Time  row _/'+ 1 X X X 

k 

Time  row  j X- X X 

h h 


Fig.  467.  The  six  points  in  the  Crank-Nicolson  formulas  (7)  and  (8) 


i = 0 i = 1 i = 2 i = 3 i = 4 i = 5 

Fig.  468.  Grid  in  Example  1 


Temperature  in  a Metal  Bar.  Crank-Nicolson  Method,  Explicit  Method 

Consider  a laterally  insulated  metal  bar  of  length  1 and  such  that  c2  = 1 in  the  heat  equation.  Suppose  that  the 
ends  of  the  bar  are  kept  at  temperature  u = 0°C  and  the  temperature  in  the  bar  at  some  instant — call  it  t = 0 — 
is  f(x)  = sin  7Tx.  Applying  the  Crank-Nicolson  method  with  h = 0.2  and  r = 1,  find  the  temperature  u{x,  t)  in 
the  bar  for  0 ^ t ^ 0.2.  Compare  the  results  with  the  exact  solution.  Also  apply  (5)  with  an  r satisfying  (6), 
say,  r — 0.25,  and  with  values  not  satisfying  (6),  say,  r = 1 and  r = 2.5. 

Solution  by  Crank-Nicolson.  Since  r = 1,  formula  (8)  takes  the  form  (9).  Since  h = 0.2  and 
r — k/h‘ 2 = 1,  we  have  k = h2  = 0.04.  Hence  we  have  to  do  5 steps.  Figure  468  shows  the  grid.  We  shall  need 
the  initial  values 


u io  = sin  0.277  = 0.587785,  W20  = sin  0.477  = 0.951057. 

Also,  «3o  = M20  and  m40  = Mio-  (Recall  that  u^q  means  u at  P\q  in  Fig.  468,  etc.)  In  each  time  row  in  Fig. 
468  there  are  4 internal  mesh  points.  Hence  in  each  time  step  we  would  have  to  solve  4 equations  in  4 
unknowns.  But  since  the  initial  temperature  distribution  is  symmetric  with  respect  to  x = 0.5,  and  u = 0 at 
both  ends  for  all  t,  we  have  u^i  = «2i»  w4i  = Mu  in  the  first  time  row  and  similarly  for  the  other  rows.  This 
reduces  each  system  to  2 equations  in  2 unknowns.  By  (9),  since  u^i  = «2i  and  uoi  = for  j = 0 these 
equations  are 


(i  = 1)  4 Mu  - «21  = »oo  + “20  = 0.951057 

(i  = 2)  —u  n + 4h2i  — «2i  = 11  io  + “20  = 1-538842. 

The  solution  is  u 1 1 = 0.399274,  m2 i = 0.646039.  Similarly,  for  time  row  j = 1 we  have  the  system 

(i  = 1)  4h12  — it  22  = “oi  + “21  = 0.646039 

( i = 2)  — “12  + 3»22  = »n  + “2i  = 1.045313. 
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The  solution 
(Fig.  469): 

is  M i2  = 

0.271221,  u22  = 

0.438844,  and  so 

on.  This  gives 

the  temperature 

distribution 

t 

x = 0 

X 

II 

o 

to 

II 

o 

4^ 

x — 0.6 

X 

II 

O 

bo 

X = 1 

0.00 

0 

0.588 

0.951 

0.951 

0.588 

0 

0.04 

0 

0.399 

0.646 

0.646 

0.399 

0 

0.08 

0 

0.271 

0.439 

0.439 

0.271 

0 

0.12 

0 

0.184 

0.298 

0.298 

0.184 

0 

0.16 

0 

0.125 

0.202 

0.202 

0.125 

0 

0.20 

0 

0.085 

0.138 

0.138 

0.085 

0 

Comparison  with  the  exact  solution.  The  present  problem  can  be  solved  exactly  by  separating 
variables  (Sec.  12.5);  the  result  is 

(10)  u(x,  t)  = sin  7Tx  e~7r  t. 

Solution  by  the  explicit  method  (5)  with  r = 0.25.  For  h = 0.2  and  r = k/h2  = 0.25  we  have 
k = rh2  = 0.25  • 0.04  = 0.01.  Hence  we  have  to  perform  4 times  as  many  steps  as  with  the  Crank-Nicolson 
method!  Formula  (5)  with  r = 0.25  is 

(11)  Uitj+ 1 0.25 T 2 Uy  H-  i,j)‘ 

We  can  again  make  use  of  the  symmetry.  For  j = 0 we  need  wqo  ~ 0,  Mio  — 0.587785  (see  p.  939), 
u20  = w30  = 0.951057  and  compute 


Mu  = 0.25(mOq  + 2miq  + m2q)  — 0.531657 


m2i  = 0.25(mio  + 2m20  + m30)  = 0.25(miO  + 3m20)  = 0.860239. 


Of  course  we  can  omit  the  boundary  terms  mqi  — 0,  M02  — 0,  • • • from  the  formulas.  For  j = 1 we  compute 


M12  — 0.25(2mh  + m2i)  — 0.480888 


m22  = 0.25(wn  + 3m2i)  = 0.778094 


and  so  on.  We  have  to  perform  20  steps  instead  of  the  5 CN  steps,  but  the  numeric  values  show  that  the  accuracy 
is  only  about  the  same  as  that  of  the  Crank-Nicolson  values  CN.  The  exact  3D-values  follow  from  (10). 
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t 

x = 0.2 

X 

II 

O 

4^ 

CN 

By  (11) 

Exact 

CN 

By  (11) 

Exact 

0.04 

0.399 

0.393 

0.396 

0.646 

0.637 

0.641 

0.08 

0.271 

0.263 

0.267 

0.439 

0.426 

0.432 

0.12 

0.184 

0.176 

0.180 

0.298 

0.285 

0.291 

0.16 

0.125 

0.118 

0.121 

0.202 

0.191 

0.196 

0.20 

0.085 

0.079 

0.082 

0.138 

0.128 

0.132 

Failure  of  (5)  with  r violating  (6).  Formula  (5)  with  h = 0.2  and 

r = 1 — which  violates  (6) — is 

ui,j+l 

Ujj  + Ui+ij 

and  gives  very  poor  values;  some  of  these  are 

t 

x = 0.2 

Exact 

* 

II 

o 

4^ 

Exact 

0.04 

0.363 

0.396 

0.588 

0.641 

0.12 

0.139 

0.180 

0.225 

0.291 

0.20 

0.053 

0.082 

0.086 

0.132 

Formula  (5)  with 

an  even  larger 

r = 2.5  (and  h 

= 0.2  as 

before)  gives  completely  nonsensical  results;  some  oJ 

these  are 

t 

x = 0.2 

Exact 

* 

II 

o 

4^ 

Exact 

0.1 

0.0265 

0.2191 

0.0429 

0.3545 

0.3 

0.0001 

0.0304 

0.0001 

0.0492. 

■ 

gRQBEEiW^SET?^ 


1.  Nondimensional  form.  Show  that  the  heat  equation 
uj  = c1 2 3 4uxx,  0 £ x £ L,  can  be  transformed  to  the 
“nondimensional”  standard  form  ut  = uxx,  0 £ x £ 1, 
by  setting  x = x/L,  t = c27IL2,  u = S7u0,  where  u0  is 
any  constant  temperature. 

2.  Difference  equation.  Derive  the  difference  approxi- 
mation (4)  of  the  heat  equation. 

3.  Explicit  method.  Derive  (5)  by  solving  (4)  for  Uij+1. 

4.  CAS  EXPERIMENT.  Comparison  of  Methods. 

(a)  Write  programs  for  the  explicit  and  the  Crank — 
Nicolson  methods. 

(b)  Apply  the  programs  to  the  heat  problem  of  a 
laterally  insulated  bar  of  length  1 with  u( x,  0)  = sin  ttx 
and  u( 0,  t)  = «(1,  t)  = 0 for  all  t,  using  h = 0.2, 
k = 0.01  for  the  explicit  method  (20  steps),  h = 0.2 
and  (9)  for  the  Crank-Nicolson  method  (5  steps). 
Obtain  exact  6D-values  from  a suitable  series  and 
compare. 

(c)  Graph  temperature  curves  in  (b)  in  two  figures 
similar  to  Fig.  299  in  Sec.  12.7. 


(d)  Experiment  with  smaller  h (0.1,  0.05,  etc.)  for  both 
methods  to  find  out  to  what  extent  accuracy  increases 
under  systematic  changes  of  h and  k. 

EXPLICIT  METHOD 

5.  Using  (5)  with  h = 1 and  k = 0.5,  solve  the  heat 
problem  (1) — (3)  to  find  the  temperature  at  t = 2 in  a 
laterally  insulated  bar  of  length  10  ft  and  initial 
temperature /(x)  = x(l  — O.lx). 

6.  Solve  the  heat  problem  (1)— (3)  by  the  explicit  method 
with/;  = 0.2  and/:  = 0.01,  8 time  steps,  when/(x)  = x 
if  0 £ x < 2 , /(x)  =1—  x if  § £ x £ 1.  Compare 
with  the  3S-values  0.108,  0.175  for  t = 0.08, 
x = 0.2,  0.4  obtained  from  the  series  (2  terms)  in 
Sec.  12.5. 

7.  The  accuracy  of  the  explicit  method  depends  on 
r (S  |).  Illustrate  this  for  Prob.  6,  choosing  r — g (and 
h = 0.2  as  before).  Do  4 steps.  Compare  the  values  for 
t — 0.04  and  0.08  with  the  3S-values  in  Prob.  6,  which 
are  0.156,  0.254  ( t = 0.04),  0.105,  0.170  (t  = 0.08). 
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8.  In  a laterally  insulated  bar  of  length  1 let  the  initial 
temperature  be  f(x)  = x if  0 S x < 0.5, /(x)  = 1 — x 
if  0.5  £ x £ 1.  Let  (1)  and  (3)  hold.  Apply  the  explicit 
method  with/?  = 0.2  ,k  = 0.01, 5 steps.  Can  you  expect 
the  solution  to  satisfy  u(x,  t)  = m(1  — x,  t)  for  all  ft 

9.  Solve  Prob.  8 with  fix)  = x if  0 £ x £ 0.2, 
f(x)  = 0.25(1  — x)  if  0.2  < x £ 1,  the  other  data 
being  as  before. 

10.  Insulated  end.  If  the  left  end  of  a laterally  insulated 
bar  extending  from  x = 0 to  x = 1 is  insulated,  the 
boundary  condition  at  x = 0 is  un( 0,  t)  = ux( 0,  t)  — 0. 
Show  that,  in  the  application  of  the  explicit  method 
given  by  (5),  we  can  compute  t?oj  + i by  the  formula 

“oj  + i = (1  “ 2 r)uoj  + 2 ray. 

Apply  this  with  h = 0.2  and  r = 0.25  to  determine  the 
temperature  u(x,  i)  in  a laterally  insulated  bar  extending 
from  x = 0 to  1 if  u(x,  0)  = 0,  the  left  end  is  insulated 
and  the  right  end  is  kept  at  temperature  g(t)  = sin  777. 
Hint.  Use  0 = duoj/dx  — («y  — u_if)/2h. 


CRANK-NICOLSON  METHOD 

11.  Solve  Prob.  9 by  (9)  with  h = 0.2,  2 steps.  Compare 
with  exact  values  obtained  from  the  series  in  Sec.  12.5 
(2  terms)  with  suitable  coefficients. 

12.  Solve  the  heat  problem  ( 1) — (3)  by  Crank-Nicolson 
for  OStS  0.20  with  h — 0.2  and  k = 0.04  when 
fix)  = x if  0 £ x < g,/(x)  = 1 — x if  g £ x £ 1. 
Compare  with  the  exact  values  for  t = 0.20  obtained 
from  the  series  (2  terms)  in  Sec.  12.5. 


13-15 

Solve  (l)-(3)  by  Crank-Nicolson  with  r = 1 (5  steps), 
where: 

13.  fix)  = 5x  if  0 S x < 0.25,  fix)  = 1.25(1  - x)  if 
0.25  SxSl,/i  = 0.2 

14.  fix)  = x(l  — x),  h = 0.1.  (Compare  with  Prob.  15.) 

15.  fix)  = x(l  - x),  h = 0.2 


21.  Method  for  Hyperbolic  PDEs 


In  this  section  we  consider  the  numeric  solution  of  problems  involving  hyperbolic  PDEs. 
We  explain  a standard  method  in  terms  of  a typical  setting  for  the  prototype  of  a hyperbolic 


PDE,  the  wave  equation: 

(1) 

Utt  UXX 

O 

All 

VII 

k 

VII 

o 

(2) 

u(x , 0)  = f(x) 

(Given  initial  displacement) 

(3) 

ut(x,  0)  = g(x) 

(Given  initial  velocity) 

(4) 

u{ 0,  t)  = k(1,  f)  = 0 

(Boundary  conditions). 

Note  that  an  equation  utt  = c2uxx  and  another  x-interval  can  be  reduced  to  the  form  ( 1 ) 
by  a linear  transformation  of  x and  t.  This  is  similar  to  Sec.  21.6,  Prob.  1. 

For  instance,  ( 1 )— (4 ) is  the  model  of  a vibrating  elastic  string  with  fixed  ends  at  x = 0 

and  x = 1 (see  Sec.  12.2).  Although  an  analytic  solution  of  the  problem  is  given  in  (13), 
Sec.  12.4,  we  use  the  problem  for  explaining  basic  ideas  of  the  numeric  approach  that  are 
also  relevant  for  more  complicated  hyperbolic  PDEs. 

Replacing  the  derivatives  by  difference  quotients  as  before,  we  obtain  from  ( 1)  [see  (6) 
in  Sec.  21.4  with  y = t\ 


(5) 


2 1 + llij—  l)  2 (^i+l,j  2ujj  + Ui  — ij) 

k h 


where  h is  the  mesh  size  in  x,  and  k is  the  mesh  size  in  t.  This  difference  equation  relates 
5 points  as  shown  in  Fig.  470a.  It  suggests  a rectangular  grid  similar  to  the  grids  for 
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EXAMPLE  1 


parabolic  equations  in  the  preceding  section.  We  choose  r*  = k2/h2  = 1.  Then  m™  drops 
out  and  we  have 

(6)  Uij+i  = Mi-i, j + Ui+i,j  ~ “lj-i  (Fig.  470b). 


It  can  be  shown  that  for  0 < r*  g 1 the  present  explicit  method  is  stable,  so  that  from 
(6)  we  may  expect  reasonable  results  for  initial  data  that  have  no  discontinuities.  (For  a 
hyperbolic  PDE  the  latter  would  propagate  into  the  solution  domain — a phenomenon  that 
would  be  difficult  to  deal  with  on  our  present  grid.  For  unconditionally  stable  implicit 
methods  see  [El]  in  App.  1.) 


x 

\k 


x 

(a)  Formula  (5) 


Fig.  470. 


Time  row  j + 1 
Time  row  j 
Time  row  j - 1 


X X 


X 

(b)  Formula  (6) 

Mesh  points  used  in  (5)  and  (6) 


Equation  (6)  still  involves  3 time  steps  j — 1 ,j,j  + 1,  whereas  the  formulas  in  the 
parabolic  case  involved  only  2 time  steps.  Furthermore,  we  now  have  2 initial  conditions. 
So  we  ask  how  we  get  started  and  how  we  can  use  the  initial  condition  (3).  This  can  be 
done  as  follows. 

From  ut(x,  0)  = g(x)  we  derive  the  difference  formula 


(7)  — («ii  - m r)  = gi,  hence  ut  i = ulX  - 2% 

2k 

where  gt  = g(ih).  For  t = 0,  that  is,  j = 0,  equation  (6)  is 

Mil  Mj_i  o T Mi  + j o M^-i- 

Into  this  we  substitute  Mi  _i  as  given  in  (7).  We  obtain  mu  = «i-i,o  + ui+  i,o  — uii  + 2 kgi 
and  by  simplification 

(8)  Mil  = 2(«i—  i,o  + Mi+i>0)  + kgi, 

This  expresses  in  terms  of  the  initial  data.  It  is  for  the  beginning  only.  Then  use  (6). 

Vibrating  String,  Wave  Equation 

Apply  the  present  method  with  h = k = 0.2  to  the  problem  (l)-(4),  where 

/( x)  = sin  7 tx,  g(x)  = 0. 

Solution.  The  grid  is  the  same  as  in  Fig.  468,  Sec.  21 .6,  except  for  the  values  of  t,  which  now  are  0.2,  0.4,  • • • 
(instead  of  0.04,  0.08,  • • • )•  The  initial  values  uqq,  u iq,  • * • are  the  same  as  in  Example  1,  Sec.  21.6.  From  (8) 
and  g(x)  = 0 we  have 

un  — h(ui- i,o  + ui+ i,o)- 
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From  this  we  compute,  using  »10  = u40  = sin  0.277  = 0.587785,  «20  — u:’,a  = 0.951057, 

(t  = 1)  »n  = g (mqo  + »2o)  = h ' 0.951057  = 0.475528 
(i  = 2)  «21  = |(«10  + h30)  = | • 1.538842  = 0.769421 

and  u3i  = m2i,  m4i  = u n by  symmetry  as  in  Sec.  21.6,  Example  1.  From  (6)  with  7 = 1 we  now  compute, 
using  w0i  = t'02  = ■ ■ ■ = 0, 


(;  = 1)  u 12  = «oi  + »2i  - »io  = 0.769421  - 0.587785  = 0.181636 

(i  = 2)  w22  = «n  + u31  - »20  = 0.475528  + 0.769421  - 0.951057  = 0.293892, 

and  11 32  = 11 22- 11 42  = “12  by  symmetry;  and  so  on.  We  thus  obtain  the  following  values  of  the  displacement 
u(x,  t)  of  the  string  over  the  first  half-cycle: 


7 

x = 0 

jc  = 0.2 

* 

II 

o 

x = 0.6 

x = 0.8 

X = 1 

0.0 

0 

0.588 

0.951 

0.951 

0.588 

0 

0.2 

0 

0.476 

0.769 

0.769 

0.476 

0 

0.4 

0 

0.182 

0.294 

0.294 

0.182 

0 

0.6 

0 

-0.182 

-0.294 

-0.294 

-0.182 

0 

0.8 

0 

-0.476 

-0.769 

-0.769 

-0.476 

0 

1.0 

0 

-0.588 

-0.951 

-0.951 

-0.588 

0 

These  values  are  exact  to  3D  (3  decimals),  the  exact  solution  of  the  problem  being  (see  Sec.  12.3) 

u(x,  7)  = sin  ttx  cos  177. 

The  reason  for  the  exactness  follows  from  d’Alembert’s  solution  (4),  Sec.  12.4.  (See  Prob.  4,  below.) 


This  is  the  end  of  Chap.  21  on  numerics  for  ODEs  and  PDEs,  a field  that  continues  to 
develop  rapidly  in  both  applications  and  theoretical  research.  Much  of  the  activity  in  the 
field  is  due  to  the  computer  serving  as  an  invaluable  tool  for  solving  large-scale  and 
complicated  practical  problems  as  well  as  for  testing  and  experimenting  with  innovative 
ideas.  These  ideas  could  be  small  or  major  improvements  on  existing  numeric  algorithms 
or  testing  new  algorithms  as  well  as  other  ideas. 


PROBLEM  SET  217 


VIBRATING  STRING 


1-3  Using  the  present  method,  solve  ( 1) — (4)  with 
h = k = 0.2  for  the  given  initial  deflection /(.r)  and  initial 
velocity  0 on  the  given  7-interval. 


1.  f{x)  = jcifO  = x < i fix)  = |(1  ~x)ifhSx£  1, 
0 S 7 S 1 

2.  fix)  = jv2  — jc3,  0 S 7 S 2 

3.  fix)  = 0.2(x  - jc2),  0 S 7 S 2 


4.  Another  starting  formula.  Show  that  (12)  in  Sec.  12.4 
gives  the  starting  formula 


1 2 (M7+1.0 


L,o)  + 


Xi +k 


gis)  ds 


(where  one  can  evaluate  the  integral  numerically  if 
necessary).  In  what  case  is  this  identical  with  (8)? 

5.  Nonzero  initial  displacement  and  speed.  Illustrate  the 
starting  procedure  when  both/and  g are  not  identically 
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zero,  say,  f(x)  = 1 — cos  27rx,  g(x)  = x(l  — x), 
h = k = 0.1,  2 time  steps. 

6.  Solve  (1) — (3)  (h  = k = 0.2,  5 time  steps)  subject  to 
/(x)  = xz,  g(x)  = 2x,  ux( 0,  r)  = 2 1,  u(  1,  t)  = (1  + f)2. 

7.  Zero  initial  displacement.  If  the  string  governed  by  the 
wave  equation  (1)  starts  from  its  equilibrium  position  with 
initial  velocity  g(x)  = sin  ttx,  what  is  its  displacement 
at  time  t = 0.4  and  x = 0.2,  0.4,  0.6,  0.8?  (Use  the 
present  method  with  h = 0.2,  k = 0.2.  Use  (8).  Compare 
with  the  exact  values  obtained  from  (12)  in  Sec.  12.4.) 


8.  Compute  approximate  values  in  Prob.  7,  using  a finer 
grid  {h  = 0.1,  k — 0.1),  and  notice  the  increase  in 
accuracy. 

9.  Compute  u in  Prob.  5 for  t = 0.1  and  x = 0.1, 

0.2,  ■ ■ ■ , 0.9,  using  the  formula  in  Prob.  8,  and  compare 
the  values. 

10.  Show  that  from  d'Alembert’s  solution  (13)  in  Sec. 12.4 
with  c = 1 it  follows  that  (6)  in  the  present  section 
gives  the  exact  value  Uij+i  = u{ih , (j  + l)/t). 


^BEIEEEEBBEHEFEga^FEEES  T I O N S AND  PROBLEMS 


1.  Explain  the  Euler  and  improved  Euler  methods 
in  geometrical  terms.  Why  did  we  consider  these 
methods? 

2.  How  did  we  obtain  numeric  methods  from  the  Taylor 
series? 

3.  What  are  the  local  and  the  global  orders  of  a method? 
Give  examples. 

4.  Why  did  we  compute  auxiliary  values  in  each  Runge- 
Kutta  step?  How  many? 

5.  What  is  adaptive  integration?  How  does  its  idea  extend 
to  Runge-Kutta? 

6.  What  are  one-step  methods?  Multistep  methods?  The 
underlying  ideas?  Give  examples. 

7.  What  does  it  mean  that  a method  is  not  self-starting? 
How  do  we  overcome  this  problem? 

8.  What  is  a predictor-corrector  method?  Give  an 
important  example. 

9.  What  is  automatic  step  size  control?  When  is  it  needed? 
How  is  it  done  in  practice? 

10.  How  do  we  extend  Runge-Kutta  to  systems  of  ODEs? 

11.  Why  did  we  have  to  treat  the  main  types  of  PDEs  in 
separate  sections?  Make  a list  of  types  of  problems  and 
numeric  methods. 

12.  When  and  how  did  we  use  finite  differences?  Give  as 
many  details  as  you  can  remember  without  looking 
into  the  text. 

13.  How  did  we  approximate  the  Laplace  and  Poisson 
equations? 

14.  How  many  initial  conditions  did  we  prescribe  for  the 
wave  equation?  For  the  heat  equation? 

15.  Can  we  expect  a difference  equation  to  give  the  exact 
solution  of  the  corresponding  PDE? 

16.  In  what  method  for  PDEs  did  we  have  convergence 
problems? 


17.  Solve  y = y,  y(0)  = 1 by  Euler’s  method,  10  steps, 
h = 0.1. 

18.  Do  Prob.  17  with  h = 0.01,  10  steps.  Compute  the  errors. 
Compare  the  error  for  x = 0. 1 with  that  in  Prob.  17. 

19.  Solve  y = 1 + y2,  y(0)  = 0 by  the  improved  Euler 
method,  h = 0.1,  10  steps. 

20.  Solve  y + y = (x  + l)2,  y(0)  = 3 by  the  improved 
Euler  method,  10  steps  with  h = 0.1.  Determine  the 
errors. 

21.  Solve  Prob.  19  by  RK  with  h = 0.1,  5 steps.  Compute 
the  error.  Compare  with  Prob.  19. 

22.  Fair  comparison.  Solve  y = 2x_1Vy  — lnx  + x-1, 
y(l)  = 0 for  1 S x S 1.8  (a)  by  the  Euler  method  with 
h = 0.1,  (b)  by  the  improved  Euler  method  with 
h = 0.2,  and  (c)  by  RK  with  h = 0.4.  Verify  that  the 
exact  solution  is  y = (lnx)2  + lnx.  Compute  and 
compare  the  errors.  Why  is  the  comparison  fair? 

23.  Apply  the  Adams-Moulton  method  to  y = \/ 1 — y2, 
y(0)  = 0,  h = 0.2,  x = 0,  • • • , 1,  starting  with 
0.198668,  0.389416,  0.564637. 

24.  Apply  the  A-M  method  toy'  = (x  + y — 4)2,y(0)  = 4, 
h = 0.2,  x = 0,  ■ ■ • , 1,  starting  with  4.00271,  4.02279, 
4.08413. 

25.  Apply  Euler’s  method  for  systems  to  y"  = x2y, 
y(0)  = l,y'(0)  = 0,  h = 0.1,  5 steps. 

26.  Apply  Euler’s  method  for  systems  to  y[  = y2, 
y2  = —4yi,  yi(0)  = 2,  >>2(0)  = 0,  h = 0.2,  10  steps. 
Sketch  the  solution. 

27.  Apply  Runge-Kutta  for  systems  to  y"  + y = 2ex, 
y(0 ) = 0,  ;/(0)  = 1,  h = 0.2,  5 steps.  Determine  the 
errors. 

28.  Apply  Runge-Kutta  for  systems  to  = 6yi  + 9v2, 
>’2  = yi  + 6y2,  yi(0)  = -3,  y2(0)  = -3,  h = 0.05, 
3 steps. 
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29.  Find  rough  approximate  values  of  the  electrostatic 
potential  at  P14,  P12,  Pi 3 in  Fig.  47 1 that  lie  in  a field 
between  conducting  plates  (in  Fig.  471  appearing  as 
sides  of  a rectangle)  kept  at  potentials  0 and  220  V as 
shown.  (Use  the  indicated  grid.) 


30.  A laterally  insulated  homogeneous  bar  with  ends  at 
x = 0 and  x = 1 has  initial  temperature  0.  Its  left  end 
is  kept  at  0,  whereas  the  temperature  at  the  right  end 
varies  sinusoidally  according  to 

u(t,  1)  = g(t)  = sin  ^ 7Tt. 

Find  the  temperature  u(x,  t ) in  the  bar  [solution  of  (1) 
in  Sec.  21.6]  by  the  explicit  method  with  h = 0.2  and 
r = 0.5  (one  period,  that  is,  0 S t £ 0.24). 

31.  Find  the  solution  of  the  vibrating  string  problem 

utt  ~ uxx>  k(jc,  0)  = jc(1  — x),  ut  = 0,  m(0,  t ) = 


u{  1,  t)  = 0 by  the  method  in  Sec.  21.7  with  h = 0.1 
and  k = 0.1  for  t = 0.3. 


32-34 


POTENTIAL 


Find  the  potential  in  Fig.  472,  using  the  given  grid  and  the 
boundary  values: 


32.  w(P0i)  = w(P03)  = w(P41)  = w(P43)  = 200, 
u(P 10)  = wlP'io)  = —400,  u(P2o)  = 1600, 
u(P<>2)  = WCP42)  = «(fi4)  = m(P24)  = m(P34)  = 0 


33.  m(Pio)  = u(P3o)  = 960,  «(P2o)  = — 480,  u = 0 
elsewhere  on  the  boundary 

34.  u = 70  on  the  upper  and  left  sides,  u = 0 on  the  lower 
and  right  sides 


Fig.  472.  Problems  32-34 

35.  Solve  ut  — uxx  (0  S x = 1,  t = 0), 

u(x,  0)  = x2(l  — x),  u{ 0,  t)  = u(  \ , t)  = 0 by  Crank- 
Nicolson  with  h = 0.2,  k — 0.04,  5 time  steps. 


Numerics  for  ODEs  and  PDEs 


In  this  chapter  we  discussed  numerics  for  ODEs  (Secs.  21.1-21.3)  and  PDEs  (Secs. 
21.4-21.7).  Methods  for  initial  value  problems 

(1)  y'=f(x,y),  y(xo)  = Jo 

involving  a first-order  ODE  are  obtained  by  truncating  the  Taylor  series 

h2 

y{x  + h)  = y(x)  + hy'(x ) + — y"(x)  + • ■ ■ 


Summary  of  Chapter  21 
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where,  by  (1),  y =fy"  = f'  = df/dx  + (df/dy)y',  etc.  Truncating  after  the  term 
hy  , we  get  the  Euler  method,  in  which  we  compute  step  by  step 

(2)  yn+ 1 = yn  + hf(xn,  yn)  (n  = 0,  1,  • ■ ■ ). 

Taking  one  more  term  into  account,  we  obtain  the  improved  Euler  method.  Both 
methods  show  the  basic  idea  but  are  too  inaccurate  in  most  cases. 

Truncating  after  the  term  in  /z4,  we  get  the  important  classical  Runge-Kutta 
(RK)  method  of  fourth  order.  The  crucial  idea  in  this  method  is  the  replacement 
of  the  cumbersome  evaluation  of  derivatives  by  the  evaluation  of  f(x,  y)  at 
suitable  points  (x,  y);  thus  in  each  step  we  first  compute  four  auxiliary  quantities 
(Sec.  21.1) 

k i = hf(xn,  yn) 

k2  = hf(xn  + \h,  yn  + \ki) 

(3a)  1 j 

k 3 = hf(xn  + 2«,  yn  + 2k2) 
k4  = hf(xn  + h,  yn  + k3) 

and  then  the  new  value 

(3b)  yn+i  = yn  + b(^'i  + 2 A: 2 + 2k3  + k4). 

Error  and  step  size  control  are  possible  by  step  halving  or  by  RKF 
(Runge-Kutta-Fehlberg). 

The  methods  in  Sec.  21.1  are  one-step  methods  since  they  get  yn+i  from  the 
result  yn  of  a single  step.  A multistep  method  (Sec.  21.2)  uses  the  values  of 
yn,yn- 1,  • • • of  several  steps  for  computing  yn+\.  Integrating  cubic  interpolation 
polynomials  gives  the  Adams-Bashforth  predictor  (Sec.  21.2) 

(4a)  y*  + 1 = yn  + 24 h(55fn  - 59/n_i  + 37/TO_2  - 9 fn-3) 

where  fj  = f(xj,  w),  and  an  Adams-Moulton  corrector  (the  actual  new  value) 

(4b)  yn+ 1 = yn  + Mh(9fn+i  + 19 fn  - 5/w_i  +/n_2), 

where  fn+i  = f{xn+ i,  >’n+i).  Here,  to  get  started,  yq,  y2,  y3  must  be  computed  by 
the  Runge-Kutta  method  or  by  some  other  accurate  method. 

Section  19.3  concerned  the  extension  of  Euler  and  RK  methods  to  systems 

y ' = f (x,  y),  thus  yj  = f)(x,  yx,  ■ ■ ■ , ym),  j = 1,  • • ■ , m. 

This  includes  single  /nth-order  ODEs,  which  are  reduced  to  systems.  Second-order 
equations  can  also  be  solved  by  RKN  (Runge-Kutta-Nystrom)  methods.  These  are 
particularly  advantageous  for  y = f(x,  y)  with  / not  containing  y . 
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Numeric  methods  for  PDEs  are  obtained  by  replacing  partial  derivatives  by 
difference  quotients.  This  leads  to  approximating  difference  equations,  for  the 

Laplace  equation  to 

(5)  Ui+  ij  + Uij+i  + m- ij  + Uij-i  ~ 4uy  = 0 (Sec.  21.4) 

for  the  heat  equation  to 

(6)  1 Mjj)  2 Ujj  T Ui — i,j)  (Sec.  21.6) 

K /f 

and  for  the  wave  equation  to 


(7) 


o 1 2n j «■  T Ui  j—i) 

k2 


2 ^Mi+l,j 

h 


2,Ujj  "f  Ui—\ j) 


(Sec.  21.7); 


here  h and  k are  the  mesh  sizes  of  a grid  in  the  x-  and  y-directions,  respectively, 
where  in  (6)  and  (7)  the  variable  y is  time  t. 

These  PDEs  are  elliptic,  parabolic,  and  hyperbolic,  respectively.  Corresponding 
numeric  methods  differ,  for  the  following  reason.  For  elliptic  PDEs  we  have 
boundary  value  problems,  and  we  discussed  for  them  the  Gauss-Seidel  method 
(also  known  as  Liebmann’s  method)  and  the  ADI  method  (Secs.  21.4,  21.5).  For 
parabolic  PDEs  we  are  given  one  initial  condition  and  boundary  conditions,  and 
we  discussed  an  explicit  method  and  the  Crank-Nicolson  method  (Sec.  21.6).  For 
hyperbolic  PDEs,  the  problems  are  similar  but  we  are  given  a second  initial 
condition  (Sec.  21.7). 


PART 


Optimization, 

Graphs 


CHAPTER  22  Unconstrained  Optimization.  Linear  Programming 

CHAPTER  23  Graphs.  Combinatorial  Optimization 

The  material  of  Part  F is  particularly  useful  in  modeling  large-scale  real-world  problems. 
Just  as  it  is  in  numerics  in  Part  E,  where  the  greater  availability  of  quality  software  and 
computing  power  is  a deciding  factor  in  the  continued  growth  of  the  field,  so  it  is  also  in 
the  fields  of  optimization  and  combinatorial  optimization.  Problems,  such  as  optimizing 
production  plans  for  different  industries  (microchips,  pharmaceuticals,  cars,  aluminum, 
steel,  chemicals),  optimizing  usage  of  transportation  systems  (usage  of  runways  in  airports, 
tracks  of  subways),  efficiency  in  running  of  power  plants,  optimal  shipping  (delivery 
services,  shipping  of  containers,  shipping  goods  from  factories  to  warehouses  and  from 
warehouses  to  stores),  designing  optimal  financial  portfolios,  and  others  are  all  examples 
where  the  size  of  the  problem  usually  requires  the  use  of  optimization  software.  More 
recently,  environmental  concerns  have  put  new  aspects  into  the  picture,  where  an  important 
concern,  added  to  these  problems,  is  the  minimization  of  environmental  impact.  The  main 
task  becomes  to  model  these  problems  correctly.  The  purpose  of  Part  F is  to  introduce 
the  main  ideas  and  methods  of  unconstrained  and  constrained  optimization  (Chap.  22), 
and  graphs  and  combinatorial  optimization  (Chap.  23). 

Chapter  22  introduces  unconstrained  optimization  by  the  method  of  steepest  descent  and 
constrained  optimization  by  the  versatile  simplex  method.  The  simplex  method  (Secs. 
22.3,  22.4)  is  very  useful  for  solving  many  linear  optimization  problems  (also  called  linear 
programming  problems). 

Graphs  let  us  model  problems  in  transportation  logistics,  efficient  use  of  communication 
networks,  best  assignment  of  workers  to  jobs,  and  others.  We  consider  shortest  path  problems 
(Secs.  22.2,  22.3),  shortest  spanning  trees  (Secs.  23.4,  23.5),  flow  problems  in  networks  (Secs. 
23.6,  23.7),  and  assignment  problems  (Sec.  23.8).  We  discuss  algorithms  of  Moore,  Dijkstra 
(both  for  shortest  path),  Kruskal,  Prim  (shortest  spanning  trees),  and  Ford-Fulkerson  (for  flow). 
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CHAPTER  2 2 


Unconstrained  Optimization. 
Linear  Programming 


Optimization  is  a general  term  used  to  describe  types  of  problems  and  solution  techniques 
that  are  concerned  with  the  best  (“optimal”)  allocation  of  limited  resources  in  projects.  The 
problems  are  called  optimization  problems  and  the  methods  optimization  methods.  Typical 
problems  are  concerned  with  planning  and  making  decisions,  such  as  selecting  an  optimal 
production  plan.  A company  has  to  decide  how  many  units  of  each  product  from  a choice 
of  (distinct)  products  it  should  make.  The  objective  of  the  company  may  be  to  maximize 
overall  profit  when  the  different  products  have  different  individual  profits.  In  addition,  the 
company  faces  certain  limitations  (constraints).  It  may  have  a certain  number  of  machines, 
it  takes  a certain  amount  of  time  and  usage  of  these  machines  to  make  a product,  it  requires 
a certain  number  of  workers  to  handle  the  machines,  and  other  possible  criteria.  To  solve 
such  a problem,  you  assign  the  first  variable  to  number  of  units  to  be  produced  of  the  first 
product,  the  second  variable  to  the  second  product,  up  to  the  number  of  different  (distinct) 
products  the  company  makes.  When  you  multiply  these,  for  example,  by  the  price,  you 
obtain  a linear  function  called  the  objective  function.  You  also  express  the  constraints  in 
terms  of  these  variables,  thereby  obtaining  several  inequalities,  called  the  constraints. 
Because  the  variables  in  the  objective  function  also  occur  in  the  constraints,  the  objective 
function  and  the  constraints  are  tied  mathematically  to  each  other  and  you  have  set  up  a 
linear  optimization  problem,  also  called  a linear  programming  problem. 

The  main  focus  of  this  chapter  is  to  set  up  (Sec.  22.2)  and  solve  (Secs.  22.3,  22.4)  such 
linear  programming  problems.  A famous  and  versatile  method  for  doing  so  is  the  simplex 
method.  In  the  simplex  method , the  objective  function  and  the  constraints  are  set  up  in 
the  form  of  an  augmented  matrix  as  in  Sec.  7.3,  however,  the  method  of  solving  such 
linear  constrained  optimization  problems  is  a new  approach. 

The  beauty  of  the  simplex  method  is  that  it  allows  us  to  scale  problems  up  to  thousands 
or  more  constraints,  thereby  modeling  real-world  situations.  We  can  start  with  a small 
model  and  gradually  add  more  and  more  constraints.  The  most  difficult  part  is  modeling 
the  problem  correctly.  The  actual  task  of  solving  large  optimization  problems  is  done  by 
software  implementations  for  the  simplex  method  or  perhaps  by  other  optimization  methods. 

Besides  optimal  production  plans,  problems  in  optimal  shipping,  optimal  location  of 
warehouses  and  stores,  easing  traffic  congestion,  efficiency  in  running  power  plants  are 
all  examples  of  applications  of  optimization.  More  recent  applications  are  in  minimizing 
environmental  damages  due  to  pollutants,  carbon  dioxide  emissions,  and  other  factors. 
Indeed,  new  fields  of  green  logistics  and  green  manufacturing  are  evolving  and  naturally 
make  use  of  optimization  methods. 

Prerequisite:  a modest  working  knowledge  of  linear  systems  of  equations. 

References  and  Answers  to  Problems:  App.  1 Part  F,  App.  2. 


SEC.  22.1  Basic  Concepts.  Unconstrained  Optimization 


951 


22.1  Basic  Concepts. 

Unconstrained  Optimization: 

Method  of  Steepest  Descent 

In  an  optimization  problem  the  objective  is  to  optimize  ( maximize  or  minimize ) some 
function  /.  This  function /is  called  the  objective  function.  It  is  the  focal  point  or  goal  of 
our  optimization  problem. 

For  example,  an  objective  function/to  be  maximized  may  be  the  revenue  in  a production 
of  TV  sets,  the  rate  of  return  of  a financial  portfolio,  the  yield  per  minute  in  a chemical 
process,  the  mileage  per  gallon  of  a certain  type  of  car,  the  hourly  number  of  customers 
served  in  a bank,  the  hardness  of  steel,  or  the  tensile  strength  of  a rope. 

Similarly,  we  may  want  to  minimize  f if  / is  the  cost  per  unit  of  producing  certain 
cameras,  the  operating  cost  of  some  power  plant,  the  daily  loss  of  heat  in  a heating  system, 
CO2  emissions  from  a fleet  of  trucks  for  freight  transport,  the  idling  time  of  some  lathe, 
or  the  time  needed  to  produce  a fender. 

In  most  optimization  problems  the  objective  function  / depends  on  several  variables 


Xi,  , xn. 

These  are  called  control  variables  because  we  can  “control”  them,  that  is,  choose  their  values. 

For  example,  the  yield  of  a chemical  process  may  depend  on  pressure  x4  and  temperature 
X2-  The  efficiency  of  a certain  air-conditioning  system  may  depend  on  temperature  x±,  air 
pressure  X2.  moisture  content  *3,  cross-sectional  area  of  outlet  x4,  and  so  on. 

Optimization  theory  develops  methods  for  optimal  choices  of  x±,  ■ ■ ■ , xn,  which  maximize 
(or  minimize)  the  objective  function/,  that  is,  methods  for  finding  optimal  values  of  X\,  • • ■ , xn. 

In  many  problems  the  choice  of  values  of  xi,  • • • , xn  is  not  entirely  free  but  is  subject 
to  some  constraints,  that  is,  additional  restrictions  arising  from  the  nature  of  the  problem 
and  the  variables. 

For  example,  if  X\  is  production  cost,  then  X\  = 0,  and  there  are  many  other  variables 
(time,  weight,  distance  traveled  by  a salesman,  etc.)  that  can  take  nonnegative  values  only. 
Constraints  can  also  have  the  form  of  equations  (instead  of  inequalities). 

We  first  consider  unconstrained  optimization  in  the  case  of  a function /(x4,  • ■ ■ , xn). 
We  also  write  x = (xi,  • • • , xn)  and/(x),  for  convenience. 

By  definition, /has  a minimum  at  a point  x = Xq  in  a region  R (where /is  defined)  if 

/(X)  = /'(Xq) 

for  all  x in  R.  Similarly,  / has  a maximum  at  Xq  in  R if 

/(x)  g/(X 0) 

for  all  x in  R.  Minima  and  maxima  together  are  called  extrema. 

Furthermore,  / is  said  to  have  a local  minimum  at  Xq  if 

/(x)  a/(X 0) 

for  all  x in  a neighborhood  of  Xo,  say,  for  all  x satisfying 

|x  - Xol  = [(X!  -X1f  + ---+  (xn  - Xn)2]1/2  < r, 
where  Xq  = (X\,  • • • , Xn)  and  r > 0 is  sufficiently  small. 
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Similarly, /has  a local  maximum  at  Xo  if /(x)  Si  /(Xo)  for  all  x satisfying  |x  — Xj  < r. 
If /is  differentiable  and  has  an  extremum  at  a point  Xo  in  the  interior  of  a region  R 
(that  is,  not  on  the  boundary),  then  the  partial  derivatives  df/dx\,  ■ ■ • , <{f/dxn  must  be  zero 
at  Xo-  These  are  the  components  of  a vector  that  is  called  the  gradient  of  / and  denoted 
by  grad /or  V/.  (For  n = 3 this  agrees  with  Sec.  9.7.)  Thus 

(1)  V/(X o)  = 0. 


A point  Xo  at  which  (1)  holds  is  called  a stationary  point  off. 

Condition  (1)  is  necessary  for  an  extremum  of /at  Xo  in  the  interior  of  R , but  is  not 
sufficient.  Indeed,  if  n = 1,  then  for  y = fix),  condition  (1)  is  y = /(X q)  = 0;  and,  for 
instance,  y = x satisfies  y = 3x  = 0 at  x = X0  = 0 where  / has  no  extremum  but  a 
point  of  inflection.  Similarly,  for/(x)  = X1X2  we  have  V/(  0)  = 0,  and /does  not  have  an 
extremum  but  has  a saddle  point  at  0.  Hence,  after  solving  (1),  one  must  still  find  out 
whether  one  has  obtained  an  extremum.  In  the  case  n = 1 the  conditions  y'(X o)  = 0, 
y\x o)  0 guarantee  a local  minimum  at  Xo  and  the  conditions  y (Xo)  — 0,  y (Xo)  0 a 
local  maximum,  as  is  known  from  calculus.  For  n > 1 there  exist  similar  criteria.  However, 
in  practice,  even  solving  (1)  will  often  be  difficult.  For  this  reason,  one  generally  prefers 
solution  by  iteration,  that  is,  by  a search  process  that  starts  at  some  point  and  moves 
stepwise  to  points  at  which  / is  smaller  (if  a minimum  of  / is  wanted)  or  larger  (in  the 
case  of  a maximum). 

The  method  of  steepest  descent  or  gradient  method  is  of  this  type.  We  present  it  here 
in  its  standard  form.  (For  refinements  see  Ref.  [E25]  listed  in  App.  1.) 

The  idea  of  this  method  is  to  find  a minimum  of  /(x)  by  repeatedly  computing  minima 
of  a function  g(t)  of  a single  variable  t,  as  follows.  Suppose  that /has  a minimum  at  Xo 
and  we  start  at  a point  x.  Then  we  look  for  a minimum  of/ closest  to  x along  the  straight 
line  in  the  direction  of  — V/(x),  which  is  the  direction  of  steepest  descent  (=  direction 
of  maximum  decrease)  of  / at  x.  That  is,  we  determine  the  value  of  t and  the  correspond- 
ing point 

(2)  z (t)  = x - rV/(x) 

at  which  the  function 

o)  g(t)  = mt)) 

has  a minimum.  We  take  this  z (t)  as  our  next  approximation  to  Xo- 

Method  of  Steepest  Descent 

Determine  a minimum  of 

(4)  /(x)  = x\  + 3xl 

starting  from  Xq  = (6,  3)  = 6i  + 3j  and  applying  the  method  of  steepest  descent. 

Solution.  Clearly,  inspection  shows  that  /(x)  has  a minimum  at  0.  Knowing  the  solution  gives  us  a better 
feel  of  how  the  method  works.  We  obtain  V/(x)  = 2x^\  + 6a-2  j and  from  this 


z (0  = x - fV/(x)  = (1  - 2r)*ii  + (1  - 60*2 j 
g(t)  =/(z(0)  = (1  - Itfxl  + 3(1  - 6tfxl. 
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We  now  calculate  the  derivative 


g\t)  = 2(1  - 2t)x\(-2)  + 6(1  - 6f)*i(-6), 
set  g (t)  = 0,  and  solve  for  t,  finding 

x\  + 9xl 

t = . 

2xf  + 54.vl 


Starting  from  jtq  — 6i  + 3j,  we  compute  the  values  in  Table  22.1,  which  are  shown  in  Fig.  473. 

Figure  473  suggests  that  in  the  case  of  slimmer  ellipses  (“a  long  narrow  valley”),  convergence  would  be 
poor.  You  may  confirm  this  by  replacing  the  coefficient  3 in  (4)  with  a large  coefficient.  For  more  sophisticated 
descent  and  other  methods,  some  of  them  also  applicable  to  vector  functions  of  vector  variables,  we  refer  to  the 
references  listed  in  Part  F of  App.  1;  see  also  [E25]. 


Table  22.1  Method  of  Steepest  Descent,  Computations  in  Example  1 


n 

X 

t 

1 - 2 1 

1 - 6 1 

0 

6.000 

3.000 

0.210 

0.581 

-0.258 

1 

3.484 

-0.774 

0.310 

0.381 

-0.857 

2 

1.327 

0.664 

0.210 

0.581 

-0.258 

3 

0.771 

-0.171 

0.310 

0.381 

-0.857 

4 

0.294 

0.147 

0.210 

0.581 

-0.258 

5 

0.170 

-0.038 

0.310 

0.381 

-0.857 

6 

0.065 

0.032 

P R O B LEM  S E 


1.  Orthogonality.  Show  that  in  Example  1,  successive 
gradients  are  orthogonal  (perpendicular).  Why? 

2.  What  happens  if  you  apply  the  method  of  steepest 
descent  to/(x)  = xf  + xf?  First  guess,  then  calculate. 


STEEPEST  DESCENT 

Do  steepest  descent  steps  when: 

3.  fix)  = 2xf  + xf  — 4xi  + 4^2,  xo  = 0,  3 steps 

4.  f(x)  = xf  + 0.5xi  ~ 5.0xi  - 3.0x2  + 24.95, 
x0  = (3,  4),  5 steps 


5.  f(x)  = ax i + bx2,  a # 0,  b 0.  First  guess,  then 
compute. 

6.  fix)  = xf  — xf,  x0  = (1,  2),  5 steps.  First  guess, 

then  compute.  Sketch  the  path.  What  if  x0  = (2,  1)? 

7.  fix)  = xf  + cxf,  *0  = (c>  1)-  Show  that  2 steps  give 
(c,  1)  times  a factor,  — 4c2/(c2  — l)2.  What  can  you 
conclude  from  this  about  the  speed  of  convergence? 

8.  fix)  = xf  — x2,  x0  = (1,  1);  3 steps.  Sketch  your  path. 
Predict  the  outcome  of  further  steps. 

9.  fix)  = O.lxf  + xf  — 0.02x!,  x0  = (3,  3),  5 steps 
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10.  CAS  EXPERIMENT.  Steepest  Descent,  (a)  Write  a 
program  for  the  method. 

(b)  Apply  your  program  to  /(x)  = x\  + 4x|,  exper- 
imenting with  respect  to  speed  of  convergence  depending 
on  the  choice  of  x0. 


(c)  Apply  your  program  to  /(x)  = x\  + x\  and  to 
/(x)  = x\  + x%  x0  = (2,  1).  Graph  level  curves  and 
your  path  of  descent.  (Try  to  include  graphing  directly 
in  your  program.) 


Linear  Programming 

Linear  programming  or  linear  optimization  consists  of  methods  for  solving  optimization 
problems  with  constraints , that  is,  methods  for  finding  a maximum  (or  a minimum) 

x = (*i,  • • • , xn)  of  a linear  objective  function 

z = /(x)  = a\X\  + a2x2  + • • • + anxn 

satisfying  the  constraints.  The  latter  are  linear  inequalities,  such  as  3xi  + 4^2  = 36,  or 

xi  0,  etc.  (examples  below).  Problems  of  this  kind  arise  frequently,  almost  daily,  for 
instance,  in  production,  inventory  management,  bond  trading,  operation  of  power  plants, 
routing  delivery  vehicles,  airplane  scheduling,  and  so  on.  Progress  in  computer  technology 
has  made  it  possible  to  solve  programming  problems  involving  hundreds  or  thousands  or 
more  variables.  Let  us  explain  the  setting  of  a linear  programming  problem  and  the  idea 
of  a “geometric”  solution,  so  that  we  shall  see  what  is  going  on. 

EXAMPLE  1 Production  Plan 

Energy  Savers,  Inc.,  produces  heaters  of  types  S and  L.  The  wholesale  price  is  $40  per  heater  for  S and  $88  for 
L.  Two  time  constraints  result  from  the  use  of  two  machines  M\  and  M2.  On  Mi  one  needs  2 min  for  an  S heater 
and  8 min  for  an  L heater.  On  M2  one  needs  5 min  for  an  S heater  and  2 min  for  an  L heater.  Determine  production 
figures  x\  and  *2  f°r  S and  L,  respectively  (number  of  heaters  produced  per  hour),  so  that  the  hourly  revenue 

z = /(x)  = 40*!  + 88x2 


is  maximum. 

Solution.  Production  figures  jq  and  X2  must  be  nonnegative.  Hence  the  objective  function  (to  be  maximized) 
and  the  four  constraints  are 


(0) 

z = 40xi  + 88*2 

(1) 

2xi  + 8x2  = 

60  min  time  on  machine  Mi 

(2) 

5*!  + 2X2  = 

60  min  time  on  machine  M2 

(3) 

*1  = 

0 

(4) 

0. 

Figure  474  shows  (0)-(4)  as  follows.  Constancy  lines 

z — const 

are  marked  (0).  These  are  lines  of  constant  revenue.  Their  slope  is  —40/88  = —5/11.  To  increase  z we  must 
move  the  line  upward  (parallel  to  itself),  as  the  arrow  shows.  Equation  (1)  with  the  equality  sign  is  marked  (1). 
It  intersects  the  coordinate  axes  at  xi  = 60/2  = 30  (set  X2  — 0)  and  X2  — 60/8  = 7.5  (set  x±  = 0).  The  arrow 
marks  the  side  on  which  the  points  (jti,  X2)  lie  that  satisfy  the  inequality  in  (1).  Similarly  for  Eqs.  (2)-(4).  The 
blue  quadrangle  thus  obtained  is  called  the  feasibility  region.  It  is  the  set  of  all  feasible  solutions,  meaning 
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solutions  that  satisfy  all  four  constraints.  The  figure  also  lists  the  revenue  at  O,  A , B,  C.  The  optimal  solution 
is  obtained  by  moving  the  line  of  constant  revenue  up  as  much  as  possible  without  leaving  the  feasibility  region 
completely.  Obviously,  this  optimum  is  reached  when  that  line  passes  through  B,  the  intersection  (10,  5)  of  (1) 
and  (2).  We  see  that  the  optimal  revenue 

Zmax  = 40  • 10  + 88  • 5 = $840 
is  obtained  by  producing  twice  as  many  S heaters  as  L heaters. 


Note  well  that  the  problem  in  Example  1 or  similar  optimization  problems  cannot  be 
solved  by  setting  certain  partial  derivatives  equal  to  zero,  because  crucial  to  such  problems 
is  the  region  in  which  the  control  variables  are  allowed  to  vary. 

Furthermore,  our  “geometric”  or  graphic  method  illustrated  in  Example  1 is  confined 
to  two  variables  jq,  x2.  However,  most  practical  problems  involve  much  more  than  two 
variables,  so  that  we  need  other  methods  of  solution. 

Normal  Form  of  a Linear  Programming  Problem 

To  prepare  for  general  solution  methods,  we  show  that  constraints  can  be  written  more 
uniformly.  Let  us  explain  the  idea  in  terms  of  (1), 


2xi  + 8x2  = 60. 


This  inequality  implies  60  — 2xi  — 8x2  = 0 (and  conversely),  that  is,  the  quantity 


X3  = 60  — 2xi  — 8x2 

is  nonnegative.  Hence,  our  original  inequality  can  now  be  written  as  an  equation 


2xi  + 8x2  + X3  = 60, 


where 


x3  g 0. 
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X3  is  a nonnegative  auxiliary  variable  introduced  for  converting  inequalities  to  equations. 
Such  a variable  is  called  a slack  variable,  because  it  “takes  up  the  slack”  or  difference 
between  the  two  sides  of  the  inequality. 


Conversion  of  Inequalities  by  the  Use  of  Slack  Variables 

With  the  help  of  two  slack  variables  *3,  *4  we  can  write  the  linear  programming  problem  in  Example  1 in  the 
following  form.  Maximize 


f = 40*1  + 88*2 


subject  to  the  constraints 


2*i  + 8*2  + *3  = 60 

5*i  + 2*2  + *4  — 60 

*i^  0 (i  = 1,  • • • , 4). 

We  now  have  n = 4 variables  and  m = 2 (linearly  independent)  equations,  so  that  two  of  the  four  variables, 
for  example,  *1,  *2,  determine  the  others.  Also  note  that  each  of  the  four  sides  of  the  quadrangle  in  Fig.  474 
now  has  an  equation  of  the  form  x\  = 0: 


OA:  *2  — 0, 

AB:  *4  = 0, 

BC:  *3  = 0, 

CO:  *i  = 0, 

A vertex  of  the  quadrangle  is  the  intersection  of  two  sides.  Hence  at  a vertex,  n — m = 4 — 2 = 2 of  the 
variables  are  zero  and  the  others  are  nonnegative.  Thus  at  A we  have  *2  — 0,  *4  = 0,  and  so  on. 

Our  example  suggests  that  a general  linear  optimization  problem  can  be  brought  to  the 
following  normal  form.  Maximize 

(5)  / = C±X±  + C2X2  + ■ ■ • + cnxn 

subject  to  the  constraints 


(6) 


aii^i  + ■ ■ • + alnxn  = hi 
+ • • • + a2nxn  = b2 


^mi^i  "h  ■ ■ * 4"  amnxn  bm 
Xi  = 0 (i  = 1,  • • • , n) 


with  all  bj  nonnegative.  (If  a bj  < multiply  the  equation  by  — 1.)  Here  x\,  ■ ■ ■ ,xn  include 
the  slack  variables  (for  which  the  cjs  in  /are  zero).  We  assume  that  the  equations  in  (6) 
are  linearly  independent.  Then,  if  we  choose  values  for  n — m of  the  variables,  the  system 
uniquely  determines  the  others.  Of  course,  since  we  must  have 

X\  0,  • • • , xn  0, 


this  choice  is  not  entirely  free. 
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Our  problem  also  includes  the  minimization  of  an  objective  function  / since  this 
corresponds  to  maximizing  — / and  thus  needs  no  separate  consideration. 

An  n-tuple  (xj,  • ■ ■ , xn ) that  satisfies  all  the  constraints  in  (6)  is  called  a.  feasible  point 
or  feasible  solution.  A feasible  solution  is  called  an  optimal  solution  if,  for  it,  the  objective 
function /becomes  maximum,  compared  with  the  values  of/ at  all  feasible  solutions. 

Finally,  by  a basic  feasible  solution  we  mean  a feasible  solution  for  which  at  least 
n — m of  the  variables  Xi,  ■ • ■ , xn  are  zero.  For  instance,  in  Example  2 we  have  n = 4, 
m = 2,  and  the  basic  feasible  solutions  are  the  four  vertices  O,  A,  B,  C in  Fig.  474.  Here 
B is  an  optimal  solution  (the  only  one  in  this  example). 

The  following  theorem  is  fundamental. 


THEOREM  1 


Optimal  Solution 

Some  optimal  solution  of  a linear  programming  problem  (5),  (6)  is  also  a basic 
feasible  solution  of  { 5),  (6). 


For  a proof,  see  Ref.  [F5],  Chap.  3 (listed  in  App.  1).  A problem  can  have  many  optimal 
solutions  and  not  all  of  them  may  be  basic  feasible  solutions;  but  the  theorem  guarantees 
that  we  can  find  an  optimal  solution  by  searching  through  the  basic  feasible  solutions 

/ n \ f n\ 

only.  This  is  a great  simplification;  but  since  there  are  I I = I different  ways 

\n  — m J \m  ) 

of  equating  n — m of  the  n variables  to  zero,  considering  all  these  possibilities,  dropping 
those  which  are  not  feasible  and  then  searching  through  the  rest  would  still  involve  very 
much  work,  even  when  n and  m are  relatively  small.  Hence  a systematic  search  is  needed. 
We  shall  explain  an  important  method  of  this  type  in  the  next  section. 


FR-QB^E^^SFT— 


REGIONS,  CONSTRAINTS 

Describe  and  graph  the  regions  in  the  first  quadrant  of 
the  xiX2-plane  determined  by  the  given  inequalities. 

1.  Xi  — 3x2  £ —6 
Xi  + x2  S 6 

2.  2*!  — x2  £ 6 

8xi  + 10x2  S 80 

X\  — 2 x2  £ —3 

3.  — 0.5xi  + x2  a 2 

x\  + x2  £ 2 
— X\  + 5x2  £ 5 

4.  — X\  + x2  a 5 
2xi  + x2  £ 10 

X2  a 4 
10xi  + 15x2  a 150 


5-  — xi  + x2  £ 0 

xi  + x2  a 5 
—2x1  + x2  a 16 

6.  x1  + x2  £ 2 

3xi  + 5x2  £ 15 
2xi  — x2  £ — 2 

— X\  + 2x2  a 10 

7.  Location  of  maximum.  Could  we  find  a profit 
fix i,  x2)  = apt ! + a2x2  whose  maximum  is  at  an 
interior  point  of  the  quadrangle  in  Fig.  474?  Give 
reason  for  your  answer. 

8.  Slack  variables.  Why  are  slack  variables  always 
nonnegative?  How  many  of  them  do  we  need? 

9.  What  is  the  meaning  of  the  slack  variables  X3,  X4  in 
Example  2 in  terms  of  the  problem  in  Example  1? 

10.  Uniqueness.  Can  we  always  expect  a unique  solution 
(as  in  Example  1)? 
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MAXIMIZATION,  MINIMIZATION 


Maximize  or  minimize  the  given  objective  function/ 
subject  to  the  given  constraints. 

11.  Maximize/ = 30xj  + 10x2  in  the  region  in  Prob.  5. 

12.  Minimize/  = 45.0xi  + 22.5x2  in  the  region  in  Prob.  4. 

13.  Maximize /=  5xi  + 25x2  in  the  region  in  Prob.  5. 

14.  Minimize/ = 5xj  + 25x2  in  the  region  in  Prob.  3. 

15.  Maximize /=  20x!  + 30x2  subject  to  4.t!  + 3x2  £ 
12,  xi  — x2  S —3,  x2  S 6,  2xi  — 3x2  S 0. 

16.  Maximize  /=  — 10xi  + 2x2  subject  to  X\  £ 0, 

x2  £ 0,  — Xi  + x2  £ — 1,  Xi  + x2  S 6,  x2  S 5. 


17.  Maximum  profit.  United  Metal,  Inc.,  produces  alloys 
B\  (special  brass)  and  B2  (yellow  tombac).  Bi  contains 
50%  copper  and  50%  zinc.  (Ordinary  brass  contains 
about  65%  copper  and  35%  zinc.)  B2  contains  75% 
copper  and  25%  zinc.  Net  profits  are  $120  per  ton  of 
Bi  and  $100  per  ton  of  B2.  The  daily  copper  supply  is 
45  tons.  The  daily  zinc  supply  is  30  tons.  Maximize 
the  net  profit  of  the  daily  production. 

18.  Maximum  profit.  The  DC  Drug  Company  produces 
two  types  of  liquid  pain  killer,  N (normal)  and  5 
(Super).  Each  bottle  of  N requires  2 units  of  drug  A,  1 
unit  of  drug  B,  and  1 unit  of  drug  C.  Each  bottle  of  S 
requires  1 unit  of  A,  1 unit  of  B,  and  3 units  of  C.  The 
company  is  able  to  produce,  each  week,  only  1400  units 
of  A,  800  units  of  B,  and  1800  units  of  C.  The  profit 
per  bottle  of  N and  5 is  $11  and  $15,  respectively. 
Maximize  the  total  profit. 


19.  Maximum  output.  Giant  Ladders,  Inc.,  wants  to 
maximize  its  daily  total  output  of  large  step  ladders  by 
producing  X\  of  them  by  a process  P\  and  x2  by  a 
process  P2,  where  7)  requires  2 hours  of  labor  and 
4 machine  hours  per  ladder,  and  P2  requires  3 hours  of 
labor  and  2 machine  hours.  For  this  kind  of  work,  1200 
hours  of  labor  and  1600  hours  on  the  machines  are,  at 
most,  available  per  day.  Find  the  optimal  X\  and  x2. 

20.  Minimum  cost.  Hardbrick,  Inc.,  has  two  kilns.  Kiln 
I can  produce  3000  gray  bricks,  2000  red  bricks,  and 
300  glazed  bricks  daily.  For  Kiln  II  the  corresponding 
figures  are  2000,  5000,  and  1500.  Daily  operating  costs 
of  Kilns  I and  II  are  $400  and  $600,  respectively.  Find 
the  number  of  days  of  operation  of  each  kiln  so  that 
the  operation  cost  in  filling  an  order  of  18,000  gray, 
34,000  red,  and  9000  glazed  bricks  is  minimized. 

21.  Maximum  profit.  Universal  Electric,  Inc.,  manufactures 
and  sells  two  models  of  lamps,  L i and  L2,  the  profit  being 
$150  and  $100,  respectively.  The  process  involves  two 
workers  Wj  and  W2  who  are  available  for  this  kind 
of  work  100  and  80  hours  per  month,  respectively. 
Wj  assembles  Li  in  20  min  and  L2  in  30  min.  W2  paints 
L1  in  20  min  and  L2  in  10  min.  Assuming  that  all  lamps 
made  can  be  sold  without  difficulty,  determine  production 
figures  that  maximize  the  profit. 

22.  Nutrition.  Foods  A and  B have  600  and  500  calories, 
contain  15  gand  30  g of  protein,  and  cost  $1.80  and  $2.10 
per  unit,  respectively.  Find  the  minimum  cost  diet  of  at 
least  3900  calories  containing  at  least  150  g of  protein. 


22.3  Simplex  Method 

From  the  last  section  we  recall  the  following.  A linear  optimization  problem  (linear 
programming  problem)  can  be  written  in  normal  form;  that  is: 


Maximize 

(1)  z = fix)  = cy.x  r + • ■ ■ + cnxn 
subject  to  the  constraints 

+ • • • + a\nxn  = b\ 

@21^1  T • ■ ■ + a2nXn  b2 

(2)  

om\X\  "t-  * * * T amrlxn  bm 

= 0 


(i  = h). 
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For  finding  an  optimal  solution  of  this  problem,  we  need  to  consider  only  the  basic  feasible 
solutions  (defined  in  Sec.  22.2),  but  there  are  still  so  many  that  we  have  to  follow  a 
systematic  search  procedure.  In  1948  G.  B.  Dantzig1  published  an  iterative  method,  called 
the  simplex  method,  for  that  purpose.  In  this  method,  one  proceeds  stepwise  from  one 
basic  feasible  solution  to  another  in  such  a way  that  the  objective  function  / always 
increases  its  value.  Let  us  explain  this  method  in  terms  of  the  example  in  the  last  section. 

In  its  original  form  the  problem  concerned  the  maximization  of  the  objective  function 

z = 4(ki  + 88x2 

2xi  + 8x2  = 60 


subject  to 


5jci  + 2x2  = 60 
Xi  SO 

x2  = 0- 


Converting  the  first  two  inequalities  to  equations  by  introducing  two  slack  variables  X3,  X4, 
we  obtained  the  normal  form  of  the  problem  in  Example  2.  Together  with  the  objective 
function  (written  as  an  equation  z ~ 40xi  — 88x2  = 0)  this  normal  form  is 

Z — 40xi  — 88x2  = 0 

(3)  2xi  + 8x2+X3  =60 

5xi  + 2x2  + x4  = 60 


where  xi  S 0,  ■ • ■ , x4  S 0.  This  is  a linear  system  of  equations.  To  find  an  optimal  solution 
of  it,  we  may  consider  its  augmented  matrix  (see  Sec.  7.3) 


(4) 


z 

X\ 

x2 

*3 

x4 

b 

r 1 1 

-40 

-88 

! 0 

0 

0 

T0  = 

0 ! 

2 

8 

1 

0 

60 

. 0 ! 

5 

2 

! 0 

1 

60 

1GEORGE  BERNARD  DANTZIG  (1914-2005),  American  mathematician,  who  is  one  of  the  pioneers  of 
linear  programming  and  inventor  of  the  simplex  method.  According  to  Dantzig  himself  (see  G.  B.  Dantzig, 
Linear  programming:  The  story  of  how  it  began,  in  J.  K.  Lenestra  et  al..  History  of  Mathematical  Programming: 
A Collection  of  Personal  Reminiscences.  Amsterdam:  Elsevier,  1991,  pp.  19-31),  he  was  particularly  fascinated 
by  Wassilly  Leontief’s  input-output  model  (Sec.  8.2)  and  invented  his  famous  method  to  solve  large-scale 
planning  (logistics)  problems.  Besides  Leontief,  Dantzig  credits  others  for  their  pioneering  work  in  linear 
programming,  that  is,  JOHN  VON  NEUMANN  (1903—1957),  Hungarian  American  mathematician.  Institute  for 
Advanced  Studies,  Princeton  University,  who  made  major  contributions  to  game  theory,  computer  science, 
functional  analysis,  set  theory,  quantum  mechanics,  ergodic  theory,  and  other  areas,  the  Nobel  laureates  LEONID 
VIT  ALIYEVICH  KANTOROVICH  (1912-1986),  Russian  economist,  and  TJALLING  CHARLES 
KOOPMANS  (1910-1985),  Dutch-American  economist,  who  shared  the  1975  Nobel  Prize  in  Economics  for 
their  contributions  to  the  theory  of  optimal  allocation  of  resources.  Dantzig  was  a driving  force  in  establishing 
the  field  of  linear  programming  and  became  professor  of  transportation  sciences,  operations  research,  and 
computer  science  at  Stanford  University.  For  his  work  see  R.  W.  Cottle  (ed.).  The  Basic  George  B.  Dantzig. 
Palo  Alto,  CA:  Stanford  University  Press,  2003. 
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This  matrix  is  called  a simplex  tableau  or  simplex  table  (the  initial  simplex  table).  These 
are  standard  names.  The  dashed  lines  and  the  letters 

Z , b 


are  for  ease  in  further  manipulation. 

Every  simplex  table  contains  two  kinds  of  variables  Xj.  By  basic  variables  we  mean 
those  whose  columns  have  only  one  nonzero  entry.  Thus  x3,  x4  in  (4)  are  basic  variables 
and  Xi,  x2  are  nonbasic  variables. 

Every  simplex  table  gives  a basic  feasible  solution.  It  is  obtained  by  setting  the  nonbasic 
variables  to  zero.  Thus  (4)  gives  the  basic  feasible  solution 

xi  = 0,  x2  = 0,  x3  = 60/1  = 60,  x 4 = 60/1  = 60,  z = 0 

with  x3  obtained  from  the  second  row  and  x4  from  the  third. 

The  optimal  solution  (its  location  and  value)  is  now  obtained  stepwise  by  pivoting, 
designed  to  take  us  to  basic  feasible  solutions  with  higher  and  higher  values  of  z until  the 
maximum  of  z is  reached.  Here,  the  choice  of  the  pivot  equation  and  pivot  are  quite 
different  from  that  in  the  Gauss  elimination.  The  reason  is  that  x4,  x2,  x3,  x4  are  restricted 
to  nonnegative  values. 

Step  1.  Operation  Ot:  Selection  of  the  Column  of  the  Pivot 

Select  as  the  column  of  the  pivot  the  first  column  with  a negative  entry  in  Row  1 . In  (4) 
this  is  Column  2 (because  of  the  —40). 

Operation  02:  Selection  of  the  Row  of  the  Pivot.  Divide  the  right  sides  [60  and  60  in 
(4)]  by  the  corresponding  entries  of  the  column  just  selected  (60/2  = 30,  60/5  = 12). 
Take  as  the  pivot  equation  the  equation  that  gives  the  smallest  quotient.  Thus  the  pivot 
is  5 because  60/5  is  smallest. 

Operation  03:  Elimination  by  Row  Operations.  This  gives  zeros  above  and  below  the 
pivot  (as  in  Gauss-Jordan,  Sec.  7.8). 

With  the  notation  for  row  operations  as  introduced  in  Sec.  7.3,  the  calculations  in  Step  1 
give  from  the  simplex  table  T0  in  (4)  the  following  simplex  table  (augmented  matrix), 
with  the  blue  letters  referring  to  the  previous  table. 


(5) 


Tr 


z x4  x2  x3  x4  b 

1 ! 0 -72  ! 0 8 ! 480 

0 | 0 7.2  j 1 -0.4  | 36 

I I I 

_ 0 ! 5 2 ! 0 1 ! 60  _ 


Row  1+8  Row  3 
Row  2 — 0.4  Row  3 


We  see  that  basic  variables  are  now  X\,  x3)  and  nonbasic  variables  are  x2,  x4.  Setting  the 
latter  to  zero,  we  obtain  the  basic  feasible  solution  given  by  T1; 

x 4 = 60/5  = 12,  x2  = 0,  x3  = 36/1  = 36,  x4  = 0,  z = 480. 

This  is  A in  Fig.  474  (Sec.  22.2).  We  thus  have  moved  from  O:  (0,  0)  with  z = 0 to 
A:  (12,  0)  with  the  greater  z = 480.  The  reason  for  this  increase  is  our  elimination  of  a 


SEC.  22.3  Simplex  Method 


961 


term  (— 40x  t)  with  a negative  coefficient.  Hence  elimination  is  applied  only  to  negative 
entries  in  Row  1 but  to  no  others.  This  motivates  the  selection  of  the  column  of  the  pivot. 

We  now  motivate  the  selection  of  the  row  of  the  pivot.  Had  we  taken  the  second  row 
of  T0  instead  (thus  2 as  the  pivot),  we  would  have  obtained  z = 1200  (verify!),  but  this 
line  of  constant  revenue  z = 1200  lies  entirely  outside  the  feasibility  region  in  Fig.  474. 
This  motivates  our  cautious  choice  of  the  entry  5 as  our  pivot  because  it  gave  the  smallest 
quotient  (60/5  = 12). 

Step  2.  The  basic  feasible  solution  given  by  (5)  is  not  yet  optimal  because  of  the  negative 
entry  —72  in  Row  1.  Accordingly,  we  perform  the  operations  0\  to  03  again,  choosing  a 
pivot  in  the  column  of  —72. 

Operation  0\.  Select  Column  3 of  T|  in  (5)  as  the  column  of  the  pivot  (because  — 72  < 0). 

Operation  0%.  We  have  36/7.2  = 5 and  60/2  = 30.  Select  7.2  as  the  pivot  (because 
5 < 30). 


Operation  03.  Elimination  by  row  operations  gives 


z 

Xi 

x2 

*3 

X4 

b 

1 

! 0 

o ! 

10 

4 

840  " 

Row  1 + 

10  Row  2 

(6) 

t2  = 

0 

0 

7.2  j 

1 

-0.4 

36 

1 

1 

1 

2 

0 

5 

0 

50 

Row  3 — 

Row  2 

1 

3.6 

0.9 

7.2 

We  see  that  now  x±,  x2  are  basic  and  x:j,  x4  nonbasic.  Setting  the  latter  to  zero,  we  obtain 
from  T2  the  basic  feasible  solution 

x4  = 50/5  = 10,  x2  = 36/7.2  = 5,  x3  = 0,  x4  = 0,  z = 840. 

This  is  B in  Fig.  474  (Sec.  22.2).  In  this  step,  z has  increased  from  480  to  840,  due  to  the 
elimination  of  — 72  in  T4.  Since  T2  contains  no  more  negative  entries  in  Row  1,  we 
conclude  that  z =/(  10,  5)  = 40  • 10  + 88  • 5 = 840  is  the  maximum  possible  revenue. 
It  is  obtained  if  we  produce  twice  as  many  S heaters  as  L heaters.  This  is  the  solution  of 
our  problem  by  the  simplex  method  of  linear  programming. 

Minimization.  If  we  want  to  minimize  z = /(x)  (instead  of  maximize),  we  take  as  the 
columns  of  the  pivots  those  whose  entry  in  Row  1 is  positive  (instead  of  negative).  In 
such  a Column  k we  consider  only  positive  entries  tjk  and  take  as  pivot  a tjk  for  which 
bj/tjk  is  smallest  (as  before).  For  examples,  see  the  problem  set. 


P=RQBFEM=SET— 


1.  Verify  the  calculations  in  Example  1 of  the  text. 


2-14 


SIMPLEX  METHOD 


2.  The  problem  in  the  example  in  the  text  with  the 
constraints  interchanged. 

3.  Maximize /=  3xj  + 2x2  subject  to  3xi  + 4x2  £ 60, 
4xi  + 3x2  S 60,  10xi  + 2x2  S 120. 


Write  in  normal  form  and  solve  by  the  simplex  method, 
assuming  all  Xj  to  be  nonnegative. 
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4.  Maximize  the  daily  output  in  producing  X\  chairs  by 
Process  P1  and  x2  chairs  by  Process  P2  subject  to 
3.x  4 + 4x2  fi  550  (machine  hours),  5xx  + 4x2  fi  650 
(labor). 

5.  Minimize  f—  5x4  — 20x2  subject  to  ~2x1  + 10x2 

fi  5,  2x!  + 5x2  £ 10. 

6.  Prob.  19  in  Sec.  22.2. 

7.  Suppose  we  produce  X\  AA  batteries  by  Process 
P1  and  x2  by  Process  P2,  furthermore  X3  A batteries  by 
Process  P3  and  x4  by  Process  P4.  Let  the  profit  for  100 
batteries  be  $10  for  AA  and  $20  for  A.  Maximize  the 
total  profit  subject  to  the  constraints 

12xi  + 8x2  + 6x3  + 4x4  £ 120  (Material) 

3xi  + 6x2  + 12x3  + 24x4  £ 180  (Labor). 

8.  Maximize  the  daily  profit  in  producing  x4  metal  frames 
Fi  (profit  $90  per  frame)  and  x2  frames  F2  (profit  $50 
per  frame)  subject  to  x4  + 3x2  £ 18  (material), 
Xi  + x2  £ 10  (machine  hours),  3xi  + x2  £ 24  (labor). 

9.  Maximize  / = 2x1  + x2  + 3x3  subject  to  4x4  + 3x2  + 

6x3  = 12. 


10.  Minimize  / = 4xi  — 10x2  — 20x3  subject  to  3xy  + 
4x2  + 5x3  £ 60,  2x1  + x2  £ 20,  2x4  + 3x3  £ 30. 

11.  Prob.  22  in  Problem  Set  22.2. 

12.  Maximize  / = 2x4  + 3x2  + X3  subject  to  x4  + x2  + 
x3  £ 4.8,  I0x1  + x3  £ 9.9,  x2  — x3  £ 0.2. 

13.  Maximize/ = 34xi  + 29x2  + 32x3  subject  to  8x1  + 
2x2  + x3  £ 54,  3x4  + 8x2  + 2x3  £ 59,  x4  + x2  + 
5x3  fi  39. 

14.  Maximize/ = 2x1  + 3x2  subject  to  5x4  + 3x2  fi  105, 
3xi  + 6x2  £ 126. 

15.  CAS  PROJECT.  Simple  Method,  (a)  Write  a program 
for  graphing  a region  R in  the  first  quadrant  of  the 
XiX 2-plane  determined  by  linear  constraints. 

(b)  Write  a program  for  maximizing  z = aiX'i  + a2x 2 
in  R. 

(c)  Write  a program  for  maximizing  z = a 1X1  + 
• • • + anxn  subject  to  linear  constraints. 

(d)  Apply  your  programs  to  problems  in  this  problem 
set  and  the  previous  one. 


22.4  Simplex  Method:  Difficulties 

In  solving  a linear  optimization  problem  by  the  simplex  method,  we  proceed  stepwise 
from  one  basic  feasible  solution  to  another.  By  so  doing,  we  increase  the  value  of  the 
objective  function  /.  We  continue  this  stepwise  procedure,  until  we  reach  an  optimal 
solution.  This  was  all  explained  in  Sec.  22.3.  However,  the  method  does  not  always  proceed 
so  smoothly.  Occasionally,  but  rather  infrequently  in  practice,  we  encounter  two  kinds  of 
difficulties.  The  first  one  is  the  degeneracy  and  the  second  one  concerns  difficulties  in 
starting. 

Degeneracy 

A degenerate  feasible  solution  is  a feasible  solution  at  which  more  than  the  usual  number 
n — m of  variables  are  zero.  Here  n is  the  number  of  variables  (slack  and  others)  and  m 
the  number  of  constraints  (not  counting  the  xj  0 conditions).  In  the  last  section,  n = 4 
and  m = 2,  and  the  occurring  basic  feasible  solutions  were  nondegenerate;  n — m = 2 
variables  were  zero  in  each  such  solution. 

In  the  case  of  a degenerate  feasible  solution  we  do  an  extra  elimination  step  in  which 
a basic  variable  that  is  zero  for  that  solution  becomes  nonbasic  (and  a nonbasic  variable 
becomes  basic  instead).  We  explain  this  in  a typical  case.  For  more  complicated  cases 
and  techniques  (rarely  needed  in  practice)  see  Ref.  [F5]  in  App.  1. 

EXAM  Simplex  Method,  Degenerate  Feasible  Solution 


AB  Steel,  Inc.,  produces  two  kinds  of  iron  /]_,  I2  by  using  three  kinds  of  raw  material  R±,  R2,  R3  (scrap  iron  and 
two  kinds  of  ore)  as  shown.  Maximize  the  daily  profit. 
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Raw 

Material 

Raw  Material  Needed 
per  Ton 

Raw  Material  Available 
per  Day  (tons) 

Iron  / 1 

Iron  /2 

Ri 

2 

1 

16 

R2 

1 

1 

8 

r3 

0 

1 

3.5 

Net  profit 
per  ton 

$150 

$300 

Solution.  Let  xi  and  *2  denote  the  amount  (in  tons)  of  iron  7i  and  I2,  respectively,  produced  per  day.  Then 
our  problem  is  as  follows.  Maximize 

(1)  z =f(x)  = 150*!  + 30(k2 

subject  to  the  constraints  *1  0,  *2  = 0 and 

2*i  + *2  = 16  (raw  material  Ri ) 

*1  + *2  = 8 (raw  material  R2 ) 

*2  = 3.5  (raw  material  R 3). 

By  introducing  slack  variables  *3,  *4,  *5  we  obtain  the  normal  form  of  the  constraints 

2*i  + *2  + *3  =16 

(2)  *1  + *2  + *4  =8 

*2  + *5  = 3.5 

= 0 (i  = 1,  • • • , 5). 

As  in  the  last  section  we  obtain  from  (1)  and  (2)  the  initial  simplex  table 


z 

Xi 

*2 

*3 

x4 

*5 

b 

r 1 1 

— 1 

-150 

-300 

1 

O 1 

1 

r 

0 

0 

1 

O 1 
1 

-4 

1 

0 1 

2 

1 

1 

1 1 

0 

0 

16 

0 

1 

1 

0 

1 

0 

8 

1 

L 0 1 

0 

1 

1 

l 0 

0 

1 

1 

1 3.5J 

We  see  that  *1,  *2  are  nonbasic  variables  and  *3,  *4,  *5  are  basic.  With  *1  = *2  — 0 we  have  from  (3)  the  basic 
feasible  solution 

*1  = 0,  *2  = 0,  *3  = 16/1  = 16,  *4  = 8/1  = 8,  *5  — 3.5/1  = 3.5,  z = 0. 

This  is  O:  (0,  0)  in  Fig.  475.  We  have  n = 5 variables  Xj,  m = 3 constraints,  and  n — m = 2 variables  equal  to 
zero  in  our  solution,  which  thus  is  nondegenerate. 


Step  1 of  Pivoting 

Operation  0\i  Column  Selection  of  Pivot.  Column  2 (since  — 150  < 0). 

Operation  Oz*  Row  Selection  of  Pivot.  16/2  = 8,  8/1  =8;  3.5/0  is  not  possible.  Hence  we  could  choose 
Row  2 or  Row  3.  We  choose  Row  2.  The  pivot  is  2. 
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Operation  0^\  Elimination  by  Row  Operations.  This  gives  the  simplex  table 


Z *1  *2  *3  *4  *5  b 


r i 1 

4 

0 

1 1 
1 

N> 
1 C/1 

]__ 

1 

75 

0 

0 

1 

1 

1200  ' 

Row  1 

+ 75  Row  2 

0 

2 

1 

1 

0 

0 

1 

1 

16 

0 

0 

1 | 
2 | 

1 

2 

1 

0 

1 

1 

0 

Row  3 

— \ Row  2 

1 

L o i 

0 

1 

1 I 

0 

0 

1 

1 

1 

3.5  . 

Row  4 

We  see  that  the  basic  variables  are  xi,  x±,  x$  and  the  nonbasic  are  X2,  *3.  Setting  the  nonbasic  variables  to  zero, 
we  obtain  from  Ti  the  basic  feasible  solution 


X!  = 16/2  = 8,  *2  = 0,  *3  = 0,  *4  = 0/1  = 0,  *5  = 3.5/1  = 3.5,  z = 1200. 

This  is  A:  (8,  0)  in  Fig.  475.  This  solution  in  degenerate  because  *4  = 0 (in  addition  to  *2  = 0,  *3  = 0); 
geometrically:  the  straight  line  X4  = 0 also  passes  through  A.  This  requires  the  next  step,  in  which  *4  will 
become  nonbasic. 


Step  2 of  Pivoting 

Operation  0\i  Column  Selection  of  Pivot.  Column  3 (since  —225  < 0). 

Operation  0%:  Row  Selection  of  Pivot.  16/1  = 16,  0/|  =0.  Hence  ^ must  serve  as  the  pivot. 
Operation  O3:  Elimination  by  Row  Operations.  This  gives  the  following  simplex  table. 


z 

*1 

x2 

x3 

x4 

x5 

b 

r i i o 

4 

0 I 
4. 

-150 

450 

0 l 
4 

1200 

Row  1 

+ 450  Row  3 

0 

2 

0 

2 

-2 

o ] 

16 

Row  2 

— 2 Row  3 

0 

0 

k 

1 

2 

1 

o j 

0 

. 0 

0 

o 

l 

-2 

i 

3.5. 

Row  4 

— 2 Row  3 

We  see  that  the  basic  variables  are  *1,  *2,  x5  and  the  nonbasic  are  *3,  X4.  Hence  X4  has  become  nonbasic,  as 
intended.  By  equating  the  nonbasic  variables  to  zero  we  obtain  from  T2  the  basic  feasible  solution 

*1  = 16/2  = 8,  x2  = 0/k  =0,  x3  = 0,  *4  = 0,  x5  = 3.5/1  = 3.5,  z = 1200. 

This  is  still  A:  (8,  0)  in  Fig.  475  and  z has  not  increased.  But  this  opens  the  way  to  the  maximum,  which  we 
reach  in  the  next  step. 
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EXAMPLE  2 


Step  3 of  Pivoting 

Operation  0\i  Column  Selection  of  Pivot.  Column  4 (since  — 150  < 0). 

Operation  O Row  Selection  of  Pivot.  16/2  = 8,  0/(— g)  = 0,  3.5/1  = 3.5.  We  can  take  1 as  the  pivot. 
(With  — \ as  the  pivot  we  would  not  leave  A.  Try  it.) 

Operation  0%:  Elimination  by  Row  Operations.  This  gives  the  simplex  table 


(6) 


z 

*1 

*2 

*3 

x4 

a5 

b 

' 1 

l 0 

_i 

0 

1 

b 

0 

150 

1 

1 C/i 

1 O 

1 

-b 

1725 

0 

2 

0 

l 

1 

0 

2 

-2 

1 

1 

9 

0 

0 

1 

2 

1 

1 

0 

0 

1 

2 

1 

1 

1.75 

. 0 

0 

0 

1 

1 

1 

-2 

1 

1 

1 

3.5 

Row  1 + 150  Row  4 
Row  2 — 2 Row  4 
Row  3+|  Row  4 


We  see  that  basic  variables  are  xi,  X2,  xs  and  nonbasic  x±,  x$.  Equating  the  latter  to  zero  we  obtain  from  T3  the 
basic  feasible  solution 

x ! = 9/2  = 4.5,  *2  = 1.75/1  = 3.5,  x3  = 3.5/1  = 3.5,  x4  = 0,  x5  = 0,  z=  1725. 

This  is  B\  (4.5,  3.5)  in  Fig.  475.  Since  Row  1 of  T3  has  no  negative  entries,  we  have  reached  the  maximum  daily 
profit  zmax  =/(4.5,  3.5)  = 150  • 4.5  + 300  • 3.5  = $1725.  This  is  obtained  by  using  4.5  tons  of  iron  Ii  and 
3.5  tons  of  iron  I2. 

Difficulties  in  Starting 

As  a second  kind  of  difficulty,  it  may  sometimes  be  hard  to  find  a basic  feasible  solution 
to  start  from.  In  such  a case  the  idea  of  an  artificial  variable  (or  several  such  variables) 
is  helpful.  We  explain  this  method  in  terms  of  a typical  example. 

Simplex  Method:  Difficult  Start,  Artificial  Variable 

Maximize 

(7)  2 = /(x)  = 2*1  + x2 

subject  to  the  constraints  x\  ^ 0,  X2  = 0 and  (Fig.  476) 

*1  - 1*2  a 1 

X\  “ x2  & 2 
X!  + X2^  4. 

Solution.  By  means  of  slack  variables  we  achieve  the  normal  form  of  the  constraints 


1 

1 

*2 

- 0 

*1  - 

hx2 

x3  = • 

(8) 

Xi  - 

x2 

+ *4  =2 

XX  + 

x2 

+ X5  = 4 

XiSO  (i=l,  ••■,5). 
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Note  that  the  first  slack  variable  is  negative  (or  zero),  which  makes  x 3 nonnegative  within  the  feasibility  region 
(and  negative  outside).  From  (7)  and  (8)  we  obtain  the  simplex  table 


z 

*1 

*2 

*3 

*4 

*5 

b 

" 1 

l -2 

4- 

-1 

I 0 

_l 

0 

0 1 

1 _ 

0 " 

0 

1 

1 

2 

-1 

0 

0 

1 

0 

1 

-1 

0 

1 

0 

2 

0 

1 

1 

0 

0 

1 

4 

*1,  *2  are  nonbasic,  and  we  would  like  to  take  X3,  X4,  x$  as  basic  variables.  By  our  usual  process  of  equating 
the  nonbasic  variables  to  zero  we  obtain  from  this  table 

*1  = 0,  *2  = 0,  *3  = l/(— 1)  = -1,  *4  = 1=2,  *5  = 1=4,  z = 0. 

X3  < 0 indicates  that  (0,  0)  lies  outside  the  feasibility  region.  Since  *3  < 0,  we  cannot  proceed  immediately. 
Now,  instead  of  searching  for  other  basic  variables,  we  use  the  following  idea.  Solving  the  second  equation  in 
(8)  for  *3,  we  have 


*3  = -1  + *1  - 2*2- 


To  this  we  now  add  a variable  xq  on  the  right, 


(9)  *3  = - 1 + *1  - 2*2  + *6- 

xq  is  called  an  artificial  variable  and  is  subject  to  the  constraint  A'e  = 0. 

We  must  take  care  that  Xq  (which  is  not  part  of  the  given  problem!)  will  disappear  eventually.  We  shall  see 
that  we  can  accomplish  this  by  adding  a term  —Mxq  with  very  large  M to  the  objective  function.  Because  of 
(7)  and  (9)  (solved  for  *5)  this  gives  the  modified  objective  function  for  this  “extended  problem” 

(10)  £ = Z ~ Mxq  = 2xi  + *2  — Mxq  = (2  + M)x  1 + (1  — \M)x 2 — Mx 3 — M. 

We  see  that  the  simplex  table  corresponding  to  (10)  and  (8)  is 


z 

*1 

*2 

*3 

*4 

*5 

x6 

b 

1 

-2  - M 

-1  +1 M ! 

M 

0 

0 

0 

-M 

0 

1 

1 

-1 

0 

0 

0 

1 

2 

0 1 

1 

-1  1 

0 

1 

0 

0 1 

2 

0 

1 

1 

0 

0 

1 

0 

4 

1 

_ 0 1 

1 

-5 

-1 

0 

0 

1 

1 1 

1 
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The  last  row  of  this  table  results  from  (9)  written  as  X\  — \x2  — x3  + x6  = 1.  We  see  that  we  can  now  start, 
taking  x4,  xs,  xe  as  the  basic  variables  and  xi,  x2,  x3  as  the  nonbasic  variables.  Column  2 has  a negative  first 
entry.  We  can  take  the  second  entry  (1  in  Row  2)  as  the  pivot.  This  gives 


z 

Xi 

*2 

x3 

-V4 

*5 

*6 

b 

1 

1 

-4- 

0 

-2 

1 

-1 

-2 

0 

0 

0 

1 

--I-- 

2 

0 

1 

1 

1 

1 

2 

1 

1 

-1 

0 

0 

0 

1 

1 

1 

0 

1 

1 

0 

1 

2 

1 

1 

1 

1 

0 

0 

1 

1 

1 

0 

1 

1 

0 

3 

2 

1 

1 

1 

0 

1 

0 

1 

1 

3 

0 

1 

1 

0 

0 

1 

1 

0 

0 

0 

1 

1 

1 

0 

This  corresponds  to  x1  = 1 , x2  = 0 (point  A in  Fig.  476),  x3  = 0,  x4  = 1 . x5  = 3,  xe  = 0.  We  can  now  drop 
Row  5 and  Column  7.  In  this  way  we  get  rid  of  x6,  as  wanted,  and  obtain 


Z 

*1 

*2 

*3 

x4 

* 5 

b 

r i 

1 

n 

—2 

1 

-2 

0 

0 

1 

7 1 

-4- 

4- 

1 

0 

1 

l 

1 

1 

-1 

0 

0 

1 

1 

t2  = 

1 

2 

1 

1 

0 

1 

1 

0 

1 

2 

1 

1 

1 

1 

0 

1 

1 

1 

. 0 

1 

1 

0 

3 

2 

1 

1 

1 

0 

1 

1 

1 

3 . 

In  Column  3 we  choose  § as  the  next  pivot.  We  obtain 

Z *1  *2  *3 

*4 

x5 

b 

i 

1 0 

0 

2 

3 

0 

4 

3 

6 ' 

t3  = 

0 

1 

0 

2 

3 

0 

1 

3 

2 

0 

0 

0 

4 

3 

l 

3 

2 

. 0 

0 

3 

2 

l 

0 

l 

3 . 

This  corresponds  to  xi  = 2,X2 

= 2 (this  is  B in 

Fig.  476),  x3 

= 0,  X4 

II 

JO 

* 

Ol 

II 

p 

In  Column  4 we  choose  | 

as  the  pivot,  by  the  usual  principle.  This 

gives 

7 

*2 

*3 

*4 

*5  b 

r i 1 

0 

n 

1 0 

1 

| 1 7 1 

4 

4- 

0 

1 

0 

o 

1 

2 

i 1 3 

T4  = 

1 

1 

0 

0 

0 

1 3 

l 

I 1 2 

3 1 ^ 

. o | 

0 

3 

2 

0 

3 

4 

3 ! 3 

4 1 2-1 

This  corresponds  to  xi  = 3,  X2  — 1 (point  C 

in  Fig 

476), 

*3  = i,*4  = o,  X5  = 

0.  This  is  the  maximum 

/max  =/(  3,  1)  = 7. 


We  have  reached  the  end  of  our  discussion  on  linear  programming.  We  have  presented 
the  simplex  method  in  great  detail  as  this  method  has  many  beautiful  applications  and 
works  well  on  most  practical  problems.  Indeed,  problems  of  optimization  appear  in  civil 
engineering,  chemical  engineering,  environmental  engineering,  management  science, 
logistics,  strategic  planning,  operations  management,  industrial  engineering,  finance,  and 
other  areas.  Furthermore,  the  simplex  method  allows  your  problem  to  be  scaled  up  from 
a small  modeling  attempt  to  a larger  modeling  attempt,  by  adding  more  constraints  and 
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variables,  thereby  making  your  model  more  realistic.  The  area  of  optimization  is  an  active 
field  of  development  and  research  and  optimization  methods,  besides  the  simplex  method, 
are  being  explored  and  experimented  with. 


1.  Maximize  z =/i(x)  = 7x1  + 14x2  subject  to  0 S x1 

S 6,  0 S x2  £ 3,  7x\  + 14x2  £ 84. 

2.  Do  Prob.  1 with  the  last  two  constraints  interchanged. 

3.  Maximize  the  daily  output  in  producing  x1  steel  sheets 
by  process  PA  and  x2  steel  sheets  by  process  Pb  subject 
to  the  constraints  of  labor  hours,  machine  hours,  and 
raw  material  supply: 

3xi  + 2x2  £ 180,  4xi  + 6x2  £ 200, 

5xi  + 3x2  £ 160. 

4.  Maximize  z = 300-Xj  + 500x2  subject  to  2xj  + 8x2 

£ 60,  2.x i + x2  £ 30,  4x1  + 4x2  £ 60. 

5.  Do  Prob.  4 with  the  last  two  constraints  interchanged. 
Comment  on  the  resulting  simplification. 


6.  Maximize  the  total  output  /=  Xi  + x2  + X3  (pro- 
duction from  three  distinct  processes)  subject  to  input 
constraints  (limitation  of  time  available  for  production) 

5xi  + 6x2  + 7.X3  £ 12, 

7xi  + 4x2  + x3  £ 12. 

7.  Maximize  / = 5xj  + 8x2  + 4x3  subject  to  xj  S 0 
(j  — 1,  ■ ■ ■ , 5)  and  xi  + x3  + X5  = 1,  x2  + x3 

+ X4  = 1. 

8.  Using  an  artificial  variable,  minimize/  = 4xj  — x2  subject 
to  xi  + x2  S 2,  — 2x4  + 3x2  £ 1,  5xx  + 4x2  £ 50. 

9.  Maximize  / = 2xi  + 3x2  + 2x3,  x1  a 0,  x2  a 0, 

X3  S 0,  x1  + 2x2  — 4x3  S 2,  Xi  + 2x2  + 2x3  £ 5. 


S T I O N S AND  PROBLEMS 


1.  What  is  unconstrained  optimization?  Constraint  optimiza- 
tion? To  which  one  do  methods  of  calculus  apply? 

2.  State  the  idea  and  the  formulas  of  the  method  of  steepest 
descent. 

3.  Write  down  an  algorithm  for  the  method  of  steepest  descent. 

4.  Design  a “method  of  steepest  ascent”  for  determining 
maxima. 


5.  What  is  the  method  of  steepest  descent  for  a function 
of  a single  variable? 

6.  What  is  the  basic  idea  of  linear  programming? 

7.  What  is  an  objective  function?  A feasible  solution? 

8.  What  are  slack  variables?  Why  did  we  introduce  them? 

9.  What  happens  in  Example  1 of  Sec.  22.1  if  you  replace 

/(x)  = xf  + 3x2  with  /(x)  = X?  + 5x2?  Start  from 
x0  = [6  3]t.  Do  5 steps.  Is  the  convergence  faster  or 

slower? 

10.  Apply  the  method  of  steepest  descent  to/(x)  = 9xf  + 
x!  + 18xi  — 4x2,  5 steps.  Start  from  x0  = [2  4]T. 

11.  In  Prob.  10,  could  you  start  from  [0  0]T  and  do  5 steps? 


12.  Show  that  the  gradients  in  Prob.  1 1 are  orthogonal.  Give 
a reason. 


13-16  Graph  or  sketch  the  region  in  the  first  quadrant 
of  the  x1.x2-plane  determined  by  the  following  inequalities. 


13.  Xi  — 2x2  £ —2 
0.8x!  + x2  fi  6 


14.  xi  - 2x2  a -4 
2X!  + x2  £ 12 

x1  + x2  £ 8 

15.  Xi  + x2  S 5 

x2  S 3 
— Xi  + x2  £ 2 

16.  Xi  + x2  £ 2 

2xi  — 3x2  a —12 

x'i  S 15 


17-20  Maximize  or  minimize  as  indicated. 

17.  Maximize  / = 10xi  + 20x2  subject  to  X\  £ 5,  Xi  + 
x2  S 6,  x2  S 4. 

18.  Maximize  f = X\  + x2  subject  to  xi  + 2x2  S 10, 
2x2  + x 2 S 10,  x2  S 4. 

19.  Minimize  /=  2xi  — 10x2  subject  to  Xi  — x2  S 4, 

2xi  + x2  S 14,  xi  + x2  S 9,  — xi  + 3x2  S 15. 


20.  A factory  produces  two  kinds  of  gaskets,  Gj,  G2,  with 
net  profit  of  $60  and  $30,  respectively.  Maximize  the 
total  daily  profit  subject  to  the  constraints  (xj  = number 
of  gaskets  Gj  produced  per  day): 

40x!  + 40x2  S 1800  (Machine  hours), 

200-t!  + 20x2  S 6300  (Labor). 


Summary  of  Chapter  22 
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SUMMARY  OF  CHAPTER  22 

Unconstrained  Optimization.  Linear  Programming 


In  optimization  problems  we  maximize  or  minimize  an  objective  function  z = /T x) 
depending  on  control  variables  X\,  ■ ■ ■ , xm  whose  domain  is  either  unrestricted 
(“ unconstrained  optimization Sec.  22.1)  or  restricted  by  constraints  in  the  form 
of  inequalities  or  equations  or  both  (“ constrained  optimization Sec.  22.2). 

If  the  objective  function  is  linear  and  the  constraints  are  linear  inequalities  in 
jti,  • • ■ , xm.  then  by  introducing  slack  variables  xm+i,  ■ ■ ■ , xn  we  can  write  the 
optimization  problem  in  normal  form  with  the  objective  function  given  by 

(1)  /i  = cq*!  + • ■ ■ + cnxn 

(where  cm+\  = ■ ■ ■ = cn  = 0)  and  the  constraints  given  by 

a\\x  i + a12x2  + ■ ■ • + alnxn  = b± 


(2)  

a /n  1^  ] "h  am2X2  “h  * "h  ayyinXfi  £?m 
Xi  S 0,  ■ • • , xn  §S  0. 

In  this  case  we  can  then  apply  the  widely  used  simplex  method  (Sec.  22.3),  a 
systematic  stepwise  search  through  a very  much  reduced  subset  of  all  feasible 
solutions.  Section  22.4  shows  how  to  overcome  difficulties  with  this  method. 


CHAPTER  2 3 


Graphs. 

Combinatorial  Optimization 


Many  problems  in  electrical  engineering,  civil  engineering,  operations  research,  industrial 
engineering,  management,  logistics,  marketing,  and  economics  can  be  modeled  by  graphs 
and  directed  graphs,  called  digraphs.  This  is  not  surprising  as  they  allow  us  to  model 
networks,  such  as  roads  and  cables,  where  the  nodes  may  be  cities  or  computers.  The 
task  then  is  to  find  the  shortest  path  through  the  network  or  the  best  way  to  connect 
computers.  Indeed,  many  researchers  who  made  contributions  to  combinatorial 
optimization  and  graphs,  and  whose  names  lend  themselves  to  fundamental  algorithms 
in  this  chapter,  such  as  Fulkerson,  Kruskal,  Moore,  and  Prim,  all  worked  at  Bell 
Laboratories  in  New  Jersey,  the  major  R&D  facilities  of  the  huge  telephone  and 
telecommunication  company  AT&T.  As  such,  they  were  interested  in  methods  of 
optimally  building  computer  networks  and  telephone  networks.  The  field  has  progressed 
into  looking  for  more  and  more  efficient  algorithms  for  very  large  problems. 

Combinatorial  optimization  deals  with  optimization  problems  that  are  of  a pronounced 
discrete  or  combinatorial  nature.  Often  the  problems  are  very  large  and  so  a direct  search 
may  not  be  possible.  Just  like  in  linear  programming  (Chap.  22),  the  computer  is  an 
indispensible  tool  and  makes  solving  large-scale  modeling  problems  possible.  Because 
the  area  has  a distinct  flavor,  different  from  ODEs,  linear  algebra,  and  other  areas,  we 
start  with  the  basics  and  gradually  introduce  algorithms  for  shortest  path  problems  (Secs. 
22.2,  22.3),  shortest  spanning  trees  (Secs.  23.4,  23.5),  flow  problems  in  networks  (Secs. 
23.6,  23.7),  and  assignment  problems  (Sec.  23.8). 

Prerequisite:  none. 

References  and  Answers  to  Problems:  App.  1 Part  F,  App.  2. 


23.  Graphs  and  Digraphs 

Roughly,  a graph  consists  of  points,  called  vertices , and  lines  connecting  them,  called 
edges.  For  example,  these  may  be  four  cities  and  five  highways  connecting  them,  as  in 
Fig.  477.  Or  the  points  may  represent  some  people,  and  we  connect  by  an  edge  those  who 
do  business  with  each  other.  Or  the  vertices  may  represent  computers  in  a network  and 
the  edge  connections  between  them.  Let  us  now  give  a formal  definition. 
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Fig.  477.  Graph  consisting  of 
4 vertices  and  5 edges 


Fig.  478.  Isolated  vertex,  loop,  double 
edge.  (Excluded  by  definition.) 


DEFINITION 


Graph 

A graph  G consists  of  two  finite  sets  (sets  having  finitely  many  elements),  a set  V 
of  points,  called  vertices,  and  a set  E of  connecting  lines,  called  edges,  such  that 
each  edge  connects  two  vertices,  called  the  endpoints  of  the  edge.  We  write 

G = (V,  E). 

Excluded  are  isolated  vertices  (vertices  that  are  not  endpoints  of  any  edge),  loops 
(edges  whose  endpoints  coincide),  and  multiple  edges  (edges  that  have  both 
endpoints  in  common).  See  Fig.  478. 


CAUTION!  Our  three  exclusions  are  practical  and  widely  accepted,  but  not  uniformly. 
For  instance,  some  authors  permit  multiple  edges  and  call  graphs  without  them  simple 
graphs. 

We  denote  vertices  by  letters,  u,  v,  • ■ • or  i>i,  u2, ' ■ ■ or  simply  by  numbers  1,  2,  • • • (as 
in  Fig.  477).  We  denote  edges  by  e\,  e2, ' ' ' or  by  their  two  endpoints;  for  instance, 
ci  = (1,4),  e2  = (1,2)  in  Fig.  477. 

An  edge  ( Vi , Vj)  is  called  incident  with  the  vertex  vt  (and  conversely);  similarly,  (vt,  vf) 
is  incident  with  Vj.  The  number  of  edges  incident  with  a vertex  v is  called  the  degree  of  v. 
Two  vertices  are  called  adjacent  in  G if  they  are  connected  by  an  edge  in  G (that  is,  if  they 
are  the  two  endpoints  of  some  edge  in  G). 

We  meet  graphs  in  different  fields  under  different  names:  as  “networks”  in  electrical 
engineering,  “structures”  in  civil  engineering,  “molecular  structures”  in  chemistry, 
“organizational  structures”  in  economics,  “sociograms,”  “road  maps,”  “telecommunication 
networks,”  and  so  on. 


Digraphs  (Directed  Graphs) 

Nets  of  one-way  streets,  pipeline  networks,  sequences  of  jobs  in  construction  work,  flows 
of  computation  in  a computer,  producer-consumer  relations,  and  many  other  applications 
suggest  the  idea  of  a “digraph”  (=  directed  graph),  in  which  each  edge  has  a direction 
(indicated  by  an  arrow,  as  in  Fig.  479). 
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DEFI 


ITION 


Digraph  (Directed  Graph) 

A digraph  G = ( V , E ) is  a graph  in  which  each  edge  e = ( i,j ) has  a direction  from 
its  “ initial  point ” i to  its  “ terminal  point”  j. 


Two  edges  connecting  the  same  two  points  i,  j are  now  permitted,  provided  they  have 
opposite  directions,  that  is,  they  are  (i,j)  and  (j,  i).  Example.  (1,  4)  and  (4,  1)  in  Fig.  479. 

A subgraph  or  subdigraph  of  a given  graph  or  digraph  G = ( V , E),  respectively,  is  a 
graph  or  digraph  obtained  by  deleting  some  of  the  edges  and  vertices  of  G,  retaining  the 
other  edges  of  G (together  with  their  pairs  of  endpoints).  For  instance,  e±,  e%  (together 
with  the  vertices  1,  2,  4)  form  a subgraph  in  Fig.  477,  and  e:j,  e4,  e5  (together  with  the 
vertices  1,  3,  4)  form  a subdigraph  in  Fig.  479. 


Computer  Representation  of  Graphs  and  Digraphs 

Drawings  of  graphs  are  useful  to  people  in  explaining  or  illustrating  specific  situations. 
Here  one  should  be  aware  that  a graph  may  be  sketched  in  various  ways;  see  Fig.  480. 
For  handling  graphs  and  digraphs  in  computers,  one  uses  matrices  or  lists  as  appropriate 
data  structures,  as  follows. 


(a) 


Fig.  480.  Different  sketches  of  the  same  graph 


Adjacency  Matrix  of  a Graph  G:  Matrix  A = [a^]  with  entries 

{1  if  G has  an  edge  (i,  j), 

0 else. 

Thus  Ojj  = 1 if  and  only  if  two  vertices  i and  j are  adjacent  in  G.  Here,  by  definition,  no 
vertex  is  considered  to  be  adjacent  to  itself;  thus,  an  = 0.  A is  symmetric,  al3  = aTl.  (Why?) 

The  adjacency  matrix  of  a graph  is  generally  much  smaller  than  the  so-called  incidence 
matrix  (see  Prob.  18)  and  is  preferred  over  the  latter  if  one  decides  to  store  a graph  in  a 
computer  in  matrix  form. 
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EXAMPLE  1 


EXAMPLE  2 


EXAMPLE  3 


Adjacency  Matrix  of  a Graph 


Adjacency  Matrix  of  a Digraph  G:  Matrix  A = [a-i-j]  with  entries 


1 if  G has  a directed  edge  ( i , j), 
0 else. 


This  matrix  A need  not  be  symmetric.  (Why?) 


Lists.  The  vertex  incidence  list  of  a graph  shows,  for  each  vertex,  the  incident  edges. 
The  edge  incidence  list  shows  for  each  edge  its  two  endpoints.  Similarly  for  a digraph; 
in  the  vertex  list,  outgoing  edges  then  get  a minus  sign,  and  in  the  edge  list  we  now  have 
ordered  pairs  of  vertices. 


Vertex  Incidence  List  and  Edge  Incidence  List  of  a Graph 

This  graph  is  the  same  as  in  Example  1 , except  for  notation. 


Vertex 

Incident  Edges 

Edge 

Endpoints 

V\ 

Cl.  <?5 

ei 

Vl,  v2 

v2 

Ci,  e2,  C3 

e2 

v2 , ^3 

v3 

e2,  c4 

e3 

v2,  v4 

u4 

C3,  c4,  e5 

e4 

v3 , v4 

e5 

Vl,  v4 
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Sparse  graphs  are  graphs  with  few  edges  (far  fewer  than  the  maximum  possible  number 
n{n  — l)/2,  where  n is  the  number  of  vertices).  For  these  graphs,  matrices  are  not  efficient. 
Lists  then  have  the  advantage  of  requiring  much  less  storage  and  being  easier  to  handle; 
they  can  be  ordered,  sorted,  or  manipulated  in  various  other  ways  directly  within  the 
computer.  For  instance,  in  tracing  a “walk”  (a  connected  sequence  of  edges  with  pairwise 
common  endpoints),  one  can  easily  go  back  and  forth  between  the  two  lists  just  discussed, 
instead  of  scanning  a large  column  of  a matrix  for  a single  1 . 

Computer  science  has  developed  more  refined  lists,  which,  in  addition  to  the  actual 
content,  contain  “pointers”  indicating  the  preceding  item  or  the  next  item  to  be  scanned 
or  both  items  (in  the  case  of  a “walk”:  the  preceding  edge  or  the  subsequent  one).  For 
details,  see  Refs.  [E16]  and  [F7J. 

This  section  was  devoted  to  basic  concepts  and  notations  needed  throughout  this  chapter, 
in  which  we  shall  discuss  some  of  the  most  important  classes  of  combinatorial  optimization 
problems.  This  will  at  the  same  time  help  us  to  become  more  and  more  familiar  with 
graphs  and  digraphs. 


P R Q B^  M — S^  T 2 3 1 


1.  Explain  how  the  following  can  be  regarded  as  a graph 
or  a digraph:  a family  tree,  air  connections  between 
given  cities,  trade  relations  between  countries,  a tennis 
tournament,  and  memberships  of  some  persons  in  some 
committees. 

2.  Sketch  the  graph  consisting  of  the  vertices  and  edges 
of  a triangle.  Of  a pentagon.  Of  a tetrahedron. 

3.  How  would  you  represent  a net  of  two-way  and  one- 
way streets  by  a digraph? 

4.  Worker  Wj  can  do  jobs  Jlt  J3,  74,  worker  W 2 job  J3, 
and  worker  W3  jobs  J2,  J3 , 74.  Represent  this  by  a 
graph. 

5.  Find  further  situations  that  can  be  modeled  by  a graph 
or  diagraph. 


ADJACENCY  MATRIX 

6.  Show  that  the  adjacency  matrix  of  a graph  is  symmetric. 

7.  When  will  the  adjacency  matrix  of  a digraph  be 
symmetric? 


14-15 


Sketch  the  graph  for  the  given  adjacency  matrix. 


"0 

1 

0 

f 

"0 

1 

0 

0” 

1 

0 

1 

0 

1 

0 

0 

0 

14. 

0 

1 

0 

0 

15. 

0 

0 

0 

1 

1 

0 

0 

0 

0 

0 

1 

0 

16.  Complete  graph.  Show  that  a graph  G with  n vertices 
can  have  at  most  n(n  — l)/2  edges,  and  G has  exactly 
n(n  — l)/2  edges  if  G is  complete,  that  is,  if  every  pair 
of  vertices  of  G is  joined  by  an  edge.  (Recall  that  loops 
and  multiple  edges  are  excluded.) 
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17.  In  what  case  are  all  the  off-diagonal  entries  of  the 
adjacency  matrix  of  a graph  G equal  to  one? 


19.  Incidence  matrix  B of  a digraph.  The  definition  is 

B = [fojfc],  where 


18.  Incidence  matrix  B of  a graph.  The  definition  is 

B = [fejfc],  where 


1 if  edge  e k leaves  vertex  j, 
1 if  edge  e k enters  vertex  j, 


1 if  vertex  j is  an  endpoint  of  edge  e 


bjk 


0 otherwise. 


0 otherwise. 


Find  the  incidence  matrix  of  the  graph  in  Prob.  8. 


Find  the  incidence  matrix  of  the  digraph  in  Prob.  1 1 . 
20.  Make  the  vertex  incidence  list  of  the  digraph  in  Prob.  1 1. 


23.1  Shortest  Path  Problems.  Complexity 


The  rest  of  this  chapter  is  devoted  to  the  most  important  classes  of  problems  of 
combinatorial  optimization  that  can  be  represented  by  graphs  and  digraphs.  We  selected 
these  problems  because  of  their  importance  in  applications,  and  present  their  solutions 
in  algorithmic  form.  Although  basic  ideas  and  algorithms  will  be  explained  and 
illustrated  by  small  graphs,  you  should  keep  in  mind  that  real-life  problems  may  often 
involve  many  thousands  or  even  millions  of  vertices  and  edges.  Think  of  computer 
networks,  telephone  networks,  electric  power  grids,  worldwide  air  travel,  and  companies 
that  have  offices  and  stores  in  all  larger  cities.  You  can  also  think  of  other  ideas  for 
networks  related  to  the  Internet,  such  as  electronic  commerce  (networks  of  buyers  and 
sellers  of  goods  over  the  Internet)  and  social  networks  and  related  websites,  such  as 
Facebook.  Hence  reliable  and  efficient  systematic  methods  are  an  absolute  necessity — 
solutions  by  trial  and  error  would  no  longer  work,  even  if  “nearly  optimal”  solutions 
were  acceptable. 

We  begin  with  shortest  path  problems,  as  they  arise,  for  instance,  in  designing  shortest 
(or  least  expensive,  or  fastest)  routes  for  a traveling  salesman,  for  a cargo  ship,  etc.  Let 
us  first  explain  what  we  mean  by  a path. 

In  a graph  G = ( V , E)  we  can  walk  from  a vertex  v\  along  some  edges  to  some  other 
vertex  Vp.  Here  we  can 

(A)  make  no  restrictions,  or 

(B)  require  that  each  edge  of  G be  traversed  at  most  once,  or 

(C)  require  that  each  vertex  be  visited  at  most  once. 

In  case  (A)  we  call  this  a walk.  Thus  a walk  from  iq  to  Vp-  is  of  the  form 
(1)  (i?i,  v2),  (i>2,  U3),  • • ' , (Ufe- 1,  Ufc), 

where  some  of  these  edges  or  vertices  may  be  the  same.  In  case  (B),  where  each  edge 
may  occur  at  most  once,  we  call  the  walk  a trail.  Finally,  in  case  (C),  where  each  vertex 
may  occur  at  most  once  (and  thus  each  edge  automatically  occurs  at  most  once),  we  call 
the  trail  a path. 

We  admit  that  a walk,  trail,  or  path  may  end  at  the  vertex  it  started  from,  in  which  case 
we  call  it  closed;  then  Vp-  = t>i  in  (1). 
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A closed  path  is  called  a cycle.  A cycle  has  at  least  three  edges  (because  we  do  not 
have  double  edges;  see  Sec.  23.1).  Figure  481  illustrates  all  these  concepts. 

® © 

© © © 

Fig.  481.  Walk,  trail,  path,  cycle 

1 — 2 — 3 — 2 is  a walk  (not  a trail). 

4 — 1 — 2 — 3 — 4 — 5 is  a trail  (not  a path). 

1 — 2 — 3 — 4 — 5 is  a path  (not  a cycle). 

1 — 2 — 3 — 4 — 1 is  a cycle. 

Shortest  Path 

To  define  the  concept  of  a shortest  path,  we  assume  that  G = ( V , E)  is  a weighted  graph, 
that  is,  each  edge  ©,  Vj)  in  G has  a given  weight  or  length  /«,-  > 0.  Then  a shortest  path 
v i — > Up-  (with  fixed  Vp  and  Vp)  is  a path  (1)  such  that  the  sum  of  the  lengths  of  its  edges 

^12  + ^23  + ^34  + ' ' ■ + Ik- l,k 

(/ 12  = length  of  (v\,  v2),  etc.)  is  minimum  (as  small  as  possible  among  all  paths  from 
v i to  Vp).  Similarly,  a longest  path  Vp  — > Up-  is  one  for  which  that  sum  is  maximum. 

Shortest  (and  longest)  path  problems  are  among  the  most  important  optimization  problems. 
Here,  “length”  l.l7  (often  also  called  “cost”  or  “weight”)  can  be  an  actual  length  measured 
in  miles  or  travel  time  or  fuel  expenses,  but  it  may  also  be  something  entirely  different. 

For  instance,  the  traveling  salesman  problem  requires  the  determination  of  a shortest 
Hamiltonian1  cycle  in  a graph,  that  is,  a cycle  that  contains  all  the  vertices  of  the  graph. 

In  more  detail,  the  traveling  salesman  problem  in  its  most  basic  and  intuitive  form  can 
be  stated  as  follows.  You  have  a salesman  who  has  to  drive  by  car  to  his  customers.  He 
has  to  drive  to  n cities.  He  can  start  at  any  city  and  after  completion  of  the  trip  he  has  to 
return  to  that  city.  Furthermore,  he  can  only  visit  each  city  once.  All  the  cities  are  linked  by 
roads  to  each  other,  so  any  city  can  be  visited  from  any  other  city  directly,  that  is,  if  he 
wants  to  go  from  one  city  to  another  city,  there  is  only  one  direct  road  connecting  those  two 
cities.  He  has  to  find  the  optimal  route,  that  is,  the  route  with  the  shortest  total  mileage  for 
the  overall  trip.  This  is  a classic  problem  in  combinatorial  optimization  and  comes  up  in 
many  different  versions  and  applications.  The  maximum  number  of  possible  paths  to  be 
examined  in  the  process  of  selecting  the  optimal  path  for  n cities  is  (n  — l)!/2,  because, 
after  you  pick  the  first  city,  you  have  n — 1 choices  for  the  second  city,  n — 2 choices  for 
the  third  city,  etc.  You  get  a total  of  (n  — 1)!  (see  Sec.  24.4).  However,  since  the  mileage 
does  not  depend  on  the  direction  of  the  tour  (e.g.,  for  n = 4 (four  cities  1,  2,  3,  4),  the  tour 
1-2— 3—4-1  has  the  same  mileage  as  1-4— 3-2-1,  etc.,  so  that  we  counted  all  the  tours  twice!), 
the  final  answer  is  (n  — l)!/2.  Even  for  a small  number  of  cities,  say  n = 15,  the  maximum 
number  of  possible  paths  is  very  large.  Use  your  calculator  or  CAS  to  see  for  yourself!  This 
means  that  this  is  a very  difficult  problem  for  larger  n and  typical  of  problems  in 
combinatorial  optimization,  in  that  you  want  a discrete  solution  but  where  it  might  become 
nearly  impossible  to  explicitly  search  through  all  the  possibilities  and  therefore  some 
heuristics  (rules  of  thumbs,  shortcuts)  might  be  used,  and  a less  than  optimal  answer  suffices. 


1WILLIAM  ROWAN  HAMILTON  (1805-1865),  Irish  mathematician,  known  for  his  work  in  dynamics. 
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A variation  of  the  traveling  salesman  problem  is  the  following.  By  choosing  the  “most 
profitable”  route  V\  — > Vp,  a salesman  may  want  to  maximize  X/y,  where  Zy  is  his  expected 
commission  minus  his  travel  expenses  for  going  from  town  i to  town  j. 

In  an  investment  problem,  i may  be  the  day  an  investment  is  made,  j the  day  it  matures, 
and  /y  the  resulting  profit,  and  one  gets  a graph  by  considering  the  various  possibilities 
of  investing  and  reinvesting  over  a given  period  of  time. 


Shortest  Path  If  All  Edges  Have  Length  / = 1 

Obviously,  if  all  edges  have  length  /,  then  a shortest  path  Vy^Vp  is  one  that  has  the 
smallest  number  of  edges  among  all  paths  — > Up  in  a given  graph  G.  For  this  problem 

we  discuss  a BFS  algorithm.  BFS  stands  for  Breadth  First  Search.  This  means  that  in 
each  step  the  algorithm  visits  all  neighboring  (all  adjacent)  vertices  of  a vertex  reached, 
as  opposed  to  a DFS  algorithm  (Depth  First  Search  algorithm),  which  makes  a long  trail 
(as  in  a maze).  This  widely  used  BFS  algorithm  is  shown  in  Table  23.1. 

We  want  to  find  a shortest  path  in  G from  a vertex  ,v  (start)  to  a vertex  t (terminal).  To 
guarantee  that  there  is  a path  from  s to  t,  we  make  sure  that  G does  not  consist  of  separate 
portions.  Thus  we  assume  that  G is  connected,  that  is,  for  any  two  vertices  v and  w there 
is  a path  in  G.  (Recall  that  a vertex  v is  called  adjacent  to  a vertex  u if  there  is 

an  edge  (u,  v)  in  G.) 


Table  23.1  Moore’s2  BFS  for  Shortest  Path  (All  Lengths  One) 

Proceedings  of  the  International  Symposium  for  Switching  Theory,  Part  II.  pp.  285-292.  Cambridge:  Harvard 
University  Press,  1959. 

ALGORITHM  MOORE  [G  = (V,  E),  s,  t ] 

This  algorithm  determines  a shortest  path  in  a connected  graph  G = (V,  E)  from  a vertex 
s to  a vertex  t. 

INPUT:  Connected  graph  G = (V,  E),  in  which  one  vertex  is  denoted  by  s and 

one  by  t,  and  each  edge  (i,j)  has  length  ll3  = 1.  Initially  all  vertices  are 
unlabeled. 

OUTPUT:  A shortest  path  s — » t in  G = (V,  E) 

1.  Label  s with  0. 

2.  Set  i = 0. 

3.  Find  all  unlabeled  vertices  adjacent  to  a vertex  labeled  i. 

4.  Label  the  vertices  just  found  with  i + 1 . 

5.  If  vertex  t is  labeled,  then  “backtracking”  gives  the  shortest  path 

k (=  label  of  t),k-  1,  k - 2,  ■ ■ ■ , 0 

OUTPUT  k,  k - 1,  k - 2,  • • • , 0.  Stop 
Else  increase  i by  1.  Go  to  Step  3. 

End  MOORE 


2EDWARD  FORREST  MOORE  (1925-2003),  American  mathematician  and  computer  scientist,  who  did 
pioneering  work  in  theoretical  computer  science  (automata  theory,  Turing  machines). 
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EXAMPLE  1 


Application  of  Moore’s  BFS  Algorithm 

Find  a shortest  path  s — > t in  the  graph  G shown  in  Fig.  482. 

Solution.  Figure  482  shows  the  labels.  The  blue  edges  form  a shortest  path  (length  4).  There  is  another 
shortest  path  s —■ > t.  (Can  you  find  it?)  Hence  in  the  program  we  must  introduce  a rule  that  makes  backtracking 
unique  because  otherwise  the  computer  would  not  know  what  to  do  next  if  at  some  step  there  is  a choice  (for 
instance,  in  Fig.  482  when  it  got  back  to  the  vertex  labeled  2).  The  following  rule  seems  to  be  natural. 

Backtracking  rule.  Using  the  numbering  of  the  vertices  from  1 to  n (not  the  labeling!),  at  each  step,  if  a 
vertex  labeled  i is  reached,  take  as  the  next  vertex  that  with  the  smallest  number  (not  label!)  among  all  the 
vertices  labeled  i — 1 . 


2 


Fig.  482.  Example  1,  given  graph  and  result  of  labeling 

Complexity  of  an  Algorithm 

Complexity  of  Moore’s  algorithm.  To  find  the  vertices  to  be  labeled  1,  we  have  to  scan 
all  edges  incident  with  s.  Next,  when  i = 1 , we  have  to  scan  all  edges  incident  with  vertices 
labeled  1,  etc.  Hence  each  edge  is  scanned  twice.  These  are  2m  operations  (m  = number  of 
edges  of  G).  This  is  a function  c(m).  Whether  it  is  2m  or  5m  + 3 or  12 m is  not  so  essential; 
it  is  essential  that  c(m)  is  proportional  to  m (not  m2,  for  example);  it  is  of  the  “order”  m. 
We  write  for  any  function  am  + b simply  O(m),  for  any  function  am 2 + bm  + d simply 

o 

0(m  ),  and  so  on;  here,  O suggests  order.  The  underlying  idea  and  practical  aspect  are 
as  follows. 

In  judging  an  algorithm,  we  are  mostly  interested  in  its  behavior  for  very  large  problems 
(large  m in  the  present  case),  since  these  are  going  to  determine  the  limits  of  the 
applicability  of  the  algorithm.  Thus,  the  essential  item  is  the  fastest  growing  term 
(am  in  am  + bm  + d,  etc.)  since  it  will  overwhelm  the  others  when  m is  large  enough. 
Also,  a constant  factor  in  this  term  is  not  very  essential;  for  instance,  the  difference  between 
two  algorithms  of  orders,  say,  5m 2 and  8 m2  is  generally  not  very  essential  and  can  be 
made  irrelevant  by  a modest  increase  in  the  speed  of  computers.  However,  it  does  make 
a great  practical  difference  whether  an  algorithm  is  of  order  m or  m2  or  of  a still  higher 
power  mp.  And  the  biggest  difference  occurs  between  these  “polynomial  orders”  and 
“exponential  orders,”  such  as  2m. 

For  instance,  on  a computer  that  does  109  operations  per  second,  a problem  of  size 
m = 50  will  take  0.3  sec  with  an  algorithm  that  requires  m5  operations,  but  13  days  with 
an  algorithm  that  requires  2m  operations.  But  this  is  not  our  only  reason  for  regarding 
polynomial  orders  as  good  and  exponential  orders  as  bad.  Another  reason  is  the  gain  in 
using  a faster  computer.  For  example,  let  two  algorithms  be  0(m)  and  0(m2).  Then,  since 
1000  = 31.6  , an  increase  in  speed  by  a factor  1000  has  the  effect  that  per  hour  we  can 
do  problems  1000  and  31.6  times  as  big,  respectively.  But  since  1000  = 29'97,  with  an 
algorithm  that  is  0( 2m),  all  we  gain  is  a relatively  modest  increase  of  10  in  problem  size 
because  29'97  • 2m  = 2m+ 997 
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The  symbol  O is  quite  practical  and  commonly  used  whenever  the  order  of  growth  is 
essential,  but  not  the  specific  form  of  a function.  Thus  if  a function  g(m)  is  of  the  form 

g(m)  = kh(m)  + more  slowly  growing  terms  ( k # 0,  constant), 

we  say  that  g(m)  is  of  the  order  h(m)  and  write 

g(m)  = 0(h(m)). 


For  instance, 

am  + b = 0(m),  am2  + bm  + d = 0(m\  5 • 2m  + 3 m2  = 0( 2m). 

We  want  an  algorithm  si  to  be  “efficient,”  that  is,  “good”  with  respect  to 

(i)  Time  (number  c^(m)  of  computer  operations),  or 

(ii)  Space  (storage  needed  in  the  internal  memory) 

or  both.  Here  ty  suggests  “complexity”  of  si.  Two  popular  choices  for  cy  are 

(Worst  case ) c.,fi(m)  = longest  time  si  takes  for  a problem  of  size  m, 

(Average  case ) cy(m)  = average  time  si  takes  for  a problem  of  size  m. 

In  problems  on  graphs,  the  “size”  will  often  be  m (number  of  edges)  or  n (number  of 
vertices).  For  Moore’s  algorithm,  cy(m)  = 2m  in  both  cases.  Hence  the  complexity  of 
Moore’s  algorithm  is  of  order  O(m). 

For  a “good”  algorithm  si,  we  want  that  c.A(m)  does  not  grow  too  fast.  Accordingly, 
we  call  si  efficient  if  c:A(m ) = 0(mk ) for  some  integer  k =5  0;  that  is,  cy  may  contain 
only  powers  of  m (or  functions  that  grow  even  more  slowly,  such  as  In  m),  but  no 
exponential  functions.  Furthermore,  we  call  si  polynomially  bounded  if  s/i  is  efficient 
when  we  choose  the  “worst  case”  cy(m).  These  conventional  concepts  have  intuitive 
appeal,  as  our  discussion  shows. 

Complexity  should  be  investigated  for  every  algorithm,  so  that  one  can  also  compare 
different  algorithms  for  the  same  task.  This  may  often  exceed  the  level  in  this  chapter; 
accordingly,  we  shall  confine  ourselves  to  a few  occasional  comments  in  this  direction. 


SHORTEST  PATHS,  MOORE’S  BFS 
(All  edges  length  one) 


1-4  Find  a shortest  path  P.  s—>t  and  its  length  by 
Moore’s  algorithm.  Sketch  the  graph  with  the  labels  and 
indicate  P by  heavier  lines  as  in  Fig.  482. 


5.  Moore’s  algorithm.  Show  that  if  vertex  v has  label 
A(u)  = k,  then  there  is  a path  i — » v of  length  k. 

6.  Maximum  length.  What  is  the  maximum  number  of 
edges  that  a shortest  path  between  any  two  vertices  in 
a graph  with  n vertices  can  have?  Give  a reason.  In  a 
complete  graph  with  all  edges  of  length  1? 
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7.  Nonuniqueness.  Find  another  shortest  path  from  ,v  to 
t in  Example  1 of  the  text. 

8.  Moore’s  algorithm.  Call  the  length  of  a shortest  path 
s—*v  the  distance  of  u from  s.  Show  that  if  v has 
distance  /,  it  has  label  \(v)  = l. 

9.  CAS  PROBLEM.  Moore’s  Algorithm.  Write  a 
computer  program  for  the  algorithm  in  Table  23.1.  Test 
the  program  with  the  graph  in  Example  1 . Apply  it  to 
Probs.  1-3  and  to  some  graphs  of  your  own  choice. 


10-12 


HAMILTONIAN  CYCLE 


10.  Find  and  sketch  a Hamiltonian  cycle  in  the  graph  of  a 
dodecahedron,  which  has  12  pentagonal  faces  and  20 
vertices  (Fig.  483).  This  is  a problem  Hamilton  himself 
considered. 


Fig.  483.  Problem  10 


s 


4 


Fig.  484  Problem  13 


14.  Show  that  the  length  of  a shortest  postman  trail  is  the 
same  for  every  starting  vertex. 
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EULER  GRAPHS 


15.  An  Euler  graph  G is  a graph  that  has  a closed  Euler 
trail.  An  Euler  trail  is  a trail  that  contains  every  edge 
of  G exactly  once.  Which  subgraph  with  four  edges  of 
the  graph  in  Example  1,  Sec.  23.1,  is  an  Euler  graph? 


16.  Find  four  different  closed  Euler  trails  in  Fig.  485. 


2 4 


1 3 5 

Fig.  485.  Problem  16 


11.  Find  and  sketch  a Hamiltonian  cycle  in  Prob.  1. 

12.  Does  the  graph  in  Prob.  4 have  a Hamiltonian  cycle? 
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POSTMAN  PROBLEM 


13.  The  postman  problem  is  the  problem  of  finding  a 
closed  walk  W:  s — * s ( s the  post  office)  in  a graph  G 
with  edges  (i,j)  of  length  hj>  0 such  that  every  edge 
of  G is  traversed  at  least  once  and  the  length  of  W is 
minimum.  Find  a solution  for  the  graph  in  Fig.  484  by 
inspection.  (The  problem  is  also  called  the  Chinese 
postman  problem  since  it  was  published  in  the  journal 
Chinese  Mathematics  1 (1962),  273-277 .) 


17.  Is  the  graph  in  Fig.  484  an  Euler  graph.  Give  reason. 


18-20 


ORDER 


18.  Show  that  0(m3)  + 0{m3)  = 0(m3)  and  kO(mp)  = 
0(mp). 


19.  Show  that  Vl  + m2  = O(m),  0.02em  + 100m2  = 
0(em). 


20.  If  we  switch  from  one  computer  to  another  that  is  100 
times  as  fast,  what  is  our  gain  in  problem  size  per  hour 
in  the  use  of  an  algorithm  that  is  O(m),  0(m2),  0(m5), 
0(em)l 


23.:  Bellmans  Principle.  Dijkstras  Algorithm 

We  continue  our  discussion  of  the  shortest  path  problem  in  a graph  G.  The  last  section 
concerned  the  special  case  that  all  edges  had  length  1 . But  in  most  applications  the  edges 
(;,  j)  will  have  any  lengths  /.(J  > 0,  and  we  now  turn  to  this  general  case,  which  is  of 
greater  practical  importance.  We  write  = °o  for  any  edge  (i,  j)  that  does  not  exist  in  G 
(setting  oo  + a = oo  for  any  number  a , as  usual). 

We  consider  the  problem  of  finding  shortest  paths  from  a given  vertex,  denoted  by  1 
and  called  the  origin,  to  all  other  vertices  2,  3,  • ■ ■ , n of  G.  We  let  Lj  denote  the  length 
of  a shortest  path  Pc.  1 —*j  in  G. 


SEC.  23.3  Bellman’s  Principle.  Dijkstra’s  Algorithm 
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THEOREM  1 


PROOF 


Bellman’s  Minimality  Principle  or  Optimality  Principle3 

IfPj:  1 —*j  is  a shortest  path  from  1 to  j in  G and  ( i,j ) is  the  last  edge  ofPj  (Fig.  486), 
then  Pp  1 — * i [obtained  by  dropping  (;,  j)  from  Pf\  is  a shortest  path  1 — » i. 


P. 

l 


Fig.  486.  Paths  P and  P,  in  Bellman’s  minimality  principle 

Suppose  that  the  conclusion  is  false.  Then  there  is  a path  P* : 1 —*i  that  is  shorter  than 
P^.  Hence,  if  we  now  add  (;,  j ) to  P* , we  get  a path  1 —*j  that  is  shorter  than  If  This 
contradicts  our  assumption  that  Pj  is  shortest.  ■ 

From  Bellman’s  principle  we  can  derive  basic  equations  as  follows.  For  fixed  j we  may 
obtain  various  paths  1 —*j  by  taking  shortest  paths  !\  for  various  i for  which  there  is  in 
G an  edge  (i,  j),  and  add  (i,j)  to  the  corresponding  Pj.  These  paths  obviously  have  lengths 
L,  + fj  ( Lj  = length  of  Pj).  We  can  now  take  the  minimum  over  i,  that  is,  pick  an  i for 
which  Lj  + /j j is  smallest.  By  the  Bellman  principle,  this  gives  a shortest  path  1 — » j.  It 
has  the  length 


L,  = 0 

(1)  j = 2,  • • • , n. 

Lj  = min  (Lj  + fj), 

i*j 

These  are  the  Bellman  equations.  Since  Ip  = 0 by  definition,  instead  of  min,  f j we  can 
simply  write  minj.  These  equations  suggest  the  idea  of  one  of  the  best-known  algorithms 
for  the  shortest  path  problem,  as  follows. 


Dijkstra’s  Algorithm  for  Shortest  Paths 

Dijkstra’s4  algorithm  is  shown  in  Table  23.2,  where  a connected  graph  G is  a graph  in 
which,  for  any  two  vertices  v and  w in  G,  there  is  a path  1 1 —>w.  The  algorithm  is  a 
labeling  procedure.  At  each  stage  of  the  computation,  each  vertex  v gets  a label,  either 

(PL)  a permanent  label  = length  Lv  of  a shortest  path  1 — > v 


or 


(TL)  a temporary  label  = upper  bound  L„  for  the  length  of  a shortest  path  1 — > v. 


3RICHARD  BELLMAN  (1920-1984),  American  mathematician,  known  for  his  work  in  dynamic  programming. 

4EDSGER  WYBE  DIJKSTRA  (1930-2002),  Dutch  computer  scientist,  1972  recipient  of  the  ACM  Turing 
Award.  His  algorithm  appeared  in  Numerische  Mathematik  1 (1959),  269-27 1 . 
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We  denote  by  and  ° T£E  the  sets  of  vertices  with  a permanent  label  and  with  a temporary 

label,  respectively.  The  algorithm  has  an  initial  step  in  which  vertex  1 gets  the  permanent 
label  Li  = 0 and  the  other  vertices  get  temporary  labels,  and  then  the  algorithm  alternates 
between  Steps  2 and  3.  In  Step  2 the  idea  is  to  pick  k “minimally.”  In  Step  3 the  idea  is 
that  the  upper  bounds  will  in  general  improve  (decrease)  and  must  be  updated  accordingly. 
Namely,  the  new  temporary  label  Lj  of  vertex  j will  be  the  old  one  if  there  is  no 
improvement  or  it  will  be  Lk  + I kj  if  there  is. 

Table  23.2  Dijkstra’s  Algorithm  for  Shortest  Paths 

ALGORITHM  DIJKSTRA  \G  = (V,  E),  V = {1,  ■ ■ ■ , n],  /„  for  all  (i,  j ) in  E] 

Given  a connected  graph  G = ( V , E)  with  vertices  1,  • • • , n and  edges  (i,  j ) having 
lengths  lij  > 0,  this  algorithm  determines  the  lengths  of  shortest  paths  from  vertex  1 to 
the  vertices  2,  • • • , n. 

INPUT:  Number  of  vertices  n,  edges  (i,  j),  and  lengths 
OUTPUT:  Lengths  Lj  of  shortest  paths  1 — * j,  j = 2,  • • • , n 

1.  Initial  step 

Vertex  1 gets  PL:  A,  = 0. 

Vertex  j (=  2,  ■ • • , n)  gets  TL:  Lj  = lkj  (=  °°  if  there  is  no  edge  (1,  /)  in  G). 
SetSP££  = { 1 } , 2T££  = {2,  3,  • • • , n}. 

2.  Fixing  a permanent  label 

Find  a A:  in  2 TiL  for  which  Lk  is  miminum,  set  Lk  = Lk.  Take  the  smallest  k if 
there  are  several.  Delete  k from  2 T!£  and  include  it  in  2 ?!£. 

If  2T£g  = 0 (that  is,  is  empty)  then 

OUTPUT  L2,  • • • , Ln.  Stop 

Else  continue  (that  is,  go  to  Step  3). 

3.  Updating  temporary  labels 

For  all  j in  2T££,  set  L j = minfc  {Lj,  Lk  + lkj } (that  is,  take  the  smaller  of  Lj  and 
Lk  + lkj  as  your  new  Lj). 

Go  to  Step  2. 

End  DIJKSTRA 


EXAMPLE  Application  of  Dijkstra’s  Algorithm 

Applying  Dijkstra’s  algorithm  to  the  graph  in  Fig.  487a,  find  shortest  paths  from  vertex  1 to  vertices  2,  3,  4. 
Solution.  We  list  the  steps  and  computations. 

1.  G = 0, Z2  = 8, Z3  = 5, Z4  = 7,  3'S=jl).  2T2  = {2,  3,  4) 

2.  L3  = min  {Lz,  Z3,  Z4)  = 5,  k = 3,  3'2=jl,3),  2T2  = {2,  4) 

3.  L2  = min  {8,  L3  + 132)  = min  {8,  5 + 1 } = 6 

L4  = min  {7,  L3  + /34)  = min  (7,  “}  = 7 

2.  L2  = min  {Lz.Li}  = min  {6,  7}  = 6,  k = 2,  9*2  ={1,2,3},  = { 4) 

3.  L4  = min  (7,  L2  + /24)  = min  {7,  6 + 2}  = 7 

2.  L4  = 7,  k = 4 


SPS  = {1,  2,  3,  4) 


T2  = 0. 
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Figure  487b  shows  the  resulting  shortest  paths,  of  lengths  L2  = 6,  L3  = 5,  /.4 


= 7. 


(b)  Shortest  paths  in  G 

Example  1 


Complexity.  Dijkstra’s  algorithm  is  0(nz). 

PROOF  Step  2 requires  comparison  of  elements,  first  n — 2,  the  next  time  n — 3,  etc.,  a total 
of  (n  — 2 )(n  — l)/2.  Step  3 requires  the  same  number  of  comparisons,  a total  of 
(n  — 2 )(n  — l)/2,  as  well  as  additions,  first  n — 2,  the  next  time  n — 3,  etc.,  again  a total  of 
(n  — 2)(n  — )/2.  Hence  the  total  number  of  operations  is  3 (n  — 2 )(n  — l)/2  = 0(n2). 


1.  The  net  of  roads  in  Fig.  488  connecting  four  villages 
is  to  be  reduced  to  minimum  length,  but  so  that  one 
can  still  reach  every  village  from  every  other  village. 
Which  of  the  roads  should  be  retained?  Find  the 
solution  (a)  by  inspection,  (b)  by  Dijkstra’s  algorithm. 


F'g-  488  Problem  1 


2.  Show  that  in  Dijkstra’s  algorithm,  for  L ^ there  is  a path 
P:  1 — » k of  length  L^. 

3.  Show  that  in  Dijkstra’s  algorithm,  at  each  instant  the 
demand  on  storage  is  light  (data  for  fewer  than  n edges). 


4-9 


DIJKSTRA’S  ALGORITHM 


For  each  graph  find  the  shortest  paths. 


1 


6 
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23.4  Shortest  Spanning  Trees:  Greedy  Algorithm 

So  far  we  have  discussed  shortest  path  problems.  We  now  turn  to  a particularly  important 
kind  of  graph,  called  a tree,  along  with  related  optimization  problems  that  arise  quite 
often  in  practice. 

By  definition,  a tree  T is  a graph  that  is  connected  and  has  no  cycles.  “Connected” 
was  defined  in  Sec.  23.3;  it  means  that  there  is  a path  from  any  vertex  in  T to  any  other 
vertex  in  T.  A cycle  is  a path  .v  — > t of  at  least  three  edges  that  is  closed  (f  = 5);  see  also 
Sec.  23.2.  Figure  489a  shows  an  example. 

CAUTION!  The  terminology  varies;  cycles  are  sometimes  also  called  circuits. 

A spanning  tree  T in  a given  connected  graph  G = (V,  E)  is  a tree  containing  all  the 
n vertices  of  G.  See  Fig.  489b.  Such  a tree  has  n — 1 edges.  (Proof?) 

A shortest  spanning  tree  T in  a connected  graph  G (whose  edges  (/',  j)  have  lengths 
kj  > 0)  is  a spanning  tree  for  which  Xkj  (sum  over  all  edges  of  T)  is  minimum  compared 
to  S/y  for  any  other  spanning  tree  in  G. 


Fig.  489.  Example  of  (a)  a cycle,  (b)  a spanning  tree  in  a graph 

Trees  are  among  the  most  important  types  of  graphs,  and  they  occur  in  various 
applications.  Familiar  examples  are  family  trees  and  organization  charts.  Trees  can  be 
used  to  exhibit,  organize,  or  analyze  electrical  networks,  producer-consumer  and  other 
business  relations,  information  in  database  systems,  syntactic  structure  of  computer 
programs,  etc.  We  mention  a few  specific  applications  that  need  no  lengthy  additional 
explanations. 

The  set  of  shortest  paths  from  vertex  1 to  the  vertices  2,  • • ■ , n in  the  last  section  forms 
a spanning  tree. 

Railway  lines  connecting  a number  of  cities  (the  vertices)  can  be  set  up  in  the  form  of 
a spanning  tree,  the  “length”  of  a line  (edge)  being  the  construction  cost,  and  one  wants 
to  minimize  the  total  construction  cost.  Similarly  for  bus  lines,  where  “length”  may  be 
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EXAMPLE  1 


the  average  annual  operating  cost.  Or  for  steamship  lines  (freight  lines),  where  “length” 
may  be  profit  and  the  goal  is  the  maximization  of  total  profit.  Or  in  a network  of  telephone 
lines  between  some  cities,  a shortest  spanning  tree  may  simply  represent  a selection  of 
lines  that  connect  all  the  cities  at  minimal  cost.  In  addition  to  these  examples  we  could 
mention  others  from  distribution  networks,  and  so  on. 

We  shall  now  discuss  a simple  algorithm  for  the  problem  of  finding  a shortest  spanning 
tree.  This  algorithm  (Table  23.3)  is  particularly  suitable  for  sparse  graphs  (graphs  with 
very  few  edges;  see  Sec.  23.1). 

Table  23.3  Kruskal’s5  Greedy  Algorithm  for  Shortest  Spanning  Trees 

Proceedings  of  the  American  Mathematical  Society  7 (1956),  48-50. 

ALGORITHM  KRUSKAL  [G  = (V,  E),  l,3  for  all  (i,  j)  in  E ] 

Given  a connected  graph  G = (V,  E)  with  vertices  1,  2,  ■ • • , n and  edges  (i,j)  having 
length  hj  > the  algorithm  determines  a shortest  spanning  tree  T in  G. 

INPUT:  Edges  (i,j)  of  G and  their  lengths  Zy 

OUTPUT:  Shortest  spanning  tree  T in  G 

1.  Order  the  edges  of  G in  ascending  order  of  length. 

2.  Choose  them  in  this  order  as  edges  of  T,  rejecting  an  edge  only  if  it  forms  a 
cycle  with  edges  already  chosen. 

If  w — 1 edges  have  been  chosen,  then 
OUTPUT  T (=  the  set  of  edges  chosen).  Stop 

End  KRUSKAL 


Application  of  Kruskal’s  Algorithm 

Using  Kruskal’s  algorithm,  we  shall  determine  a shortest  spanning  tree  in  the  graph  in  Fig.  490. 


Table  23.4 

Solution  in 

Example  1 

Edge 

Length 

Choice 

(3, 

6) 

1 

1st 

(1. 

2) 

2 

2nd 

(1. 

3) 

4 

3rd 

(4, 

5) 

6 

4th 

(2, 

3) 

7 

Reject 

(3, 

4) 

8 

5th 

(5, 

6) 

9 

(2, 

4) 

11 

Solution.  See  Table  23.4.  In  some  of  the  intermediate  stages  the  edges  chosen  form  a disconnected  graph 
(see  Fig.  491);  this  is  typical.  We  stop  after  n — 1 = 5 choices  since  a spanning  tree  has  n — 1 edges.  In  our 
problem  the  edges  chosen  are  in  the  upper  part  of  the  list.  This  is  typical  of  problems  of  any  size;  in  general, 
edges  farther  down  in  the  list  have  a smaller  chance  of  being  chosen. 


5JOSEPH  BERNARD  KRUSKAL  (1928-  ),  American  mathematician  who  worked  at  Bell  Laboratories. 
He  is  known  for  his  contributions  to  graph  theory  and  statistics. 
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The  efficiency  of  Kruskal’s  method  is  greatly  increased  by  double  labeling  of 
vertices. 

Double  Labeling  of  Vertices.  Each  vertex  i carries  a double  label  (r„  pj),  where 

rj  = Root  of  the  subtree  to  which  i belongs. 

Pi  = Predecessor  of  i in  its  subtree. 

Pi  = 0 for  roots. 

This  simplifies  rejecting. 

Rejecting.  If  (i,  j ) is  next  in  the  list  to  be  considered,  reject  (;,  j)  if  ri  = r.j  (that  is,  i and 
j are  in  the  same  subtree,  so  that  they  are  already  joined  by  edges  and  (i,  j)  would  thus 
create  a cycle).  If  ^ rp  include  (i,  j)  in  T. 

If  there  are  several  choices  for  r,,  choose  the  smallest.  If  subtrees  merge  (become  a 
single  tree),  retain  the  smallest  root  as  the  root  of  the  new  subtree. 

For  Example  1 the  double-label  list  is  shown  in  Table  23.5.  In  storing  it,  at  each  instant 
one  may  retain  only  the  latest  double  label.  We  show  all  double  labels  in  order  to  exhibit 
the  process  in  all  its  stages.  Labels  that  remain  unchanged  are  not  listed  again. 
Underscored  are  the  two  l’s  that  are  the  common  root  of  vertices  2 and  3,  the  reason  for 
rejecting  the  edge  (2,  3).  By  reading  for  each  vertex  the  latest  label  we  can  read  from 
this  list  that  1 is  the  vertex  we  have  chosen  as  a root  and  the  tree  is  as  shown  in  the  last 
part  of  Fig.  491. 


1 2 1 

/ 

V 

\ 

\ \ 

\ / 

\ / 

First 

Second  Third 

Fig.  491.  Choice 

Fourth 

process  in  Example  1 

Fifth 

Table  23.5 

List  of  Double  Labels  in  Example  1 

Vertex 

Choice  1 
(3,6) 

Choice  2 
(1,2) 

Choice  3 
(1,3) 

Choice  4 
(4,  5) 

Choice  5 
(3,  4) 

1 

(1.0) 

2 

(1.  1) 

3 

(3,0) 

(1,  1) 

4 

(4,  0) 

(1,3) 

5 

(4,  4) 

(1,4) 

6 

(3,3) 

(1,3) 
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This  is  made  possible  by  the  predecessor  label  that  each  vertex  carries.  Also,  for  accepting 
or  rejecting  an  edge  we  have  to  make  only  one  comparison  (the  roots  of  the  two  endpoints 
of  the  edge). 

Ordering  is  the  more  expensive  part  of  the  algorithm.  It  is  a standard  process  in 
data  processing  for  which  various  methods  have  been  suggested  (see  Sorting  in  Ref. 
[E25]  listed  in  App.  1).  For  a complete  list  of  m edges,  an  algorithm  would  be 
0(m  log 2 m),  but  since  the  n — 1 edges  of  the  tree  are  most  likely  to  be  found  earlier, 
by  inspecting  the  q (<  m)  topmost  edges,  for  such  a list  of  q edges  one  would  have 
0(q  log  2 /n). 


FR  QBLEffl=^ET— 
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KRUSKAL’S  GREEDY  ALGORITHM 


Find  a shortest  spanning  tree  by  Kruskal’s  algorithm. 
Sketch  it. 


7.  CAS  PROBLEM.  Kruskal’s  Algorithm.  Write  a 
corresponding  program.  (Sorting  is  discussed  in  Ref. 
[E25]  listed  in  App.  1.) 

8.  To  get  a minimum  spanning  tree,  instead  of  adding 
shortest  edges,  one  could  think  of  deleting  longest 
edges.  For  what  graphs  would  this  be  feasible? 
Describe  an  algorithm  for  this. 

9.  Apply  the  method  suggested  in  Prob.  8 to  the  graph  in 
Example  1 . Do  you  get  the  same  tree? 

10.  Design  an  algorithm  for  obtaining  longest  spanning 
trees. 

11.  Apply  the  algorithm  in  Prob.  10  to  the  graph  in 
Example  1 . Compare  with  the  result  in  Example  1 . 

12.  Forest.  A (not  necessarily  connected)  graph  without 
cycles  is  called  a forest.  Give  typical  examples  of 
applications  in  which  graphs  occur  that  are  forests  or 
trees. 


4 


3 
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Dallas 

Denver 

Los  Angeles 

New  York 

Washington,  DC 

Chicago 

800 

900 

1800 

700 

650 

Dallas 

650 

1300 

1350 

1200 

Denver 

850 

1650 

1500 

Los  Angeles 

2500 

2350 

New  York 

200 

13.  Air  cargo.  Find  a shortest  spanning  tree  in  the 
complete  graph  of  all  possible  15  connections  between 
the  six  cities  given  (distances  by  airplane,  in  miles, 
rounded).  Can  you  think  of  a practical  application  of 
the  result? 


GENERAL  PROPERTIES  OF  TREES 

Prove  the  following.  Hint.  Use  Prob.  14  in  proving  15  and 
18;  use  Probs.  16  and  18  in  proving  20. 

14.  Uniqueness.  The  path  connecting  any  two  vertices  u 
and  v in  a tree  is  unique. 

15.  If  in  a graph  any  two  vertices  are  connected  by  a unique 
path,  the  graph  is  a tree. 


16.  If  a graph  has  no  cycles,  it  must  have  at  least  2 vertices 
of  degree  1 (definition  in  Sec.  23.1). 

17.  A tree  with  exactly  two  vertices  of  degree  1 must  be  a 
path. 

18.  A tree  with  n vertices  has  n — 1 edges.  (Proof  by 
induction.) 

19.  If  two  vertices  in  a tree  are  joined  by  a new  edge,  a 
cycle  is  formed. 

20.  A graph  with  n vertices  is  a tree  if  and  only  if  it  has 
n — 1 edges  and  has  no  cycles. 


23.5 


Prim’s6  algorithm,  shown  in  Table  23.6,  is  another  popular  algorithm  for  the  shortest 
spanning  tree  problem  (see  Sec.  23.4).  This  algorithm  avoids  ordering  edges  and  gives  a 
tree  T at  each  stage,  a property  that  Kruskal’s  algorithm  in  the  last  section  did  not  have 
(look  back  at  Fig.  491  if  you  did  not  notice  it). 

In  Prim’s  algorithm,  starting  from  any  single  vertex,  which  we  call  1,  we  “grow”  the 
tree  T by  adding  edges  to  it,  one  at  a time,  according  to  some  rule  (in  Table  23.6)  until 
T finally  becomes  a spanning  tree,  which  is  shortest. 

We  denote  by  U the  set  of  vertices  of  the  growing  tree  T and  by  S the  set  of  its  edges. 
Thus,  initially  U = { 1 } and  S = 0;  at  the  end,  U = V,  the  vertex  set  of  the  given  graph 
G = (V,  E),  whose  edges  (;,  j)  have  length  lvj  > 0,  as  before. 


Shortest  Spanning  Trees: 
Prims  Algorithm 


6ROBERT  CLAY  PRIM  (1921-  ),  American  computer  scientist  at  General  Electric,  Bell  Laboratories,  and 
Sandia  National  Laboratories. 


SEC.  23.5  Shortest  Spanning  Trees:  Prim’s  Algorithm 


989 


Thus  at  the  beginning  (Step  1)  the  labels 

A2,  ■ ■ • , A n °f  the  vertices  2,  • • ■ , n 

are  the  lengths  of  the  edges  connecting  them  to  vertex  1 (or  °°  if  there  is  no  such  edge  in 
G).  And  we  pick  (Step  2)  the  shortest  of  these  as  the  first  edge  of  the  growing  tree  T and 
include  its  other  end  j in  U (choosing  the  smallest  j if  there  are  several,  to  make  the  process 
unique).  Updating  labels  in  Step  3 (at  this  stage  and  at  any  later  stage)  concerns  each 
vertex  k not  yet  in  U.  Vertex  k has  label  A;-  = lup-)^-  from  before.  If  Ijp-  < \p-,  this  means 
that  k is  closer  to  the  new  member  j just  included  in  U than  k is  to  its  old  “closest  neighbor” 
i(k)  in  U.  Then  we  update  the  label  of  k,  replacing  A;-  = lppk\k  by  A k = Ijk  and  setting 
i(k)  = j.  If,  however,  Ijk  = A;,  (the  old  label  of  k),  we  don’t  touch  the  old  label.  Thus  the 
label  A/,  always  identifies  the  closest  neighbor  of  k in  U,  and  this  is  updated  in  Step  3 as 
U and  the  tree  T grow.  From  the  final  labels  we  can  backtrack  the  final  tree,  and  from  their 
numeric  values  we  compute  the  total  length  (sum  of  the  lengths  of  the  edges)  of  this  tree. 

Prim's  algorithm  is  useful  for  computer  network  design,  cable,  distribution  networks, 
and  transportation  networks. 


Table  23.(  Prim’s  Algorithm  for  Shortest  Spanning  Trees 

Bell  System  Technical  Journal  36  (1957),  1389-1401. 

For  an  improved  version  of  the  algorithm,  see  Cheriton  and  Tarjan,  SIAM  Journal  on  Computation  5 
(1976),  724-742. 


ALGORITHM  PRIM  [G  = (V,  E),  V = { 1,  • • • , n},  ltJ  for  all  (i,  j)  in  E] 

Given  a connected  graph  G = (V,  E)  with  vertices  1,  2,  • • • , n and  edges  (i,  j)  having 
length  hj  > 0,  this  algorithm  determines  a shortest  spanning  tree  T in  G and  its  length 

UT). 

INPUT:  n,  edges  (;',  /)  of  G and  their  lengths  l ^ 

OUTPUT:  Edge  set  S of  a shortest  spanning  tree  T in  G;  UT) 

[Initially,  all  vertices  are  unlabeled .] 

1.  Initial  step 

Set  i(k)  = 1,  U = {1},  S = 0. 

Label  vertex  k (=  2,  • • • , n)  with  Afc  = lik  [=  °°  if  G has  no  edge  (1,  k)]. 

2.  Addition  of  an  edge  to  the  tree  T 

Let  A j be  the  smallest  Afc  for  vertex  k not  in  U.  Include  vertex  j in  U and  edge 
(i(j),j)  in  S. 

If  U = V then  compute 

L(T)  = 2/y  (sum  over  all  edges  in  S’) 

OUTPUT  S,  UT).  Stop 

[S’  is  the  edge  set  of  a shortest  spanning  tree  T in  G.] 

Else  continue  (that  is,  go  to  Step  3). 

3.  Label  updating 

For  every  k not  in  U,  if  fk  < \k,  then  set  Afc  = fk  and  i(k)  = j. 

Go  to  Step  2. 


End  PRIM 
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EXAMPLE  1 Application  of  Prim’s  Algorithm 

Find  a shortest  spanning  tree  in  the  graph  in  Fig.  492  (which  is  the  same  as  in  Example  1,  Sec.  23.4, 
so  that  we  can  compare). 

Solution.  The  steps  are  as  follows. 

1.  i(k)  = 1,  U = { 1 },  S = 0,  initial  labels  see  Table  23.7. 

2.  A2  = /12  — 2 is  smallest,  U = { 1,  2},  S'  = {(1,  2)}. 

3.  Update  labels  as  shown  in  Table  23.7,  column  (I). 

2.  A3  = /13  = 4 is  smallest,  t/  = {1,  2,  3},  S = {(1,  2),  (1,  3)}. 

3.  Update  labels  as  shown  in  Table  23.7,  column  (II). 

2.  A6  = /36  = 1 is  smallest,  U = { 1,  2,  3,  6},  S = {(1,  2),  (1,  3),  (3,  6)}. 

3.  Update  labels  as  shown  in  Table  23.7,  column  (III). 

2.  A4  = /34  = 8 is  smallest,  U = {1,  2,  3,  4,  6},  S = {(1,  2),  (1,  3),  (3,  4),  (3,  6)}. 

3.  Update  labels  as  shown  in  Table  23.7,  column  (IV). 

2.  A5  = /45  = 6 is  smallest,  U = V,  S = (1,  2),  (1,  3),  (3,  4),  (3,  6),  (4,  5).  Stop. 

The  tree  is  the  same  as  in  Example  1,  Sec.  23.4.  Its  length  is  21.  You  will  find  it  interesting  to 
compare  the  growth  process  of  the  present  tree  with  that  in  Sec.  23.4. 


Table  23.7  Labeling  of  Vertices  in  Example  1 


Vertex 

Initial 

Label 

Relabeling 

(I) 

(II) 

(III) 

(IV) 

2 

1 12  = 2 

— 

— 

— 

— 

3 

1 13  = 4 

/is  = 4 

— 

— 

— 

4 

00 

^24  = 11 

OO 

II 

CO 

00 

II 

CO 

— 

5 

00 

00 

00 

l65  = 9 

I45  = 6 

6 

00 

00 

I36  = 1 

— 

— 

Fig.  492,  Graph  in 
Example  1 
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SHORTEST  SPANNING  TREES.  PRIM’S 
ALGORITHM 

1.  When  will  S — E at  the  end  in  Prim’s  algorithm? 

2.  Complexity.  Show  that  Prim’s  algorithm  has  com- 
plexity 0(n2). 

3.  What  is  the  result  of  applying  Prim’s  algorithm  to  a 
graph  that  is  not  connected? 

4.  If  for  a complete  graph  (or  one  with  very  few  edges 
missing),  our  data  is  an  n X n distance  table  (as  in  Prob. 
13,  Sec.  23.4),  show  that  the  present  algorithm  [which 
is  0(n2)]  cannot  easily  be  replaced  by  an  algorithm  of 
order  less  than  0(n2). 

5.  How  does  Prim’s  algorithm  prevent  the  generation  of 
cycles  as  you  grow  T? 
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10.  For  the  graph  in  Prob.  6,  Sec.  23.4. 

11.  For  the  graph  in  Prob.  4,  Sec.  23.4. 

12.  For  the  graph  in  Prob.  2,  Sec.  23.4. 

13.  CAS  PROBLEM.  Prim’s  Algorithm.  Write  a program 
and  apply  it  to  Probs.  6-9. 

14.  TEAM  PROJECT.  Center  of  a Graph  and  Related 
Concepts,  (a)  Distance,  Eccentricity.  Call  the  length 
of  a shortest  path  u—*l)  in  a graph  G = (V,  E)  the 


distance  d{u , v)  from  u to  v.  For  fixed  u , call  the 
greatest  d(u,  v)  as  v ranges  over  V the  eccentricity  e(u) 
of  u.  Find  the  eccentricity  of  vertices  1,  2,  3 in  the 
graph  in  Prob.  7. 

(b)  Diameter,  Radius,  Center.  The  diameter  d(G) 
of  a graph  G = (V,  E)  is  the  maximum  of  d(u , v ) as  u 
and  v vary  over  V,  and  the  radius  r(G)  is  the  smallest 
eccentricity  e(v)  of  the  vertices  v.  A vertex  v with 
e(v)  = r{G)  is  called  a central  vertex.  The  set  of  all 
central  vertices  is  called  the  center  of  G.  Find 
d(G),  r{G),  and  the  center  of  the  graph  in  Prob.  7. 

(c)  What  are  the  diameter,  radius,  and  center  of  the 
spanning  tree  in  Example  1 of  the  text? 

(d)  Explain  how  the  idea  of  a center  can  be  used  in  setting 
up  an  emergency  service  facility  on  a transportation 
network.  In  setting  up  a fire  station,  a shopping  center. 
How  would  you  generalize  the  concepts  in  the  case  of  two 
or  more  such  facilities? 

(e)  Show  that  a tree  T whose  edges  all  have  length  1 
has  center  consisting  of  either  one  vertex  or  two 
adjacent  vertices. 

(f)  Set  up  an  algorithm  of  complexity  O(n)  for  finding 
the  center  of  a tree  T. 


23.6  Flows  in  Networks 


After  shortest  path  problems  and  problems  for  trees,  as  a third  large  area  in  combinatorial 
optimization  we  discuss  flow  problems  in  networks  (electrical,  water,  communication, 
traffic,  business  connections,  etc.),  turning  from  graphs  to  digraphs  (directed  graphs;  see 
Sec.  23.1). 

By  definition,  a network  is  a digraph  G = (V,  E)  in  which  each  edge  (i,j)  has  assigned 
to  it  a capacity  Cy  > 0 [=  maximum  possible  flow  along  (i,  y ) ] . and  at  one  vertex,  s, 
called  the  source,  a flow  is  produced  that  flows  along  the  edges  of  the  digraph  G to  another 
vertex,  f,  called  the  target  or  sink,  where  the  flow  disappears. 

In  applications,  this  may  be  the  flow  of  electricity  in  wires,  of  water  in  pipes,  of  cars 
on  roads,  of  people  in  a public  transportation  system,  of  goods  from  a producer  to 
consumers,  of  e-mail  from  senders  to  recipients  over  the  Internet,  and  so  on. 

We  denote  the  flow  along  a (directed!)  edge  (/,_/)  by  fy  and  impose  two  conditions: 

1.  For  each  edge  (i,  j)  in  G the  flow  does  not  exceed  the  capacity  ctj, 

(1)  0 =ftJ  = Cij  (“Edge  condition”). 

2.  For  each  vertex  i,  not  s or  t. 


Inflow  = Outflow 


(“Vertex  condition,”  “Kirchhoff’s  law”); 
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in  a formula, 

(2) 


Inflow 


2/«  = 


Outflow 


0 if  vertex  i =A  s,  i + t, 
—f  at  the  source  s, 
fat  the  target  (sink)  t. 


where  / is  the  total  flow  (and  at  s the  inflow  is  zero,  whereas  at  f the  outflow  is  zero). 
Figure  493  illustrates  the  notation  (for  some  hypothetical  figures). 


Fig.  493.  Notation  in  (2):  inflow  and  outflow  for  a vertex  i (not  s or  t) 


Paths 

By  a path  V\  —*  vk  from  a vertex  V\  to  a vertex  vk  in  a digraph  G we  mean  a sequence 
of  edges 


(ui,  v2),  (v2,  v3),  • • • , (Ufe-r,  vk), 

regardless  of  their  directions  in  G,  that  forms  a path  as  in  a graph  (see  Sec.  23.2).  Hence 
when  we  travel  along  this  path  from  V\  to  vk  we  may  traverse  some  edge  in  its  given 
direction — then  we  call  it  a forward  edge  of  our  path — or  opposite  to  its  given  direction — 
then  we  call  it  a backward  edge  of  our  path.  In  other  words,  our  path  consists  of  one-way 
streets,  and  forward  edges  (backward  edges)  are  those  that  we  travel  in  the  right  direction 
(in  the  wrong  direction).  Figure  494  shows  a forward  edge  (u,  v)  and  a backward  edge  (w,  v) 
of  a path  —*vk. 

CAUTION!  Each  edge  in  a network  has  a given  direction,  which  we  cannot  change. 
Accordingly,  if  (m,  v)  is  a forward  edge  in  a path  v1—*vk,  then  (u,  v)  can  become  a 
backward  edge  only  in  another  path  xi  — in  which  it  is  an  edge  and  is  traversed  in  the 
opposite  direction  as  one  goes  from  X\  to  xf  see  Fig.  495.  Keep  this  in  mind,  to  avoid 
misunderstandings. 


Fig.  494.  Forward  edge  (u,  v)  and  Fig.  495.  Edge  (u,  v)  as  forward  edge  in  the  path 

backward  edge  (w,  v)  of  a path  v,  — > vk  v,  — » vk  and  as  backward  edge  in  the  path  x1  — > xy 

Flow  Augmenting  Paths 

Our  goal  will  be  to  maximize  the  flow  from  the  source  5 to  the  target  t of  a given  network. 
We  shall  do  this  by  developing  methods  for  increasing  an  existing  flow  (including  the 
special  case  in  which  the  latter  is  zero).  The  idea  then  is  to  find  a path  P:  s—>t  all  of 
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DEFINITION 


EXAMPLE  1 


whose  edges  are  not  fully  used,  so  that  we  can  push  additional  flow  through  P.  This 
suggests  the  following  concept. 


Flow  Augmenting  Path 

A flow  augmenting  path  in  a network  with  a given  flow  fj  on  each  edge  ( i , j ) is  a 
path  P:  s^t  such  that 

(i)  no  forward  edge  is  used  to  capacity;  thus  fj  < Cij  for  these; 

(ii)  no  backward  edge  has  flow  0;  thus  fj  > 0 for  these. 


Flow  Augmenting  Paths 

Find  flow  augmenting  paths  in  the  network  in  Fig.  496,  where  the  first  number  is  the  capacity  and  the  second 
number  a given  flow. 


First  number  = Capacity,  Second  number  = Given  flow 

Solution.  In  practical  problems,  networks  are  large  and  one  needs  a systematic  method  for  augmenting 
flows , which  we  discuss  in  the  next  section.  In  our  small  network,  which  should  help  to  illustrate  and  clarify 
the  concepts  and  ideas,  we  can  find  flow  augmenting  paths  by  inspection  and  augment  the  existing  flow  / = 9 
in  Fig.  496.  (The  outflow  from  s is  5 + 4 = 9,  which  equals  the  inflow  6 + 3 into  t.) 

We  use  the  notation 

A ij  = — fj  for  forward  edges 

A ij  = fy  for  backward  edges 

A = min  A ^ taken  over  all  edges  of  a path. 

From  Fig.  496  we  see  that  a flow  augmenting  path  P±:  t is  P±:  1 — 2 — 3 — 6 (Fig.  497),  with 

A 12  = 20  — 5 = 15,  etc.,  and  A = 3.  Hence  we  can  use  Pi  to  increase  the  given  flow  9 to  / = 9 + 3 = 12. 
All  three  edges  of  Pi  are  forward  edges.  We  augment  the  flow  by  3.  Then  the  flow  in  each  of  the  edges  of  Pi 
is  increased  by  3,  so  that  we  now  have/12  = 8 (instead  of  5),/23  =11  (instead  of  8),  and  f^Q  = 9 (instead  of 
6).  Edge  (2,  3)  is  now  used  to  capacity.  The  flow  in  the  other  edges  remains  as  before. 

We  shall  now  try  to  increase  the  flow  in  this  network  in  Fig.  496  beyond  / = 12. 

There  is  another  flow  augmenting  path  P2:  s—>t,  namely,  P2il— 4 — 5 — 3 — 6 (Fig.  497).  It  shows  how 
a backward  edge  comes  in  and  how  it  is  handled.  Edge  (3,  5)  is  a backward  edge.  It  has  flow  2,  so  that  A 35  = 2. 
We  compute  A 14  = 10  — 4 = 6,  etc.  (Fig.  497)  and  A = 2.  Hence  we  can  use  P2  for  another  augmentation  to 
get / = 12  + 2 = 14.  The  new  flow  is  shown  in  Fig.  498.  No  further  augmentation  is  possible.  We  shall  confirm 
later  that  / = 14  is  maximum. 


Fig.  497.  Flow  augmenting  paths  in  Example  1 
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Cut  Sets 

A cut  set  is  a set  of  edges  in  a network.  The  underlying  idea  is  simple  and  natural.  If  we 
want  to  find  out  what  is  flowing  from  j to  t in  a network,  we  may  cut  the  network 
somewhere  between  s and  t (Fig.  498  shows  an  example)  and  see  what  is  flowing  in  the 
edges  hit  by  the  cut,  because  any  flow  from  s to  t must  sometimes  pass  through  some  of 
these  edges.  These  form  what  is  called  a cut  set.  [In  Fig.  498,  the  cut  set  consists  of  the 
edges  (2,  3),  (5,  2),  (4,  5).]  We  denote  this  cut  set  by  ( S , T).  Here  S is  the  set  of  vertices 
on  that  side  of  the  cut  on  which  .v  lies  (S  = {.v,  2,  4 } for  the  cut  in  Fig.  498)  and  T is  the 
set  of  the  other  vertices  (T  = { 3,  5,  t]  in  Fig.  498).  We  say  that  a cut  partitions  the  vertex 
set  V into  two  parts  S and  T.  Obviously,  the  corresponding  cut  set  (.S',  T)  consists  of  all 
the  edges  in  the  network  with  one  end  in  S and  the  other  end  in  T. 


Cut 


Fig.  498.  Maximum  flow  in  Example  1 


By  definition,  the  capacity  cap  ( S , T)  of  a cut  set  (S,  T)  is  the  sum  of  the  capacities  of 
all  forward  edges  in  ( S , T)  (forward  edges  only!),  that  is,  the  edges  that  are  directed  from 
S to  T, 

(3)  cap  (.S,  T)  = Sty  [sum  over  the  forward  edges  of  (S,  7')]. 

Thus,  cap  (S’,  T)  = 1 1 + 7 = 18  in  Fig.  498. 

Explanation.  This  can  be  seen  as  follows.  Look  at  Fig.  498.  Recall  that  for  each  edge 
in  that  figure,  the  first  number  denotes  capacity  and  the  second  number  flow.  Intuitively, 
you  can  think  of  the  edges  as  roads,  where  the  capacity  of  the  road  is  how  many  cars  can 
actually  be  on  the  road,  and  the  flow  denotes  how  many  cars  actually  are  on  the  road.  To 
compute  capacity  cap  ( S , T)  we  are  only  looking  at  the  first  number  on  the  edges.  Take 
a look  and  see  that  the  cut  physically  cuts  three  edges,  that  is,  (2,  3),  (4,  5),  and  (5,  2). 
The  cut  concerns  only  forward  edges  that  are  being  cut,  so  it  concerns  edges  (2,  3)  and 
(4,  5)  (and  does  not  include  edge  (5,  2)  which  is  also  being  cut,  but  since  it  goes  backwards, 
it  does  not  count).  Hence  (2,  3)  contributes  1 1 and  (4,  5)  contributes  7 to  the  capacity  cap 
(5,  T),  for  a total  of  18  in  Fig.  498.  Hence  cap  (S,  T ) = 18. 

The  other  edges  (directed  from  T to  S ) are  called  backward  edges  of  the  cut  set  ( S , T), 
and  by  the  net  flow  through  a cut  set  we  mean  the  sum  of  the  flows  in  the  forward  edges 
minus  the  sum  of  the  flows  in  the  backward  edges  of  the  cut  set. 

CAUTION!  Distinguish  well  between  forward  and  backward  edges  in  a cut  set  and  in 
a path:  (5,  2)  in  Fig.  498  is  a backward  edge  for  the  cut  shown  but  a forward  edge  in  the 
path  1— 4 — 5 — 2 — 3 — 6. 

For  the  cut  in  Fig.  498  the  net  flow  is  1 1 + 6 — 3 = 14.  For  the  same  cut  in  Fig.  496 
(not  indicated  there),  the  net  flow  is  8 + 4 — 3 = 9.  In  both  cases  it  equals  the  flow  /. 
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THEOREM  1 


PROOF 


THEOREM  2 


PROOF 


We  claim  that  this  is  not  just  by  chance,  but  cuts  do  serve  the  purpose  for  which  we  have 
introduced  them: 


Net  Flow  in  Cut  Sets 

Any  given  flow  in  a network  G is  the  net  flow  through  any  cut  set  ( S , T)  of  G. 


By  Kirchhoff  s law  (2),  multiplied  by  —1,  at  a vertex  i we  have 


ro 

(4)  'Zh  - 2 fu  = \ 

, j . . If 

Outflow  Inflow 

Here  we  can  sum  over  j and  / from  1 to  n (=  number  of  vertices)  by  putting  f)j  = 0 for 
j = i and  also  for  edges  without  flow  or  nonexisting  edges;  hence  we  can  write  the  two 
sums  as  one, 


if  i + s,  t, 
if  i = s. 


2( fa -fa ) 


if  i # j,  t, 
if  i = s. 


We  now  sum  over  all  i in  S.  Since  s is  in  S,  this  sum  equals/: 


(5) 


v 2 (/«-/*)  =/• 

ieS  jeV 


We  claim  that  in  this  sum,  only  the  edges  belonging  to  the  cut  set  contribute.  Indeed, 
edges  with  both  ends  in  T cannot  contribute,  since  we  sum  only  over  i in  S;  but  edges 
(i,j)  with  both  ends  in  S contribute  +fj  at  one  end  and  —fj  at  the  other,  a total  contribution 
of  0.  Hence  the  left  side  of  (5)  equals  the  net  flow  through  the  cut  set.  By  (5),  this  is  equal 
to  the  flow  / and  proves  the  theorem.  ■ 


This  theorem  has  the  following  consequence,  which  we  shall  also  need  later  in  this 
section. 


Upper  Bound  for  Flows 

A flow  fin  a network  G cannot  exceed  the  capacity  of  any  cut  set  ( S , T ) in  G. 


By  Theorem  1 the  flow  / equals  the  net  flow  through  the  cut  set,  f = f\—  /2,  where  j\ 
is  the  sum  of  the  flows  through  the  forward  edges  and  f2  (=  0)  is  the  sum  of  the  flows 
through  the  backward  edges  of  the  cut  set.  Thus  f = fi-  Now/!  cannot  exceed  the  sum 
of  the  capacities  of  the  forward  edges;  but  this  sum  equals  the  capacity  of  the  cut  set,  by 
definition.  Together,/^  cap  (.S',  T),  as  asserted. 
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THEOREM  3 


PROOF 


THEOREM  4 


PROOF 


Cut  sets  will  now  bring  out  the  full  importance  of  augmenting  paths: 


Main  Theorem.  Augmenting  Path  Theorem  for  Flows 

A flow  from  s to  t in  a network  G is  maximum  if  and  only  if  there  does  not  exist  a 
flow  augmenting  path  s — » t in  G. 


(a)  If  there  is  a flow  augmenting  path  P:  s — » t,  we  can  use  it  to  push  through  it  an  additional 
flow.  Hence  the  given  flow  cannot  be  maximum. 

(b)  On  the  other  hand,  suppose  that  there  is  no  flow  augmenting  path  s — » t in  G. 
Let  So  be  the  set  of  all  vertices  i (including  s)  such  that  there  is  a flow  augmenting 
path  s —*  i,  and  let  To  be  the  set  of  the  other  vertices  in  G.  Consider  any  edge  (i,  j ) with 
i in  So  and  j in  To-  Then  we  have  a flow  augmenting  path  s — » i since  i is  in  So,  but 
s — > i — * j is  not  flow  augmenting  because  j is  not  in  So-  Hence  we  must  have 


Otherwise  we  could  use  (i,  j ) to  get  a flow  augmenting  path  s — » i — » j.  Now  (5o,  To) 
defines  a cut  set  (since  t is  in  To;  why?).  Since  by  (6),  forward  edges  are  used  to  capacity 
and  backward  edges  carry  no  flow,  the  net  flow  through  the  cut  set  (So,  To)  equals  the 
sum  of  the  capacities  of  the  forward  edges,  which  is  cap  (So,  To)  by  definition.  This 
net  flow  equals  the  given  flow  / by  Theorem  1 . Thus  / = cap  (So,  To).  We  also  have 
/ Si  cap  (So,  To)  by  Theorem  2.  Hence  / must  be  maximum  since  we  have  reached 
equality.  ■ 

The  end  of  this  proof  yields  another  basic  result  (by  Ford  and  Fulkerson,  Canadian  Journal 
of  Mathematics  8 (1956),  399-404),  namely,  the  so-called 


Max-Flow  Min-Cut  Theorem 

The  maximum  flow  in  any  network  G equals  the  capacity  of  a “minimum  cut  set” 
(=  a cut  set  of  minimum  capacity)  in  G. 


We  have  just  seen  that/  = cap  (So,  To)  for  a maximum  flow/and  a suitable  cut  set  (So,  To). 
Now  by  Theorem  2 we  also  have  / Si  cap  (S,  T)  for  this  / and  any  cut  set  (S,  T)  in  G. 
Together,  cap  (So,  To)  Si  cap  (S,  T).  Hence  (So,  To)  is  a minimum  cut  set. 

The  existence  of  a maximum  flow  in  this  theorem  follows  for  rational  capacities  from 
the  algorithm  in  the  next  section  and  for  arbitrary  capacities  from  the  Edmonds-Karp  BFS 
also  in  that  section. 


The  two  basic  tools  in  connection  with  networks  are  flow  augmenting  paths  and  cut  sets. 
In  the  next  section  we  show  how  flow  augmenting  paths  can  be  used  in  an  algorithm  for 
maximum  flows. 
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1-6  CUT  SETS,  CAPACITY 
Find  T and  cap  (S,  T)  for: 

1.  Fig.  498,  S = {1,2,4,5) 

2.  Fig.  499,  S = {1,2,3) 

3.  Fig.  498,  S = (1,2,3) 

4.  Fig.  499,  S = {1,2} 

5.  Fig.  499,  S=  {1,2,4,  5} 


6.  Fig.  498,  S = {1,3,5) 


7-8 


MINIMUM  CUT  SET 


Find  a minimum  cut  set  and  its  capacity  for  the  network: 

7.  In  Fig.  499 

8.  In  Fig.  496.  Verify  that  its  capacity  equals  the  maximum 
flow. 


9.  Why  are  backward  edges  not  considered  in  the 
definition  of  the  capacity  of  a cut  set? 

10.  Incremental  network.  Sketch  the  network  in  Fig.  499, 
and  on  each  edge  (i,  j)  write  Cy  — fj  and  j\j.  Do  you 
recognize  that  from  this  “incremental  network”  one  can 
more  easily  see  flow  augmenting  paths? 


11.  Omission  of  edges.  Which  edges  could  be  omitted 
from  the  network  in  Fig.  499  without  decreasing  the 
maximum  flow? 


12-15 


FLOW  AUGMENTING  PATHS 


Find  flow  augmenting  paths: 


12. 


i,  o 


16-19 


MAXIMUM  FLOW 


Find  the  maximum  flow  by  inspection: 


16.  In  Prob.  13 


18.  In  Prob.  12 

19. 

s 


5 


20.  Find  another  maximum  flow/=  15  in  Prob.  19. 
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23.7  Maximum  Flow:  Ford-Fulkerson  Algorithm 

Flow  augmenting  paths,  as  discussed  in  the  last  section,  are  used  as  the  basic  tool  in  the 
Ford-Fulkerson7  algorithm  in  Table  23.8  in  which  a given  flow  (for  instance,  zero  flow  in 
all  edges)  is  increased  until  it  is  maximum.  The  algorithm  accomplishes  the  increase  by  a 
stepwise  construction  of  flow  augmenting  paths,  one  at  a time,  until  no  further  such  paths 
can  be  constructed,  which  happens  precisely  when  the  flow  is  maximum. 

In  Step  1,  an  initial  flow  may  be  given.  In  Step  3,  a vertex  j can  be  labeled  if  there  is 
an  edge  (i,  j)  with  i labeled  and 


Cij  > fij  (“forward  edge”) 

or  if  there  is  an  edge  (j,  i ) with  i labeled  and 

fji  > 0 (“backward  edge”). 

To  scan  a labeled  vertex  i means  to  label  every  unlabeled  vertex  j adjacent  to  i that  can  be 
labeled.  Before  scanning  a labeled  vertex  i , scan  all  the  vertices  that  got  labeled  before  i. 
This  BFS  (Breadth  First  Search)  strategy  was  suggested  by  Edmonds  and  Karp  in  1972 
(. Journal  of  the  Association  for  Computing  Machinery  19,  248-64).  It  has  the  effect  that  one 
gets  shortest  possible  augmenting  paths. 

Table  23.8  Ford-Fulkerson  Algorithm  for  Maximum  Flow 

Canadian  Journal  of  Mathematics  9 (1957),  2 10-2  IB 

ALGORITHM  FORD-FULKERSON 

[G  = (V,  E),  vertices  1 (=  s),  ■ ■ ■ , n (=  r),  edges  (i,  /),  Cy] 

This  algorithm  computes  the  maximum  flow  in  a network  G with  source  s,  sink  f,  and 
capacities  Cy  > 0 of  the  edges  (/,  j). 

INPUT:  n,  s = 1 , t = n,  edges  (i,  j)  of  G,  q; 

OUTPUT:  Maximum  flow  / in  G 

1.  Assign  an  initial  flow  fj  (for  instance,  fij  = 0 for  all  edges),  compute  f. 

2.  Label  s by  0.  Mark  the  other  vertices  “unlabeled.  ” 

3.  Find  a labeled  vertex  i that  has  not  yet  been  scanned.  Scan  i as  follows.  For  every 
unlabeled  adjacent  vertex  j,  if  Cy  > fij,  compute 

[Ay  if  i = I 

Ay  Cij  — fij  and  A j s 

(min  (Aj,  Ay)  if  z > 1 

and  label  j with  a “forward  label”  ( i+ , Aj);  or  if  fji  > 0,  compute 

A j = min  (A  hfji) 

and  label  j by  a “backward  label”  (i~,  Aj). 


7LESTER  RANDOLPH  FORD  Jr.  (1927-  ) and  DELBERT  RAY  FULKERSON  (1924-1976),  American 
mathematicians  known  for  their  pioneering  work  on  flow  algorithms. 
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If  no  such  j exists  then  OUTPUT  f.  Stop 
[/  is  the  maximum  flow.] 

Else  continue  (that  is,  go  to  Step  4). 

4.  Repeat  Step  3 until  t is  reached. 

[This  gives  a flow  augmenting  path  P:  s — > f.] 

If  it  is  impossible  to  reach  t then  OUTPUT  f.  Stop 
[/  is  the  maximum  flow.] 

Else  continue  (that  is,  go  to  Step  5). 

5.  Backtrack  the  path  P,  using  the  labels. 

6.  Using  P,  augment  the  existing  flow  by  At.  Set  / = / + A t. 

7.  Remove  all  labels  from  vertices  2,  ■ ■ • , n.  Go  to  Step  3. 
End  FORD-FULKERSON 


EXAMPLE  Ford-Fulkerson  Algorithm 

Applying  the  Ford-Fulkerson  algorithm,  determine  the  maximum  flow  for  the  network  in  Fig.  500  (which  is 
the  same  as  that  in  Example  1,  Sec.  23.6,  so  that  we  can  compare). 

Solution.  The  algorithm  proceeds  as  follows. 

1.  An  initial  flow  / = 9 is  given. 

2.  Label  s (=  1)  by  0.  Mark  2,  3,  4,  5,  6 “unlabeled.” 


Fig.  500.  Network  in  Example  1 with  capacities  (first  numbers)  and  given  flow 

3.  Scan  1. 

Compute  A12  = 20  — 5 = 15  = A2.  Label  2 by  (1+,  15). 

Compute  A14  = 10  — 4 = 6 = A4.  Label  4 by  (1+,  6). 

4.  Scan  2. 

Compute  A23  =11  — 8 = 3,  A3  = min  (A2,  3)  = 3.  Label  3 by  (2+,  3). 

Compute  A5  = min  (A2,  3)  = 3.  Label  5 by  ( 2~ , 3). 

Scan  3. 

Compute  A36  = 13  — 6 = 7,  Ag  = At  = min  (A3,  7)  = 3.  Label  6 by  (3+,  3). 

5.  P:  1 — 2 — 3 — 6 (=  t)  is  a flow  augmenting  path. 

6.  At  = 3.  Augmentation  gives  /12  = 8,  f'23  = 11,  f%6  = 9,  other  fy  unchanged.  Augmented  flow 
/ = 9 + 3 = 12. 

7.  Remove  labels  on  vertices  2,  • • • , 6.  Go  to  Step  3. 

3.  Scan  1. 

Compute  A 12  = 20  — 8 = 12  = A2.  Label  2 by  (1+,  12). 

Compute  A14  = 10  — 4 = 6 = A4.  Label  4 by  (1+,  6). 
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4.  Scan  2. 

Compute  A5  = min  (A2,  3)  — 3.  Label  5 by  (2-,  3). 

Scan  4.  [No  vertex  left  for  labeling .] 

Scan  5. 

Compute  A3  = min  (A5,  2)  = 2.  Label  3 by  (5-,  2). 

Scan  3. 

Compute  A36  = 13  — 9 = 4,  A6  = min  (A3,  4)  = 2.  Label  6 by  (3+,  2). 

5.  P:  1 — 2 — 5 — 3 — 6 (=  t)  is  a flow  augmenting  path. 

6.  At  = 2.  Augmentation  gives  /12  — 10,  f^2  — L/35  — ®,f36  = 11,  other  unchanged.  Augmented 
flow  f = 12  + 2 = 14. 

7.  Remove  labels  on  vertices  2,  • • • , 6.  Go  to  Step  3. 

One  can  now  scan  1 and  then  scan  2,  as  before,  but  in  scanning  4 and  then  5 one  finds  that  no  vertex  is  left  for 
labeling.  Thus  one  can  no  longer  reach  t.  Hence  the  flow  obtained  (Fig.  501)  is  maximum,  in  agreement  with 
our  result  in  the  last  section. 


gBEg^^M=5IT~n=gE 


1.  Do  the  computations  indicated  near  the  end  of  Exam- 
ple 1 in  detail. 

2.  Solve  Example  1 by  Ford-Fulkerson  with  initial  flow  0. 
Is  it  more  work  than  in  Example  1? 

3.  Which  are  the  “bottleneck”  edges  by  which  the  flow  in 
Example  1 is  actually  limited?  Hence  which  capacities 
could  be  decreased  without  decreasing  the  maximum 
flow? 

4.  What  is  the  (simple)  reason  that  Kirchhoff’s  law  is 
preserved  in  augmenting  a flow  by  the  use  of  a flow 
augmenting  path? 

5.  How  does  Ford-Fulkerson  prevent  the  formation  of 
cycles? 

MAXIMUM  FLOW 

Find  the  maximum  flow  by  Ford-Fulkerson: 

6.  In  Prob.  12,  Sec.  23.6 

7.  In  Prob.  15,  Sec.  23.6 

8.  In  Prob.  14,  Sec.  23.6 


10.  Integer  flow  theorem.  Prove  that,  if  the  capacities  in 
a network  G are  integers,  then  a maximum  flow  exists 
and  is  an  integer. 

11.  CAS  PROBLEM.  Ford-Fulkerson.  Write  a program 
and  apply  it  to  Probs.  6-9. 

12.  How  can  you  see  that  Ford-Fulkerson  follows  a BFS 
technique? 

13.  Are  the  consecutive  flow  augmenting  paths  produced 
by  Ford-Fulkerson  unique? 

14.  If  the  Ford-Fulkerson  algorithm  stops  without  reach- 
ing t,  show  that  the  edges  with  one  end  labeled  and  the 
other  end  unlabeled  form  a cut  set  (5,  T)  whose  capacity 
equals  the  maximum  flow. 

15.  Find  a minimum  cut  set  in  Fig.  500  and  its  capacity. 

16.  Show  that  in  a network  G with  all  Cy  = 1 , the  maximum 
flow  equals  the  number  of  edge-disjoint  paths  s — » t. 

17.  In  Prob.  15,  the  cut  set  contains  precisely  all  forward 
edges  used  to  capacity  by  the  maximum  flow  (Fig.  501). 
Is  this  just  by  chance? 

18.  Show  that  in  a network  G with  capacities  all  equal  to  1, 
the  capacity  of  a minimum  cut  set  (5,  T)  equals  the 
minimum  number  q of  edges  whose  deletion  destroys 
all  directed  paths  s — * t.  (A  directed  path  u — » w is  a 
path  in  which  each  edge  has  the  direction  in  which  it  is 
traversed  in  going  from  v to  w.) 
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19.  Several  sources  and  sinks.  If  a network  has  several 
sources  ■ ■ ■ , s^,  show  that  it  can  be  reduced  to  the 
case  of  a single-source  network  by  introducing  a new 
vertex  s and  connecting  s to  Si,  ■ ■ ■ , by  k edges  of 
capacity  °o.  Similarly  if  there  are  several  sinks.  Illustrate 
this  idea  by  a network  with  two  sources  and  two  sinks. 

20.  Find  the  maximum  flow  in  the  network  in  Fig.  502  with 
two  sources  (factories)  and  two  sinks  (consumers). 


23.8  Bipartite  Graphs.  Assignment  Problems 

From  digraphs  we  return  to  graphs  and  discuss  another  important  class  of  combinatorial 
optimization  problems  that  arises  in  assignment  problems  of  workers  to  jobs,  jobs  to 
machines,  goods  to  storage,  ships  to  piers,  classes  to  classrooms,  exams  to  time  periods, 
and  so  on.  To  explain  the  problem,  we  need  the  following  concepts. 

A bipartite  graph  G = ( V , E ) is  a graph  in  which  the  vertex  set  V is  partitioned  into 
two  sets  S and  T (without  common  elements,  by  the  definition  of  a partition)  such  that 
every  edge  of  G has  one  end  in  S and  the  other  in  T.  Hence  there  are  no  edges  in  G that 
have  both  ends  in  S or  both  ends  in  T.  Such  a graph  G = (V,  E ) is  also  written 
G = ( S , T;  E). 

Figure  503  shows  an  illustration.  V consists  of  seven  elements,  three  workers  a,  b,  c, 
making  up  the  set  S,  and  four  jobs  1,  2,  3,  4,  making  up  the  set  T.  The  edges  indicate  that 
worker  a can  do  the  jobs  1 and  2,  worker  b the  jobs  1,  2,  3,  and  worker  c the  job  4.  The 
problem  is  to  assign  one  job  to  each  worker  so  that  every  worker  gets  one  job  to  do.  This 
suggests  the  next  concept,  as  follows. 


DEFINITION 


Maximum  Cardinality  Matching 

A matching  in  G = (.S',  T;  E)  is  a set  M of  edges  of  G such  that  no  two  of  them 
have  a vertex  in  common.  If  M consists  of  the  greatest  possible  number  of  edges, 
we  call  it  a maximum  cardinality  matching  in  G. 


For  instance,  a matching  in  Fig.  503  is  M i = {(a,  2),  ( b , 1)}.  Another  is  M 2 = {(a,  1), 
( b , 3),  (c,  4)};  obviously,  this  is  of  maximum  cardinality. 


S T 


Fig.  503.  Bipartite  graph  in  the  assignment  of  a set  S = {a,  b,  c} 
of  workers  to  a set  T = {I,  2,  3,  4}  of  jobs 


A vertex  v is  exposed  (or  not  covered)  by  a matching  M if  v is  not  an  endpoint  of  an 
edge  of  M.  This  concept,  which  always  refers  to  some  matching,  will  be  of  interest  when 
we  begin  to  augment  given  matchings  (below).  If  a matching  leaves  no  vertex  exposed, 
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THEOREM  1 


PROOF 


we  call  it  a complete  matching.  Obviously,  a complete  matching  can  exist  only  if  S and 
T consist  of  the  same  number  of  vertices. 

We  now  want  to  show  how  one  can  stepwise  increase  the  cardinality  of  a matching  M 
until  it  becomes  maximum.  Central  in  this  task  is  the  concept  of  an  augmenting  path. 

An  alternating  path  is  a path  that  consists  alternately  of  edges  in  M and  not  in  M 
(Fig.  504A).  An  augmenting  path  is  an  alternating  path  both  of  whose  endpoints  (a  and  b 
in  Fig.  504B)  are  exposed.  By  dropping  from  the  matching  M the  edges  that  are  on  an 
augmenting  path  P (two  edges  in  Fig.  504B)  and  adding  to  M the  other  edges  of  P (three 
in  the  figure),  we  get  a new  matching,  with  one  more  edge  than  M.  This  is  how  we  use 
an  augmenting  path  in  augmenting  a given  matching  by  one  edge.  We  assert  that  this 
will  always  lead,  after  a number  of  steps,  to  a maximum  cardinality  matching.  Indeed, 
the  basic  role  of  augmenting  paths  is  expressed  in  the  following  theorem. 


(A)  Alternating  path 


a 


(B)  Augmenting  path  P 

Fig.  504.  Alternating  and  augmenting  paths. 
Heavy  edges  are  those  belonging  to  a matching  M 


Augmenting  Path  Theorem  for  Bipartite  Matching 

A matching  M in  a bipartite  graph  G = ( S , T;  E)  is  of  maximum  cardinality  if  and 
only  if  there  does  not  exist  an  augmenting  path  P with  respect  to  M. 


(a)  We  show  that  if  such  a path  P exists,  then  M is  not  of  maximum  cardinality.  Let  P have 
q edges  belonging  to  M.  Then  P has  <7+1  edges  not  belonging  to  M.  (In  Fig.  504B  we 
have  q = 2.)  The  endpoints  a and  b of  P are  exposed,  and  all  the  other  vertices  on  P are 
endpoints  of  edges  in  M,  by  the  definition  of  an  alternating  path.  Flence  if  an  edge  of  M is 
not  an  edge  of  P,  it  cannot  have  an  endpoint  on  P since  then  M would  not  be  a matching. 
Consequently,  the  edges  of  M not  on  P,  together  with  the  q + 1 edges  of  P not  belonging 
to  M form  a matching  of  cardinality  one  more  than  the  cardinality  of  M because  we  omitted 
q edges  from  M and  added  <7+1  instead.  Hence  M cannot  be  of  maximum  cardinality. 

(b)  We  now  show  that  if  there  is  no  augmenting  path  for  M,  then  M is  of  maximum 
cardinality.  Let  M*  be  a maximum  cardinality  matching  and  consider  the  graph  H 
consisting  of  all  edges  that  belong  either  to  M or  to  M*,  but  not  to  both.  Then  it  is  possible 
that  two  edges  of  H have  a vertex  in  common,  but  three  edges  cannot  have  a vertex  in 
common  since  then  two  of  the  three  would  have  to  belong  to  M (or  to  M*),  violating  that 
M and  M*  are  matchings.  So  every  v in  V can  be  in  common  with  two  edges  of  H or  with 
one  or  none.  Hence  we  can  characterize  each  “component”  (=  maximal  connected  subset) 
of  H as  follows. 

(A)  A component  of  H can  be  a closed  path  with  an  even  number  of  edges  (in  the  case 
of  an  odd  number,  two  edges  from  M or  two  from  M*  would  meet,  violating  the  matching 
property).  See  (A)  in  Fig.  505. 
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(B)  A component  of  H can  be  an  open  path  P with  the  same  number  of  edges  from  M 
and  edges  from  M*,  for  the  following  reason.  P must  be  alternating,  that  is,  an  edge  of 
M is  followed  by  an  edge  of  M*,  etc.  (since  M and  M*  are  matchings).  Now  if  P had  an 
edge  more  from  M*,  then  P would  be  augmenting  for  M [see  (B2)  in  Fig.  505], 
contradicting  our  assumption  that  there  is  no  augmenting  path  for  M.  If  P had  an  edge 
more  from  M,  it  would  be  augmenting  for  M*  [see  (B3)  in  Fig.  505],  violating  the 
maximum  cardinality  of  M*,  by  part  (a)  of  this  proof.  Hence  in  each  component  of  H,  the 
two  matchings  have  the  same  number  of  edges.  Adding  to  this  the  number  of  edges  that 
belong  to  both  M and  M*  (which  we  left  aside  when  we  made  up  H ),  we  conclude  that 
M and  M*  must  have  the  same  number  of  edges.  Since  M*  is  of  maximum  cardinality, 
this  shows  that  the  same  holds  for  M,  as  we  wanted  to  prove. 


(A) 


Edge  from  M 
■ Edge  from  M* 


(Bl) 

(B2) 

(B3) 


(Possible) 
(Augmenting  for  M) 
(Augmenting  for  M*) 


Fig.  505.  Proof  of  the  augmenting  path  theorem  for  bipartite  matching 


This  theorem  suggests  the  algorithm  in  Table  23.9  for  obtaining  augmenting  paths,  in 
which  vertices  are  labeled  for  the  purpose  of  backtracking  paths.  Such  a label  is  in 
addition  to  the  number  of  the  vertex,  which  is  also  retained.  Clearly,  to  get  an  augmenting 
path,  one  must  start  from  an  exposed  vertex,  and  then  trace  an  alternating  path  until  one 
arrives  at  another  exposed  vertex.  After  Step  3 all  vertices  in  S are  labeled.  In  Step  4, 
the  set  T contains  at  least  one  exposed  vertex,  since  otherwise  we  would  have  stopped 
at  Step  1. 

Table  23.9  Bipartite  Maximum  Cardinality  Matching 

ALGORITHM  MATCHING  [G  = ( S , T;  E),  M,  n] 

This  algorithm  determines  a maximum  cardinality  matching  M in  a bipartite  graph  G by 
augmenting  a given  matching  in  G. 

INPUT:  Bipartite  graph  G = (S,  T;  E)  with  vertices  1 matching  M in  G (for 
instance,  M — 0) 

OUTPUT:  Maximum  cardinality  matching  M in  G 

1.  If  there  is  no  exposed  vertex  in  S then 

OUTPUT  M.  Stop 

[M  is  of  maximum  cardinality  in  G.] 

Else  label  all  exposed  vertices  in  S with  0. 

2.  For  each  i in  S and  edge  (i,  j ) not  in  M,  label  j with  i,  unless  already  labeled. 


1004 


CHAP.  23  Graphs.  Combinatorial  Optimization 


3.  For  each  nonexposed  j in  T,  label  i with  j,  where  i is  the  other  end 

of  the  unique  edge  ( i , j)  in  M. 

4.  Backtrack  the  alternating  path  P ending  on  an  exposed  vertex  in  T 

by  using  the  labels  on  the  vertices. 

5.  If  no  P in  Step  4 is  augmenting  then 

OUTPUT  M.  Stop 

[M  is  of  maximum  cardinality  in  G.] 

Else  augment  M by  using  an  augmenting  path  P. 

Remove  all  labels. 

Go  to  Step  1. 

End  MATCHING 


Maximum  Cardinality  Matching 

Is  the  matching  Mi  in  Fig.  506a  of  maximum  cardinality?  If  not,  augment  it  until  maximum  cardinality  is  reached. 


s 

T 

s 

T 

0(T 

1 ( |)3 

I ill ) J 

I^U J J 

/ l o 

0(4 

r \D3 

0(4 J 

vD3 

(a) 

Given  graph 

(b)  Matching  Af2 

and  matching  M1 

and  new  labels 

Fig.  506. 

Example  1 

Solution.  We  apply  the  algorithm. 

1.  Label  1 and  4 with  0. 

2.  Label  7 with  1.  Label  5,  6,  8 with  3. 

3.  Label  2 with  6,  and  3 with  7. 

[All  vertices  are  now  labeled  as  shown  in  Fig.  506a.] 

4.  Pi.  1 — 7 — 3 — 5.  [By  backtracking.  Pi  is  augmenting .] 

P2:  1 — 7 — 3 — 8.  [P2  is  augmenting.] 

5.  Augment  Mi  by  using  l\ , dropping  (3,  7)  from  Mi  and  including  (1,  7)  and  (3,  5).  Remove  all  labels. 
Go  to  Step  1. 

Figure  506b  shows  the  resulting  matching  M2  = 1(1,  7),  (2,  6),  (3,  5)}. 

1.  Label  4 with  0. 

2.  Label  7 with  2.  Label  6 and  8 with  3. 

3.  Label  1 with  7,  and  2 with  6,  and  3 with  5. 

4.  P3:  5 — 3 — 8.  [P3  is  alternating  but  not  augmenting.] 

5.  Stop.  M2  is  of  maximum  cardinality  (namely,  3). 
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FRQB-gE^=S^T— 


1-7 


BIPARTITE  OR  NOT? 


If  you  answer  is  yes,  find  S and 


T: 


- -® 

-4) 

8.  Can  you  obtain  the  answer  to  Prob.  3 from  that  to 
Prob.  1? 

9.  Can  you  obtain  a bipartite  subgraph  in  Prob.  4 by 
omitting  two  edges?  Any  two  edges?  Any  two  edges 
without  a common  vertex? 


10-12 


MATCHING.  AUGMENTING  PATHS 


Find  an  augmenting  path: 


12. 


13-15 


MAXIMUM  CARDINALITY  MATCHING 


Using  augmenting  paths,  find  a maximum  cardinality 
matching: 


13.  In  Prob.  1 1 

14.  In  Prob.  10 

15.  In  Prob.  12 

16.  Complete  bipartite  graphs.  A bipartite  graph 
G = (S',  T;  E)  is  called  complete  if  every  vertex  in  S is 
joined  to  every  vertex  in  T by  an  edge,  and  is  denoted 
by  Ky tl„2,  where  n i and  n2  are  the  numbers  of  vertices 
in  S and  T,  respectively.  How  many  edges  does  this 
graph  have? 

17.  Planar  graph.  A planar  graph  is  a graph  that  can  be 
drawn  on  a sheet  of  paper  so  that  no  two  edges  cross. 
Show  that  the  complete  graph  K 4 with  four  vertices  is 
planar.  The  complete  graph  K5  with  five  vertices  is  not 
planar.  Make  this  plausible  by  attempting  to  draw  K5 
so  that  no  edges  cross.  Interpret  the  result  in  terms  of 
a net  of  roads  between  five  cities. 


18.  Bipartite  graph  KS  3 not  planar.  Three  factories  1, 
2,  3 are  each  supplied  underground  by  water,  gas,  and 
electricity,  from  points  A,  B,  C,  respectively.  Show  that 
this  can  be  represented  by  ^3  3 (the  complete  bipartite 
graph  G = (5,  T ; E)  with  S and  T consisting  of  three 
vertices  each)  and  that  eight  of  the  nine  supply  lines 
(edges)  can  be  laid  out  without  crossing.  Make  it 
plausible  that  ^33  is  not  planar  by  attempting  to  draw 
the  ninth  line  without  crossing  the  others. 


19-25 


VERTEX  COLORING 


19.  Vertex  coloring  and  exam  scheduling.  What  is  the 
smallest  number  of  exam  periods  for  six  subjects  a,  b , 
c,  d,  e,/if  some  of  the  students  simultaneously  take  a, 
b,  f some  c,  d,  e,  some  a,  c,  e,  and  some  c,  e?  Solve 
this  as  follows.  Sketch  a graph  with  six  vertices  a,  ■ ■ ■ ,/ 
and  join  vertices  if  they  represent  subjects  simul- 
taneously taken  by  some  students.  Color  the  vertices 
so  that  adjacent  vertices  receive  different  colors.  (Use 
numbers  1,2,---  instead  of  actual  colors  if  you  want.) 
What  is  the  minimum  number  of  colors  you  need?  For 
any  graph  G,  this  minimum  number  is  called  the 
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(vertex)  chromatic  number  XV(G).  Why  is  this  the 
answer  to  the  problem?  Write  down  a possible 
schedule. 

20.  Scheduling  and  matching.  Three  teachers  x1,x2,x3 
teach  four  classes  ylt  y2,  y3,  V4  for  these  numbers  of 
periods: 


yi 

T2 

>’3 

JA 

*1 

1 

0 

1 

1 

*2 

1 

1 

1 

1 

*3 

0 

1 

1 

1 

Show  that  this  arrangement  can  be  represented  by  a 
bipartite  graph  G and  that  a teaching  schedule  for  one 
period  corresponds  to  a matching  in  G.  Set  up  a 
teaching  schedule  with  the  smallest  possible  number  of 
periods. 

21.  How  many  colors  do  you  need  for  vertex  coloring  any 
tree? 

22.  Harbor  management.  How  many  piers  does  a harbor 
master  need  for  accommodating  six  cruise  ships 
5i,  ■ ■ • , 5g  with  expected  dates  of  arrival  A and  departure 
D in  July,  (A,  D)  = (10,  13),  (13,15),  (14,17), 
(12,  15),  (16,  18),  (14,  17),  respectively,  if  each  pier  can 


accommodate  only  one  ship,  arrival  being  at  6 am  and 
departures  at  1 1 pm?  Hint.  Join  Si  and  Sj  by  an  edge  if 
their  intervals  overlap.  Then  color  vertices. 

23.  What  would  be  the  answer  to  Prob.  22  if  only  the  five 
ships  Si,  ■ ■ ■ , S5  had  to  be  accommodated? 

24.  Four-  (vertex)  color  theorem.  The  famous  four-color 
theorem  states  that  one  can  color  the  vertices  of  any 
planar  graph  (so  that  adjacent  vertices  get  different 
colors)  with  at  most  four  colors.  It  had  been  conjectured 
for  a long  time  and  was  eventually  proved  in  1976  by 
Appel  and  Haken  [Illinois  J.  Math  21  (1977),  429-567]. 
Can  you  color  the  complete  graph  with  four  colors? 
Does  the  result  contradict  the  four-color  theorem?  (For 
more  details,  see  Ref.  [FI]  in  App.  1.) 

25.  Find  a graph,  as  simple  as  possible,  that  cannot  be 
vertex  colored  with  three  colors.  Why  is  this  of  interest 
in  connection  with  Prob.  24? 

26.  Edge  coloring.  The  edge  chromatic  number  X,,(G)  of 
a graph  G is  the  minimum  number  of  colors  needed  for 
coloring  the  edges  of  G so  that  incident  edges  get 
different  colors.  Clearly,  Xe(G)  2 max  d(u),  where  d(u ) 
is  the  degree  of  vertex  u.  If  G = (5,  T;  E ) is  bipartite, 
the  equality  sign  holds.  Prove  this  for  Knn  the  complete 
(cf.  Sec.  23.1)  bipartite  graph  G = (S,  T , E)  with  S and 
T consisting  of  n vertices  each. 


EffieraEraE2EEBS3aS30E^ffiSTIONS  AND  PROBLEMS 


1.  What  is  a graph,  a digraph,  a cycle,  a tree? 

2.  State  some  typical  problems  that  can  be  modeled  and 
solved  by  graphs  or  digraphs. 

3.  State  from  memory  how  graphs  can  be  handled  on 
computers. 

4.  What  is  a shortest  path  problem?  Give  applications. 

5.  What  situations  can  be  handled  in  terms  of  the  traveling 
salesman  problem? 

6.  Give  typical  applications  involving  spanning  trees. 

7.  What  are  the  basic  ideas  and  concepts  in  handling  flows? 

8.  What  is  combinatorial  optimization?  Which  sections  of 
this  chapter  involved  it?  Explain  details. 

9.  Define  bipartite  graphs  and  describe  some  typical 
applications  of  them. 

10.  What  is  BFS?  DFS?  In  what  connection  did  these 
concepts  occur? 
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MATRICES  FOR  GRAPHS  AND  DIGRAPHS 


Find  the  adjacency  matrix  of: 

11. 


14-16 


Sketch  the  graph  whose  adjacency  matrix  is: 


~o 

1 

1 
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1 

0 

1 

1 

14. 

1 

1 

0 

1 

15. 

1 

1 

1 

0 

1 

1 

1 

0 


0 

1 


1 

1 


1 1 1 
0 0 1 
0 0 1 
1 1 0 


17.  Vertex  incidence  list.  Make  it  for  the  graph  in  Prob.  15. 


4 
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Summary  of  Chapter  23 
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18.  Find  a shortest  path  and  its  length  by  Moore’s  BFS 
algorithm,  assuming  that  all  the  edges  have  length  1. 


19.  Find  shortest  paths  by  Dijkstra’s  algorithm. 


20.  Find  a shortest  spanning  tree. 


21.  Company  A has  offices  in  Chicago,  Los  Angeles,  and 
New  York;  Company  B in  Boston  and  New  York; 
Company  C in  Chicago,  Dallas,  and  Los  Angeles. 
Represent  this  by  a bipartite  graph. 


22.  Find  flow  augmenting  paths  and  the  maximum  flow. 


23.  Using  augmenting  paths,  find  a maximum  cardinality 
matching. 


24.  Find  an  augmenting  path, 


SUMMARY  Of  CHAPTER  23 

Graphs.  Combinatorial  Optimization 


Combinatorial  optimization  concerns  optimization  problems  of  a discrete  or 
combinatorial  structure.  It  uses  graphs  and  digraphs  (Sec.  23.1)  as  basic  tools. 

A graph  G = (V,  E ) consists  of  a set  V of  vertices  V\,  l>2,  • ■ • , vn  (often  simply 
denoted  by  1,  2,  • • • , n)  and  a set  E of  edges  e±,  e^,  ■ ■ ■ , em,  each  of  which  connects 
two  vertices.  We  also  write  (;,  j)  for  an  edge  with  vertices  i and  j as  endpoints.  A 
digraph  (=  directed  graph)  is  a graph  in  which  each  edge  has  a direction  (indicated 
by  an  arrow).  For  handling  graphs  and  digraphs  in  computers,  one  can  use  matrices 
or  lists  (Sec.  23.1). 

This  chapter  is  devoted  to  important  classes  of  optimization  problems  for  graphs 
and  digraphs  that  all  arise  from  practical  applications,  and  corresponding  algorithms, 
as  follows. 
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In  a shortest  path  problem  (Sec.  23.2)  we  determine  a path  of  minimum  length 
(consisting  of  edges)  from  a vertex  stoa  vertex  t in  a graph  whose  edges  (i,  j ) have 
a “length”  Zy  > 0,  which  may  be  an  actual  length  or  a travel  time  or  cost  or  an 
electrical  resistance  [if  (i,  j ) is  a wire  in  a net],  and  so  on.  Dijkstra’s  algorithm 
(Sec.  23.3)  or,  when  all  ll7  = 1,  Moore’s  algorithm  (Sec.  23.2)  are  suitable  for  these 
problems. 

A tree  is  a graph  that  is  connected  and  has  no  cycles  (no  closed  paths).  Trees  are 
very  important  in  practice.  A spanning  tree  in  a graph  G is  a tree  containing  all  the 
vertices  of  G.  If  the  edges  of  G have  lengths,  we  can  determine  a shortest  spanning 
tree,  for  which  the  sum  of  the  lengths  of  all  its  edges  is  minimum,  by  Kruskal’s 
algorithm  or  Prim’s  algorithm  (Secs.  23.4,  23.5). 

A network  (Sec.  23.6)  is  a digraph  in  which  each  edge  ( i , j)  has  a capacity 
Gj  > 0 [=  maximum  possible  flow  along  (/',  /)]  and  at  one  vertex,  the  source  s,  a 
flow  is  produced  that  flows  along  the  edges  to  a vertex  t,  the  sink  or  target,  where 
the  flow  disappears.  The  problem  is  to  maximize  the  flow,  for  instance,  by  applying 
the  Ford-Fulkerson  algorithm  (Sec.  23.7),  which  uses  flow  augmenting  paths 
(Sec.  23.6).  Another  related  concept  is  that  of  a cut  set,  as  defined  in  Sec.  23.6. 

A bipartite  graph  G = (V,  E)  (Sec.  23.8)  is  a graph  whose  vertex  set  V consists 
of  two  parts  S and  T such  that  every  edge  of  G has  one  end  in  S and  the  other  in  T, 
so  that  there  are  no  edges  connecting  vertices  in  S or  vertices  in  T.  A matching  in 
G is  a set  of  edges,  no  two  of  which  have  an  endpoint  in  common.  The  problem 
then  is  to  find  a maximum  cardinality  matching  in  G,  that  is,  a matching  M that 
has  a maximum  number  of  edges.  For  an  algorithm,  see  Sec.  23.8. 


PART 


G 


Probability, 

Statistics 


CHAPTER  24  Data  Analysis.  Probability  Theory 

CHAPTER  25  Mathematical  Statistics 

Probability  theory  (Chap.  24)  provides  models  of  probability  distributions  (theoretical 
models  of  the  observable  reality  involving  chance  effects)  to  be  tested  by  statistical  methods, 
and  it  will  also  supply  the  mathematical  foundation  of  these  methods  in  Chap.  25. 

Modern  mathematical  statistics  (Chap.  25)  has  various  engineering  applications,  for 
instance,  in  testing  materials,  control  of  production  processes,  quality  control  of  production 
outputs,  performance  tests  of  systems,  robotics,  and  automatization  in  general,  production 
planning,  marketing  analysis,  and  so  on. 

To  this  we  could  add  a long  list  of  fields  of  applications,  for  instance,  in  agriculture, 
biology,  computer  science,  demography,  economics,  geography,  management  of  natural 
resources,  medicine,  meteorology,  politics,  psychology,  sociology,  traffic  control,  urban 
planning,  etc.  Although  these  applications  are  very  heterogeneous,  we  shall  see  that  most 
statistical  methods  are  universal  in  the  sense  that  each  of  them  can  be  applied  in  various 
fields. 


Additional  Software  for 
Probability  and  Statistics 


See  also  the  list  of  software  at  the  beginning  of  Part  E on  Numerical  Analysis. 

Data  Desk.  Data  Description,  Inc.,  Ithaca,  NY.  Phone  1-800-573-5121  or  (607)  257-1000, 
website  at  www.datadesk.com. 
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MINITAB.  Minitab,  Inc.,  State  College,  PA.  Phone  1-800-448-3555  or  (814)  238-3280, 
website  at  www.minitab.com. 

SAS.  SAS  Institute,  Inc.,  Cary,  NC.  Phone  1-800-727-0025  or  (919)  677-8000,  website 
at  www.sas.com. 

R.  website  at  www.r-project.org.  Free  software,  part  of  the  GNU/Free  Software  Foundation 
project. 

SPSS.  SPSS,  Inc.,  Chicago,  IL.  (part  of  IBM)  Phone  1-800-543-2185  or  (312)  651-3000, 
website  at  www.spss.com. 

STATISTICA.  StatSoft,  Inc.,  Tulsa,  OK.  Phone  (918)  749-1119,  website  at 
www.statsoft.com. 

TIBCO  Spotfire  S+.  TIBCO  Software  Inc.,  Palo  Alto,  CA;  Office  for  this  software: 
Somerville,  MA.  Phone  1-866-240-0491  (toll-free),  (617)  702-1602,  website  at  spotfire. 
tibco.com/products/s-plus/statistical-analysis-software.aspx 
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Data  Analysis. 
Probability  Theory 


We  first  show  how  to  handle  data  numerically  or  in  terms  of  graphs,  and  how  to  extract 
information  (average  size,  spread  of  data,  etc.)  from  them.  If  these  data  are  influenced  by 
“chance,”  by  factors  whose  effect  we  cannot  predict  exactly  (e.g.,  weather  data,  stock 
prices,  life  spans  of  tires,  etc.),  we  have  to  rely  on  probability  theory.  This  theory 
originated  in  games  of  chance,  such  as  flipping  coins,  rolling  dice,  or  playing  cards. 
Nowadays  it  gives  mathematical  models  of  chance  processes  called  random  experiments 
or,  briefly,  experiments.  In  such  an  experiment  we  observe  a random  variable  X,  that 
is,  a function  whose  values  in  a trial  (a  performance  of  an  experiment)  occur  “by  chance” 
(Sec.  24.3)  according  to  a probability  distribution  that  gives  the  individual  probabilities 
with  which  possible  values  of  X may  occur  in  the  long  run.  (Example:  Each  of  the  six 
faces  of  a die  should  occur  with  the  same  probability,  1/6.)  Or  we  may  simultaneously 
observe  more  than  one  random  variable,  for  instance,  height  and  weight  of  persons  or 
hardness  and  tensile  strength  of  steel.  This  is  discussed  in  Sec.  24.9,  which  will  also  give 
the  basis  for  the  mathematical  justification  of  the  statistical  methods  in  Chapter  25. 

Prerequisite:  Calculus. 

References  and  Answers  to  Problems:  App.  1 Part  G,  App.  2. 


24.1  Data  Representation.  Average.  Spread 

Data  can  be  represented  numerically  or  graphically  in  various  ways.  For  instance,  your 
daily  newspaper  may  contain  tables  of  stock  prices  and  money  exchange  rates,  curves  or 
bar  charts  illustrating  economical  or  political  developments,  or  pie  charts  showing  how 
your  tax  dollar  is  spent.  And  there  are  numerous  other  representations  of  data  for  special 
purposes. 

In  this  section  we  discuss  the  use  of  standard  representations  of  data  in  statistics.  (For 
these,  software  packages,  such  as  DATA  DESK,  R,  and  MINITAB,  are  available,  and 
Maple  or  Mathematica  may  also  be  helpful;  see  pp.  789  and  1009)  We  explain  corresponding 
concepts  and  methods  in  terms  of  typical  examples. 

EXAM  Recording  and  Sorting 

Sample  values  (observations,  measurements)  should  be  recorded  in  the  order  in  which  they  occur.  Sorting,  that 

is,  ordering  the  sample  values  by  size,  is  done  as  a first  step  of  investigating  properties  of  the  sample  and  graphing 

it.  Sorting  is  a standard  process  on  the  computer;  see  Ref.  [E35],  listed  in  App.  1. 
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EXAMPLE  2 


EXAMPLE  3 


Super  alloys  is  a collective  name  for  alloys  used  in  jet  engines  and  rocket  motors,  requiring  high  temperature 
(typically  1800°F),  high  strength,  and  excellent  resistance  to  oxidation.  Thirty  specimens  of  Hastelloy  C (nickel- 
based  steel,  investment  cast)  had  the  tensile  strength  (in  1000  lb/sq  in.),  recorded  in  the  order  obtained  and 
rounded  to  integer  values, 

89  77  88  91  88  93  99  79  87  84  86  82  88  89  78 

(1) 

90  91  81  90  83  83  92  87  89  86  89  81  87  84  89 


Sorting  gives 


(2) 


77 

78 

79 

81 

81 

82 

83 

83 

84 

84 

86 

86 

87 

87 

87 

88 

88 

88 

89 

89 

89 

89 

89 

90 

90 

91 

91 

92 

93 

99 

Graphic  Representation  of  Data 

We  shall  now  discuss  standard  graphic  representations  used  in  statistics  for  obtaining 
information  on  properties  of  data. 

Stem-and-Leaf  Plot  (Fig.  507) 

This  is  one  of  the  simplest  but  most  useful  representations  of  data.  For  (1)  it  is  shown  in  Fig.  507.  The  numbers 
in  (1)  range  from  78  to  99;  see  (2).  We  divide  these  numbers  into  5 groups,  75-79,  80-84,  85-89,  90-94, 
95-99.  The  integers  in  the  tens  position  of  the  groups  are  7,  8,  8,  9,  9.  These  form  the  stem  in  Fig.  507.  The 
first  leaf  is  789,  representing  77,  78,  79.  The  second  leaf  is  1123344,  representing  81,  81,  82,  83,  83,  84,  84. 
And  so  on. 

The  number  of  times  a value  occurs  is  called  its  absolute  frequency.  Thus  78  has  absolute  frequency  1,  the 
value  89  has  absolute  frequency  5,  etc.  The  column  to  the  extreme  left  in  Fig.  507  shows  the  cumulative  absolute 
frequencies,  that  is,  the  sum  of  the  absolute  frequencies  of  the  values  up  to  the  line  of  the  leaf.  Thus,  the  number 
10  in  the  second  line  on  the  left  shows  that  (1)  has  10  values  up  to  and  including  84.  The  number  23  in  the  next 
line  shows  that  there  are  23  values  not  exceeding  89,  etc.  Dividing  the  cumulative  absolute  frequencies  by 
n (=  30  in  Fig.  507)  gives  the  cumulative  relative  frequencies  0.1,  0.33,  0.76,  0.93,  1.00. 

Histogram  (Fig.  508) 

For  large  sets  of  data,  histograms  are  better  in  displaying  the  distribution  of  data  than  stem-and-leaf  plots.  The 
principle  is  explained  in  Fig.  508.  (An  application  to  a larger  data  set  is  shown  in  Sec.  25.7).  The  bases  of  the 
rectangles  in  Fig.  508  are  the  x-intervals  (known  as  class  intervals)  74.5-79.5,  79.5-84.5,  84.5-89.5,  89.5-94.5, 
94.5-99.5,  whose  midpoints  (known  as  class  marks)  are  x = 77,  82,  87,  92,  97,  respectively.  The  height  of  a 
rectangle  with  class  mark  x is  the  relative  class  frequency  frei(x),  defined  as  the  number  of  data  values  in  that 
class  interval,  divided  by  n (=  30  in  our  case).  Hence  the  areas  of  the  rectangles  are  proportional  to  these 
relative  frequencies,  0.10,  0.23,  0.43,  0.17,  0.07,  so  that  histograms  give  a good  impression  of  the  distribution 
of  data. 


Leaf  unit  =1.0 


3 

7 

789 

10 

8 

1123344 

23 

8 

6677788899999 

29 

9 

001123 

30 

9 

9 

Fig.  507.  Stem-and-leaf  plot 
of  the  data  in  Example  1 


Fig.  508  Histogram  of  the  data  in 
Example  1 (grouped  as  in  Fig.  507) 


SEC  24.1  Data  Representation.  Average.  Spread 
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Boxplot.  Median.  Interquartile  Range.  Outlier 

A boxplot  of  a set  of  data  illustrates  the  average  size  and  the  spread  of  the  values,  in  many  cases  the  two  most 
important  quantities  characterizing  the  set,  as  follows. 

The  average  size  is  measured  by  the  median,  or  middle  quartile,  qM.  If  the  number  n of  values  of  the  set  is  odd, 
then  qM  is  the  middlemost  of  the  values  when  ordered  as  in  (2).  If  n is  even,  then  qM  is  the  average  of  the  two 
middlemost  values  of  the  ordered  set.  In  (2)  we  have  n = 30  and  thus  qM  = 2(^15  + *16)  = 2 (87  + 88)  = 87.5. 
(In  general,  qM  will  be  a fraction  if  n is  even.) 

The  spread  of  values  can  be  measured  by  the  range  R = xmax  — Amin,  the  largest  value  minus  the  smallest 
one. 

Better  information  on  the  spread  gives  the  interquartile  range  IQR  = qu  ~ 4l-  Here  qu  is  the  middlemost 
value  (or  the  average  of  the  two  middlemost  values)  in  the  data  above  the  median;  and  q j,  is  the  middlemost 
value  (or  the  average  of  the  two  middlemost  values)  in  the  data  below  the  median.  Hence  in  (2)  we  have 
qu  = a 23  = 89,  q^  — a g = 83,  and  IQR  = 89  — 83  = 6. 

The  box  in  Fig.  509  extends  vertically  from  q^  to  qu\  it  has  height  IQR  = 6.  The  vertical  lines  below  and 
above  the  box  extend  from  xmin  = 77  to  xmax  = 99,  so  that  they  show  R = 22. 


100 
95  - 


90  - 


85  - 


80  - 


<*u 

7m 

7l 


75  L 


Data  set  (1) 


Fig.  509.  Boxplot  of  the  data  set  (1) 


The  line  above  the  box  is  suspiciously  long.  This  suggests  the  concept  of  an  outlier,  a value  that  is  more 
than  1.5  times  the  IQR  away  from  either  end  of  the  box;  here  1.5  is  purely  conventional.  An  outlier  indicates 
that  something  might  have  gone  wrong  in  the  data  collection.  In  (2)  we  have  89  + 1.5  IQR  = 98,  and  we  regal'd 
99  as  an  outlier. 


Mean.  Standard  Deviation.  Variance. 

Empirical  Rule 

Medians  and  quartiles  are  easily  obtained  by  ordering  and  counting,  practically  without 
calculation.  But  they  do  not  give  full  information  on  data:  you  can  change  data  values  to 
some  extent  without  changing  the  median.  Similarly  for  the  quartiles. 

The  average  size  of  the  data  values  can  be  measured  in  a more  refined  way  by  the 

mean 


j=i 


n 


Oi  + x2  + ••  • + xn). 


(3) 
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EXAMPLE  5 


This  is  the  arithmetic  mean  of  the  data  values,  obtained  by  taking  their  sum  and  dividing 
by  the  data  size  rt.  Thus  in  (1), 

x = (89  + 77  + • • • + 89)  = ^ = 86.7. 

Every  data  value  contributes,  and  changing  one  of  them  will  change  the  mean. 

Similarly,  the  spread  (variability)  of  the  data  values  can  be  measured  in  a more  refined 
way  by  the  standard  deviation  s or  by  its  square,  the  variance 

1 n 1 

(4)  s2  = 2 (xj  ~ = 7 [<Ti  - xf  + " • + (xn  - xf]. 

n - 1 ^ J n - 1 

Thus,  to  obtain  the  variance  of  the  data,  take  the  difference  Xj  — x of  each  data  value  from 
the  mean,  square  it,  take  the  sum  of  these  n squares,  and  divide  it  by  n — 1 (not  n,  as  we 
motivate  in  Sec.  25.2).  To  get  the  standard  deviation  s,  take  the  square  root  of  sz. 

For  example,  using  x = 260/3,  we  get  for  the  data  (1)  the  variance 

S -29 [(89 3-)  +(77 3-)  + • • • + (89 3-)  J - -87- ~ 23.06 

Hence  the  standard  deviation  is  s = V2006/87  ~ 4.802.  Note  that  the  standard  deviation 
has  the  same  dimension  as  the  data  values  (kg/mm2,  see  at  the  beginning),  which  is  an 
advantage.  On  the  other  hand,  the  variance  is  preferable  to  the  standard  deviation  in 
developing  statistical  methods,  as  we  shall  see  in  Chap.  25. 

CAUTION!  Your  CAS  (Maple,  for  instance)  may  use  l/n  instead  of  l/(n  — 1)  in  (4), 
but  the  latter  is  better  when  n is  small  (see  Sec.  25.2). 

Mean  and  standard  deviation,  introduced  to  give  center  and  spread,  actually  give  much 
more  information  according  to  this  rule. 

Empirical  Rule.  For  any  mound-shaped,  nearly  symmetric  distribution  of  data  the  intervals 
x ± .S',  x ± 2 s,  x ± 3s  contain  about  68%,  95%,  99.7%, 
respectively,  of  the  data  points. 

Empirical  Rule  and  Outliers.  z-Score 

For  (1),  with  x — 86.7  and  s = 4.8,  the  three  intervals  in  the  Rule  are  81.9  x ^ 91.5,  77.1  ^ x ^ 96.3, 
72.3  ^ x ^ 101.1  and  contain  73%  (22  values  remain,  5 are  too  small,  and  5 too  large),  93%  (28  values, 
1 too  small,  and  1 too  large),  and  100%,  respectively. 

If  we  reduce  the  sample  by  omitting  the  outlier  99,  mean  and  standard  deviation  reduce  to  xred  = 86.2,  Are(j  = 4.3, 
approximately,  and  the  percentage  values  become  67%  (5  and  5 values  outside),  93%  (1  and  1 outside),  and  100%. 

Finally,  the  relative  position  of  a value  x in  a set  of  mean  x.  and  standard  deviation  5 can  be  measured  by  the 
z-score 


z(s)  = 


X — X 


s 


This  is  the  distance  of  x from  the  mean  x measured  in  multiples  of  s.  For  instance,  z(83)  = (83  — 86.7)/ 
4.8  = —0.77.  This  is  negative  because  83  lies  below  the  mean.  By  the  Empirical  Rule,  the  extreme  z-values 
are  about  —3  and  3. 
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1-10 


DATA  REPRESENTATIONS 


Represent  the  data  by  a stem-and-leaf  plot,  a histogram,  and 
a boxplot: 


1.  Length  of  nails  [mm] 


19  21  19  20  19  20  21  20 


2.  Phone  calls  per  minute  in  an  office  between  9:00  a.m. 
and  9:10  a.m. 


6642170467 


3.  Systolic  blood  pressure  of  15  female  patients  of  ages 
20-22 

156  158  154  133  141  130  144  137 

151  146  156  138  138  149  139 


4.  Iron  content  [%]  of  15  specimens  of  hermatite  (Fe2C>3) 

72.8  70.4  71.2  69.2  70.3  68.9  71.1  69.8 

71.5  69.7  70.5  71.3  69.1  70.9  70.6 

5.  Weight  of  filled  bags  [g]  in  an  automatic  filling 

203  199  198  201  200  201  201 


6.  Gasoline  consumption  [miles  per  gallon,  rounded]  of 
six  cars  of  the  same  model  under  similar  conditions 

15.0  15.5  14.5  15.0  15.5  15.0 


7.  Release  time  [sec]  of  a relay 


1.3  1.2  1.4  1.5  1.3  1.3  1.4  1.1 

1.6  1.3  1.5  1.1  1.4  1.2  1.3  1.5 


1.5  1.4 

1.4  1.4 


8.  Foundrax  test  of  Brinell  hardness  (2.5  mm  steel  ball, 
62.5  kg  load,  30  sec)  of  20  copper  plates  (values  in 
kg/mm2) 

86  86  87  89  76  85  82  86  87  85 

90  88  89  90  88  80  84  89  90  89 


9.  Efficiency  [%]  of  seven  Voith  Francis  turbines  of 
runner  diameter  2.3  m under  a head  range  of  185  m 


91.8 
10.  -0.51 

11-16 


89.1  89.9  92.5  90.7 

0.12  -0.47  0.95  0.25 

AVERAGE  AND  SPREAD 


91.2 

-0.18 


91.0 

-0.54 


Find  the  mean  and  compare  it  with  the  median.  Find  the 

standard  deviation  and  compare  it  with  the  interquartile  range. 

11.  For  the  data  in  Prob.  1 

12.  For  the  phone  call  data  in  Prob.  2 

13.  For  the  medical  data  in  Prob.  3 

14.  For  the  iron  contents  in  Prob.  4 

15.  For  the  release  times  in  Prob.  7 

16.  For  the  Brinell  hardness  data  in  Prob.  8 

17.  Outlier,  reduced  data.  Calculate  5 for  the  data 

4 1 3 10  2.  Then  reduce  the  data  by  deleting 

the  outlier  and  calculate  5.  Comment. 

18.  Outlier,  reduction.  Do  the  same  tasks  as  in  Prob.  17 
for  the  hardness  data  in  Prob.  8. 

19.  Construct  the  simplest  possible  data  with  x = 100  but 
c[m  — 0.  What  is  the  point  of  this  problem? 

20.  Mean.  Prove  that  x must  always  lie  between  the 
smallest  and  the  largest  data  values. 


24 .2  Experiments,  Outcomes,  Events 


We  now  turn  to  probability  theory.  This  theory  has  the  purpose  of  providing  mathematical 
models  of  situations  affected  or  even  governed  by  “chance  effects,”  for  instance,  in  weather 
forecasting,  life  insurance,  quality  of  technical  products  (computers,  batteries,  steel  sheets, 
etc.),  traffic  problems,  and,  of  course,  games  of  chance  with  cards  or  dice.  And  the  accuracy 
of  these  models  can  be  tested  by  suitable  observations  or  experiments — this  is  a main 
purpose  of  statistics  to  be  explained  in  Chap.  25. 

We  begin  by  defining  some  standard  terms.  An  experiment  is  a process  of  measurement 
or  observation,  in  a laboratory,  in  a factory,  on  the  street,  in  nature,  or  wherever;  so 
“experiment”  is  used  in  a rather  general  sense.  Our  interest  is  in  experiments  that  involve 
randomness,  chance  effects,  so  that  we  cannot  predict  a result  exactly.  A trial  is  a single 
performance  of  an  experiment.  Its  result  is  called  an  outcome  or  a sample  point,  n trials 
then  give  a sample  of  size  n consisting  of  n sample  points.  The  sample  space  S of  an 
experiment  is  the  set  of  all  possible  outcomes. 
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EXAMPLES  1-6 


EXAMPLE  7 


Random  Experiments.  Sample  Spaces 

(1)  Inspecting  a lightbulb.  S = { Defective,  Nondefective } . 

(2)  Rolling  a die.  S = { 1,  2,  3,  4,  5,  6). 

(3)  Measuring  tensile  strength  of  wire.  S the  numbers  in  some  interval. 

(4)  Measuring  copper  content  of  brass.  S:  50%  to  90%,  say. 

(5)  Counting  daily  traffic  accidents  in  New  York.  S the  integers  in  some  interval. 

(6)  Asking  for  opinion  about  a new  car  model.  S = {Like,  Dislike,  Undecided). 


The  subsets  of  S are  called  events  and  the  outcomes  simple  events. 

Events 

In  (2),  events  are  A = { 1,  3,  5)  {"Odd  number”),  B = {2,  4,  6)  {"Even  number”),  C = {5,  6).  etc.  Simple 
events  are  { 1 },  (2),  • ■ ■ , (6).  I 


If,  in  a trial,  an  outcome  a happens  and  a E A (a  is  an  element  of  A),  we  say  that  A 
happens.  For  instance,  if  a die  turns  up  a 3,  the  event  A:  Odd  number  happens.  Similarly, 
if  C in  Example  7 happens  (meaning  5 or  6 turns  up),  then,  say,  D = {4,  5,  6)  happens. 
Also  note  that  S happens  in  each  trial,  meaning  that  some  event  of  S always  happens.  All 
this  is  quite  natural. 


Unions,  Intersections,  Complements  of  Events 

In  connection  with  basic  probability  laws  we  shall  need  the  following  concepts  and  facts 
about  events  (subsets)  A,  B,  C,  • ■ • of  a given  sample  space  S. 

The  union  A U B of  A and  B consists  of  all  points  in  A or  B or  both. 

The  intersection  A D B of  A and  B consists  of  all  points  that  are  in  both  A and  B. 

If  A and  B have  no  points  in  common,  we  write 

A n B = 0 

where  0 is  the  empty  set  (set  with  no  elements)  and  we  call  A and  B mutually  exclusive 
(or  disjoint)  because,  in  a trial,  the  occurrence  of  A excludes  that  of  B (and  conversely) — 
if  your  die  turns  up  an  odd  number,  it  cannot  turn  up  an  even  number  in  the  same  trial. 
Similarly,  a coin  cannot  turn  up  Head  and  Tail  at  the  same  time. 

Complement  Ac  of  A.  This  is  the  set  of  all  the  points  of  S not  in  A.  Thus, 

A n Ac  = 0,  A U Ac  = S. 

In  Example  7 we  have  Ac  = B,  hence  A U Ac  = { 1,  2,  3,  4,  5,  6}  = S. 

Another  notation  for  the  complement  of  A is  A (instead  of  Ac),  but  we  shall  not 
use  this  because  in  set  theory  A is  used  to  denote  the  closure  of  A (not  needed  in 
our  work). 

Unions  and  intersections  of  more  events  are  defined  similarly.  The  union 

m 

U Aj  = Ai  U A2  U • • • U Am 

3 = 1 
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of  events  A1;  • • ■ , Am  consists  of  all  points  that  are  in  at  least  one  Aj.  Similarly  for  the 
union  A1  U A2  U ■ • • of  infinitely  many  subsets  A 1;  A2,  ■ ■ • of  an  infinite  sample  space 
S (that  is,  S consists  of  infinitely  many  points).  The  intersection 

m 

n Aj  — Ai  n a 2 o ■ ■ * n Am 

3 = 1 

of  Ai,  ■ ■ ■ , Am  consists  of  the  points  of  S that  are  in  each  of  these  events.  Similarly  for 
the  intersection  A±  fl  A2  f~l  ■ • • of  infinitely  many  subsets  of  S. 

Working  with  events  can  be  illustrated  and  facilitated  by  Venn  diagrams1  for  showing 
unions,  intersections,  and  complements,  as  in  Figs.  510  and  511,  which  are  typical 
examples  that  give  the  idea. 

EXAMPLE  8 Unions  and  Intersections  of  3 Events 

In  rolling  a die,  consider  the  events 

A:  Number  greater  than?>,  B:  Number  less  than  6,  C:  Even  number. 

Then  A C\  B = { 4,  5},  B D C = {2,  4},  C Pi  A = {4,  6},  A D B D C = {4}.  Can  you  sketch  a Venn  diagram 
of  this?  Furthermore,  A U B = S,  hence  AUfiUC=5  (why?).  H 


Fig.  510.  Venn  diagrams  showing  two  events  A and  B in  a sample  space  S 
and  their  union  A U B (colored)  and  intersection  A H B (colored) 


Fig.  511.  Venn  diagram  for  the  experiment  of  rolling  a die,  showing  S, 
A = (1,  3,  5),  C = (5,  6},  A U C = {1,  3,  5,  6},  A (T  C = {5} 


FRQBL  EM^SiErT^^a 


SAMPLE  SPACES,  EVENTS 

Graph  a sample  space  for  the  experiments: 

1.  Drawing  3 screws  from  a lot  of  right-handed  and  left- 
handed  screws 

2.  Tossing  2 coins 


3.  Rolling  2 dice 

4.  Rolling  a die  until  the  first  Six  appears 

5.  Tossing  a coin  until  the  first  Head  appears 

6.  Recording  the  lifetime  of  each  of  3 lightbulbs 


1JOHN  VENN  (1834—1923),  English  mathematician. 
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7.  Recording  the  daily  maximum  temperature  X and  the 
daily  maximum  air  pressure  Y at  Times  Square  in  New 
York 

8.  Choosing  a committee  of  2 from  a group  of  5 people 

9.  Drawing  gaskets  from  a lot  of  10,  containing  one 
defective  D,  unitil  D is  drawn,  one  at  a time  and 
assuming  sampling  without  replacement,  that  is, 
gaskets  drawn  are  not  returned  to  the  lot.  (More  about 
this  in  Sec.  24.6) 

10.  In  rolling  3 dice,  are  the  events  A:  Sum  divisible  by  3 
and  B:  Sum  divisible  by  5 mutually  exclusive? 

11.  Answer  the  questions  in  Prob.  10  for  rolling  2 dice. 

12.  List  all  8 subsets  of  the  sample  space  S = {a,  b,c}. 

13.  In  Prob.  3 circle  and  mark  the  events  A:  Faces  are  equal, 
B:  Sum  of  faces  less  than  5,  A U B,  A PI  B,AC,  Bc. 

14.  In  drawing  2 screws  from  a lot  of  right-handed  and 
left-handed  screws,  let  A,  B,  C,  D mean  at  a least 

1 right-handed,  at  least  1 left-handed,  2 right-handed, 

2 left-handed,  respectively.  Are  A and  B mutually 
exclusive?  C and  D7 


VENN  DIAGRAMS 

15.  In  connection  with  a trip  to  Europe  by  some  students, 
consider  the  events  P that  they  see  Paris,  G that  they 
have  a good  time,  and  M that  they  run  out  of  money, 
and  describe  in  words  the  events  1,  ■ ■ • , 7 in  the 
diagram. 


G 


16.  Show  that,  by  the  definition  of  complement,  for  any 
subset  A of  a sample  space  S. 

(Ac)c  = A,  Sc  = 0,  0C  = S, 

A U Ac  = 5,  A Cl  Ac  = 0. 

17.  Using  a Venn  diagram,  show  that  A C B if  and  only  if 
AU8=fi. 

18.  Using  a Venn  diagram,  show  that  A C B if  and  only  if 
A Cl  B = A. 

19.  (De  Morgan’s  laws)  Using  Venn  diagrams,  graph  and 
check  De  Morgan ’s  laws 

(A  U B)c  = Ac  Cl  Bc 
(A  n Bf  = Ac  U Bc. 

20.  Using  Venn  diagrams,  graph  and  check  the  rules 

A U (B  (T  O = (A  U B)  Cl  (A  U Q 
a n (B  u o = (A  n B)  u (A  n q. 


24.3  Probability 

The  “probability”  of  an  event  A in  an  experiment  is  supposed  to  measure  how  frequently 
A is  about  to  occur  if  we  make  many  trials.  If  we  flip  a coin,  then  heads  H and  tails  T 
will  appear  about  equally  often — we  say  that  H and  T are  “equally  likely.”  Similarly,  for 
a regularly  shaped  die  of  homogeneous  material  (“fair  die”)  each  of  the  six  outcomes 
1,  • • • , 6 will  be  equally  likely.  These  are  examples  of  experiments  in  which  the  sample 
space  S consists  of  finitely  many  outcomes  (points)  that  for  reasons  of  some  symmetry 
can  be  regarded  as  equally  likely.  This  suggests  the  following  definition. 


DEFINITION  1 


First  Definition  of  Probability 

If  the  sample  space  S of  an  experiment  consists  of  finitely  many  outcomes  (points) 
that  are  equally  likely,  then  the  probability  P(A)  of  an  event  A is 


(1) 


P(A) 


Number  of  points  in  A 
Number  of  points  in  S 
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EXAMPLE  1 


From  this  definition  it  follows  immediately  that,  in  particular, 
(2)  P(S)  = 1. 


Fair  Die 

In  rolling  a fair  die  once,  what  is  the  probability  P(A ) of  A of  obtaining  a 5 or  a 6?  The  probability  of  B:  “Even 
number ”? 

Solution.  The  six  outcomes  are  equally  likely,  so  that  each  has  probability  1/6.  Thus  P(A)  = 2/6  = 1/3 
because  A = (5,  6}  has  2 points,  and  P(B ) = 3/6  = 1/2. 

Definition  1 takes  care  of  many  games  as  well  as  some  practical  applications,  as  we  shall 
see,  but  certainly  not  of  all  experiments,  simply  because  in  many  problems  we  do  not 
have  finitely  many  equally  likely  outcomes.  To  arrive  at  a more  general  definition  of 
probability,  we  regard  probability  as  the  counterpart  of  relative  frequency.  Recall  from 
Sec.  24.1  that  the  absolute  frequency /(A)  of  an  event  A in  n trials  is  the  number  of  times 
A occurs,  and  the  relative  frequency  of  A in  these  trials  is  f{A)/n;  thus 


(3) 


frel(A) 


m 

n 


Number  of  times  A occurs 
Number  of  trials 


Now  if  A did  not  occur,  then /(A)  = 0.  If  A always  occurred,  then /(A)  = n.  These  are 
the  extreme  cases.  Division  by  n gives 

(4*)  0g/rel(A)gl. 

In  particular,  for  A = S we  have  f(S)  = n because  S always  occurs  (meaning  that 
some  event  always  occurs;  if  necessary,  see  Sec.  24.2,  after  Example  7).  Division 
by  n gives 

(5*)  /rel(S)  = 1. 

Finally,  if  A and  B are  mutually  exclusive,  they  cannot  occur  together.  Hence  the  absolute 
frequency  of  their  union  A U B must  equal  the  sum  of  the  absolute  frequencies  of  A and 
B.  Division  by  n gives  the  same  relation  for  the  relative  frequencies, 

(6*)  ,/reiCA  U B)=  /rel(A)  + frel(B)  (A  n B = 0). 

We  are  now  ready  to  extend  the  definition  of  probability  to  experiments  in  which  equally 
likely  outcomes  are  not  available.  Of  course,  the  extended  definition  should  include 
Definition  1 . Since  probabilities  are  supposed  to  be  the  theoretical  counterpart  of  relative 
frequencies,  we  choose  the  properties  in  (4*),  (5*),  (6*)  as  axioms.  (Historically,  such  a 
choice  is  the  result  of  a long  process  of  gaining  experience  on  what  might  be  best  and 
most  practical.) 
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DEFINITION  2 


THEOREM  1 


PROOF 


General  Definition  of  Probability 

Given  a sample  space  S,  with  each  event  A of  .S'  (subset  of  S ) there  is  associated  a 
number  P(A),  called  the  probability  of  A,  such  that  the  following  axioms  of 
probability  are  satisfied. 

1.  For  every  A in  S, 

(4)  0 S P(A)  g 1. 

2.  The  entire  sample  space  S has  the  probability 

(5)  P(S ) = 1. 

3.  For  mutually  exclusive  events  A and  B (A  IT  B = 0;  see  Sec.  24.2), 

(6)  P(A  U B)  = P(A)  + P[B)  (A  n B = 0). 

If  S is  infinite  (has  infinitely  many  points),  Axiom  3 has  to  be  replaced  by 
3' . For  mutually  exclusive  events  Ai,  A 2,  • ■ ■ , 

(6')  P(Ai  U A2  U • • ■ ) = P(Ai)  + P(A2 ) + • • • . 


In  the  infinite  case  the  subsets  of  S on  which  P(A)  is  defined  are  restricted  to  form  a 
so-called  cr-algebra,  as  explained  in  Ref.  [GenRef6]  (not  [G6] !)  in  App.  1.  This  is  of  no 
practical  consequence  to  us. 


Basic  Theorems  of  Probability 

We  shall  see  that  the  axioms  of  probability  will  enable  us  to  build  up  probability  theory 
and  its  application  to  statistics.  We  begin  with  three  basic  theorems.  The  first  of  them 
is  useful  if  we  can  get  the  probability  of  the  complement  Ac  more  easily  than  P(A ) 
itself. 


Complementation  Rule 

For  an  event  A and  its  complement  Ac  in  a sample  space  S, 
(7)  P(AC)  = 1 - P(A). 


By  the  definition  of  complement  (Sec.  24.2),  we  have  S = A U Ac  and  A fl  Ac  = 0. 
Flence  by  Axioms  2 and  3, 

1 = P(S)  = P(A)  + P(AC),  thus  P(AC)  = 1 - P(A). 
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EXAMPLE  2 


THEOREM  2 


EXAMPLE  3 


THEOREM  3 


PROOF 


Coin  Tossing 

Five  coins  are  tossed  simultaneously.  Find  the  probability  of  the  event  A:  At  least  one  head  turns  up.  Assume 
that  the  coins  are  fair. 

Solution.  Since  each  coin  can  turn  up  heads  or  tails,  the  sample  space  consists  of  25  = 32  outcomes.  Since 
the  coins  are  fair,  we  may  assign  the  same  probability  (1/32)  to  each  outcome.  Then  the  event  Ac  (No  heads 
turn  up)  consists  of  only  1 outcome.  Hence  P(AC ) = 1/32,  and  the  answer  is  P(A)  = 1 — P(AC)  = 31/32. 


The  next  theorem  is  a simple  extension  of  Axiom  3,  which  you  can  readily  prove  by 
induction. 


Addition  Rule  for  Mutually  Exclusive  Events 

For  mutually  exclusive  events  Ai,  • • • , Am  in  a sample  space  S, 

(8)  P(A1  U A2  U Am)  = P(AX)  + P(A2)  + ■■■  + P(Am). 


Mutually  Exclusive  Events 

If  the  probability  that  on  any  workday  a garage  will  get  10-20,  21-30,  31-40,  over  40  cars  to  service  is  0.20, 
0.35,  0.25,  0.12,  respectively,  what  is  the  probability  that  on  a given  workday  the  garage  gets  at  least  21  cars 
to  service? 

Solution.  Since  these  are  mutually  exclusive  events,  Theorem  2 gives  the  answer  0.35  + 0.25  + 0.12  = 0.72. 
Check  this  by  the  complementation  mle. 


In  many  cases,  events  will  not  be  mutually  exclusive.  Then  we  have 


Addition  Rule  for  Arbitrary  Events 

For  events  A and  B in  a sample  space, 

(9)  P{A  U B)  = P(A)  + P(B)  - P(A  n B). 


C,  D,  E in  Fig.  512  make  up  A U B and  are  mutually  exclusive  (disjoint).  Hence  by 
Theorem  2, 


P(A  U B)  = P(C ) + P(D)  + P(E). 

This  gives  (9)  because  on  the  right  P{C)  + P(D)  = P(A)  by  Axiom  3 and  disjointness; 
and  P(E)  = P(B ) — P(D)  = P(B)  — P(A  D B),  also  by  Axiom  3 and  disjointness.  ■ 


Fig.  512.  Proof  of  Theorem  3 
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EXAMPLE  4 


THEOREM  4 


EXAMPLE  5 


Note  that  for  mutually  exclusive  events  A and  B we  have  A IT  B = 0 by  definition  and, 
by  comparing  (9)  and  (6), 

(10)  P(0)  = 0. 


(Can  you  also  prove  this  by  (5)  and  (7)?) 


Union  of  Arbitrary  Events 

In  tossing  a fair  die,  what  is  the  probability  of  getting  an  odd  number  or  a number  less  than  4? 

Solution.  Let  A be  the  event  “ Odd  number ” and  B the  event  “ Number  less  than  4.”  Then  Theorem  3 gives 
the  answer 


P(A  U B)  = | + | 

because  A fl  B = “ Odd  number  less  than  4”  = {1,3}. 


2 _ 2 
6 — 3 


Conditional  Probability.  Independent  Events 

Often  it  is  required  to  find  the  probability  of  an  event  B under  the  condition  that  an  event 
A occurs.  This  probability  is  called  the  conditional  probability  ofB  given  A and  is  denoted 
by  P(B\A).  In  this  case  A serves  as  a new  (reduced)  sample  space,  and  that  probability  is 
the  fraction  of  P(A)  which  corresponds  to  A Cl  B.  Thus 

P(A  Cl  B) 

(11)  P(B\A)  = — [P(A)  A 0], 

Similarly,  the  conditional  probability  of  A given  B is 

P(A  n B) 

02)  P(A\B)  = [P(B)  A 0], 


Solving  (11)  and  (12)  for  P(A  IT  B),  we  obtain 


Multiplication  Rule 

If  A and  B are  events  in  a sample  space  S and  P(A)  A 0,  P(B)  A 0,  then 
(13)  P(A  n B)  = P(A)P(B\A ) = P(B)P(A\B). 


Multiplication  Rule 

In  producing  screws,  let  A mean  “screw  too  slim”  and  B “screw  too  short.”  Let  P(A)  = 0.1  and  let  the  conditional 
probability  that  a slim  screw  is  also  too  short  be  P(B\A ) = 0.2.  What  is  the  probability  that  a screw  that  we  pick 
randomly  from  the  lot  produced  will  be  both  too  slim  and  too  short? 

Solution.  P(A  n B)  = P(A)P(B\A)  = 0.1  • 0.2  = 0.02  = 2%,  by  Theorem  4. 

Independent  Events.  If  events  A and  B are  such  that 


(14) 


P(A  n B)  = P(A)P(B), 
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they  are  called  independent  events.  Assuming  P(A)  # 0.  P(B)  A 0,  we  see  from  (11 )— (1 3) 
that  in  this  case 


P(A\B)  = P(A),  P(B\A ) = P(B). 

This  means  that  the  probability  of  A does  not  depend  on  the  occurrence  or  nonoccurrence 
of  B,  and  conversely.  This  justifies  the  term  “independent.” 


Independence  of  m Events.  Similarly,  m events  A1;  • • • , Am  are  called  independent  if 
(15a)  P(A1  n n Am)  = P(A1)  ■ ■ ■ P(Am ) 

as  well  as  for  every  k different  events  Ajv  Aj2,  • • • , Ajk. 


( 1 5b)  m„  n Ai2  n • ■ • n AJk)  = P(Ah)P{Aj2)  ■ ■ ■ p(Ajk) 

where  k = 2,  3,  • • • , m — 1 . 

Accordingly,  three  events  A,  B,  C are  independent  if  and  only  if 

P(A  n B)  = P(A)P(B ), 

(16)  P{B  n C)  = P{B)P(C), 

P(C  n A)  = P(C)P(A), 

P(A  n B n C)  = P(A)P{B)P(C). 

Sampling.  Our  next  example  has  to  do  with  randomly  drawing  objects,  one  at  a time, 
from  a given  set  of  objects.  This  is  called  sampling  from  a population,  and  there  are 
two  ways  of  sampling,  as  follows. 

1.  In  sampling  with  replacement,  the  object  that  was  drawn  at  random  is  placed  back  to 
the  given  set  and  the  set  is  mixed  thoroughly.  Then  we  draw  the  next  object  at  random. 

2.  In  sampling  without  replacement  the  object  that  was  drawn  is  put  aside. 


Sampling  With  and  Without  Replacement 

A box  contains  10  screws,  three  of  which  are  defective.  Two  screws  are  drawn  at  random.  Find  the  probability 
that  neither  of  the  two  screws  is  defective. 

Solution.  We  consider  the  events 


A:  First  drawn  screw  nondefective. 

B:  Second  drawn  screw  nondefective. 

Clearly,  P(A)  = ^ because  7 of  the  10  screws  are  nondefective  and  we  sample  at  random,  so  that  each  screw 
has  the  same  probability  (jo)  of  being  picked.  If  we  sample  with  replacement,  the  situation  before  the  second 
drawing  is  the  same  as  at  the  beginning,  and  P(B ) = ^ . The  events  are  independent,  and  the  answer  is 

P{A  H B)  = P(A)P(B)  = 0.7  • 0.7  = 0.49  - 49%. 

If  we  sample  without  replacement,  then  P(A)  = ^ , as  before.  If  A has  occurred,  then  there  are  9 screws  left 
in  the  box,  3 of  which  are  defective.  Thus  P{B\A)  — 9 — 3,  and  Theorem  4 yields  the  answer 

P(A  fl  B)  = ^ • § = 47%. 

Is  it  intuitively  clear  that  this  value  must  be  smaller  than  the  preceding  one? 
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1.  In  rolling  3 fair  dice,  what  is  the  probability  of  obtaining 
a sum  not  greater  than  16? 

2.  In  rolling  2 fair  dice,  what  is  the  probability  of  a sum 
greater  than  3 but  not  exceeding  6? 

3.  Three  screws  are  drawn  at  random  from  a lot  of  100 
screws,  10  of  which  are  defective.  Find  the  probability 
of  the  event  that  all  3 screws  drawn  are  nondefective, 
assuming  that  we  draw  (a)  with  replacement,  (b)  without 
replacement. 

4.  In  Prob.  3 find  the  probability  of  E:  At  least  1 defective 
(i)  directly,  (ii)  by  using  complements;  in  both  cases 

(a)  and  (b). 

5.  If  a box  contains  10  left-handed  and  20  right-handed 
screws,  what  is  the  probability  of  obtaining  at  least 
one  right-handed  screw  in  drawing  2 screws  with 
replacement? 

6.  Will  the  probability  in  Prob.  5 increase  or  decrease  if  we 
draw  without  replacement.  First  guess,  then  calculate. 

7.  Under  what  conditions  will  it  make  practically  no 
difference  whether  we  sample  with  or  without 
replacement? 

8.  If  a certain  kind  of  tire  has  a life  exceeding  40,000  miles 
with  probability  0.90,  what  is  the  probability  that  a set 
of  these  tires  on  a car  will  last  longer  than  40,000  miles? 

9.  If  we  inspect  photocopy  paper  by  randomly  drawing  5 
sheets  without  replacement  from  every  pack  of  500, 
what  is  the  probability  of  getting  5 clean  sheets  although 
0.4%  of  the  sheets  contain  spots? 

10.  Suppose  that  we  draw  cards  repeatedly  and  with 
replacement  from  a file  of  100  cards,  50  of  which  refer 
to  male  and  50  to  female  persons.  What  is  the 
probability  of  obtaining  the  second  “female”  card  before 
the  third  “male”  card? 

11.  A batch  of  200  iron  rods  consists  of  50  oversized  rods, 
50  undersized  rods,  and  100  rods  of  the  desired  length. 
If  two  rods  are  drawn  at  random  without  replacement, 
what  is  the  probability  of  obtaining  (a)  two  rods  of  the 


desired  length,  (b)  exactly  one  of  the  desired  length, 
(c)  none  of  the  desired  length? 

12.  If  a circuit  contains  four  automatic  switches  and  we 
want  that,  with  a probability  of  99%,  during  a given 
time  interval  the  switches  to  be  all  working,  what 
probability  of  failure  per  time  interval  can  we  admit 
for  a single  switch? 

13.  A pressure  control  apparatus  contains  3 electronic 
tubes.  The  apparatus  will  not  work  unless  all  tubes  are 
operative.  If  the  probability  of  failure  of  each  tube 
during  some  interval  of  time  is  0.04,  what  is  the 
corresponding  probability  of  failure  of  the  apparatus? 

14.  Suppose  that  in  a production  of  spark  plugs  the  fraction 
of  defective  plugs  has  been  constant  at  2%  over  a long 
time  and  that  this  process  is  controlled  every  half  hour 
by  drawing  and  inspecting  two  just  produced.  Find  the 
probabilities  of  getting  (a)  no  defectives,  (b)  1 
defective,  (c)  2 defectives.  What  is  the  sum  of  these 
probabilities? 

15.  What  gives  the  greater  probability  of  hitting  at  least 
once:  (a)  hitting  with  probability  1/2  and  firing  1 shot, 

(b)  hitting  with  probability  1/4  and  firing  2 shots, 

(c)  hitting  with  probability  1/8  and  firing  4 shots?  First 
guess. 

16.  You  may  wonder  whether  in  (16)  the  last  relation 
follows  from  the  others,  but  the  answer  is  no.  To  see 
this,  imagine  that  a chip  is  drawn  from  a box  containing 
4 chips  numbered  000,  011,  101,  110,  and  let  A,  B,  C 
be  the  events  that  the  first,  second,  and  third  digit, 
respectively,  on  the  drawn  chip  is  1.  Show  that  then 
the  first  three  formulas  in  (16)  hold  but  the  last  one 
does  not  hold. 

17.  Show  that  if  B is  a subset  of  A,  then  P(B)  £ P(A). 

18.  Extending  Theorem  4,  show  that  P(A  PI  B D C)  = 
P(A)P(B\A)P(C\A  H B). 

19.  Make  up  an  example  similar  to  Prob.  16,  for  instance, 
in  terms  of  divisibility  of  numbers. 


24.4  Permutations  and  Combinations 


Permutations  and  combinations  help  in  finding  probabilities  P{A)  = a/k  by  systematically 
counting  the  number  a of  points  of  which  an  event  A consists;  here,  k is  the  number  of 
points  of  the  sample  space  S.  The  practical  difficulty  is  that  a may  often  be  surprisingly 
large,  so  that  actual  counting  becomes  hopeless.  For  example,  if  in  assembling  some 
instrument  you  need  10  different  screws  in  a certain  order  and  you  want  to  draw  them 
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randomly  from  a box  (which  contains  nothing  else)  the  probability  of  obtaining  them  in 
the  required  order  is  only  1/3,628,800  because  there  are 

10!  = l-2-3-4-5-6-7-8-9-10  = 3,628,800 

orders  in  which  they  can  be  drawn.  Similarly,  in  many  other  situations  the  numbers  of 
orders,  arrangements,  etc.  are  often  incredibly  large.  (If  you  are  unimpressed,  take  20 
screws — how  much  bigger  will  the  number  be?) 

Permutations 

A permutation  of  given  things  ( elements  or  objects ) is  an  arrangement  of  these  things  in 
a row  in  some  order.  For  example,  for  three  letters  a,  b,  c there  are  3!  = 1 ■ 2 • 3 = 6 
permutations:  abc,  acb,  bac,  bca,  cab,  cba.  This  illustrates  (a)  in  the  following  theorem. 


Permutations 

(a)  Different  things.  The  number  of  permutations  of  n different  things  taken 
all  at  a time  is 

(1)  n\  = 1 • 2 • 3 ■ • • n (read  “n  factorial”). 

(b)  Classes  of  equal  things.  If  n given  things  can  be  divided  into  c classes  of 
alike  things  differing  from  class  to  class,  then  the  number  of  permutations  of 
these  things  taken  all  at  a time  is 

n ! 

(2)  — - — , ' 7 («!  + n2  + •••  + nc  = n) 

nf.n2\  ■■■nc\ 

Where  nj  is  the  number  of  things  in  the  jth  class. 


(a)  There  are  n choices  for  filling  the  first  place  in  the  row.  Then  n — 1 things  are  still 
available  for  filling  the  second  place,  etc. 

(b)  n i alike  things  in  class  1 make  nf.  permutations  collapse  into  a single  permutation 
(those  in  which  class  1 things  occupy  the  same  n j positions),  etc.,  so  that  (2)  follows 
from  (1).  ■ 

Illustration  of  Theorem  1(b) 

If  a box  contains  6 red  and  4 blue  balls,  the  probability  of  drawing  first  the  red  and  then  the  blue  balls  is 

P = 6141/10!  = 1/210  = 0.5%. 

A permutation  of  n things  taken  k at  a time  is  a permutation  containing  only  k of  the 
n given  things.  Two  such  permutations  consisting  of  the  same  k elements,  in  a different 
order,  are  different,  by  definition.  For  example,  there  are  6 different  permutations  of  the 
three  letters  a,  b,  c,  taken  two  letters  at  a time,  ab,  ac,  be,  ba,  ca,  cb. 

A permutation  of  n things  taken  A:  at  a time  with  repetitions  is  an  arrangement  obtained 
by  putting  any  given  thing  in  the  first  position,  any  given  thing,  including  a repetition  of  the 
one  just  used,  in  the  second,  and  continuing  until  k positions  are  filled.  For  example,  there 
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are  32  = 9 different  such  permutations  of  a,  b , c taken  2 letters  at  a time,  namely,  the 
preceding  6 permutations  and  aa,  bb,  cc.  You  may  prove  (see  Team  Project  14): 


Permutations 

The  number  of  different  permutations  of  n different  things  taken  k at  a time  without 
repetitions  is 

(3a)  n(n  - l)(n  - 2)  • ■ ■ (n  - k + 1)  = - — — — 

( n — k)\ 

and  with  repetitions  is 

(3b)  nk. 


Illustration  of  Theorem  2 

In  an  encrypted  message  the  letters  are  arranged  in  groups  of  five  letters,  called  words.  From  (3b)  we  see  that 
the  number  of  different  such  words  is 

265  = 11,881,376. 

From  (3a)  it  follows  that  the  number  of  different  such  words  containing  each  letter  no  more  than  once  is 
26!/(26  - 5)!  = 26  • 25  • 24  • 23  • 22  = 7,893,600. 


Combinations 

In  a permutation,  the  order  of  the  selected  things  is  essential.  In  contrast,  a combination 
of  given  things  means  any  selection  of  one  or  more  things  without  regard  to  order.  There 
are  two  kinds  of  combinations,  as  follows. 

The  number  of  combinations  of  n different  things,  taken  k at  a time,  without 
repetitions  is  the  number  of  sets  that  can  be  made  up  from  the  n given  things,  each  set 
containing  k different  things  and  no  two  sets  containing  exactly  the  same  k things. 

The  number  of  combinations  of  n different  things,  taken  k at  a time,  with  repetitions 
is  the  number  of  sets  that  can  be  made  up  of  k things  chosen  from  the  given  n things, 
each  being  used  as  often  as  desired. 

For  example,  there  are  three  combinations  of  the  three  letters  a , b,  c,  taken  two  letters 
at  a time,  without  repetitions,  namely,  ab,  ac,  be,  and  six  such  combinations  with 
repetitions,  namely,  ab,  ac,  be,  aa,  bb,  cc. 


Combinations 

The  number  of  different  combinations  of  n different  things  taken , k at  a time,  without 
repetitions,  is 


(4a) 


n\  _ n\  _ n(n  — 1 )■■■  (n  — k + 1) 
k)  ~ k\(n  - k)\  ~ 1 • 2---k 


and  the  number  of  those  combinations 


(4b) 


n + k 
k 


SEC.  24.4  Permutations  and  Combinations 


1027 


PROOF 


EXAMPLE  3 


EXAMPLE  4 


The  statement  involving  (4a)  follows  from  the  first  part  of  Theorem  2 by  noting  that  there 
are  k\  permutations  of  k things  from  the  given  n things  that  differ  by  the  order  of  the 
elements  (see  Theorem  1),  but  there  is  only  a single  combination  of  those  k things  of  the 
type  characterized  in  the  first  statement  of  Theorem  3.  The  last  statement  of  Theorem  3 
can  be  proved  by  induction  (see  Team  Project  14).  ■ 

Illustration  of  Theorem  3 

The  number  of  samples  of  five  lightbulbs  that  can  be  selected  from  a lot  of  500  bulbs  is  [see  (4a)] 

/ 500\  500!  500  • 499  • 498  • 497  • 496 

= = = 255,244,687,600. 

V 5 ) 51495!  1 - 2-3 -4-5 

Factorial  Function 

In  (l)-(4)  the  factorial  function  is  basic.  By  definition, 

(5)  0!  = 1. 

Values  may  be  computed  recursively  from  given  values  by 

(6)  (n  + 1)!  = (n  + \)n\. 

For  large  n the  function  is  very  large  (see  Table  A3  in  App.  5).  A convenient  approximation 
for  large  n is  the  Stirling  formula2 


(7) 


n\  ~ V2777! 


(e  = 2.718  •••) 


where  ~ is  read  “asymptotically  equal”  and  means  that  the  ratio  of  the  two  sides  of  (7) 
approaches  1 as  n approaches  infinity. 

Stirling  Formula 


n\ 

By  (7) 

Exact  Value 

Relative  Error 

4! 

23.5 

24 

2.1% 

10! 

3,598,696 

3,628,800 

0.8% 

20! 

2.42279  • 1018 

2,432,902,008,176,640,000 

0.4% 

Binomial  Coefficients 

The  binomial  coefficients  are  defined  by  the  formula 


( a\  a(a  — \)(a  — 2)  ■■■  (a  — k + 1 ) 

(8)  (J  = (k  g 0,  integer). 


2JAMES  STIRLING  (1692-1770),  Scots  mathematician. 
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The  numerator  has  k factors.  Furthermore,  we  define 


(9) 


= 1, 


in  particular, 


= 1. 


For  integer  a = n we  obtain  from  (8) 

(10) 


n 

n — k 


(n  §:  0,  0 sg  k Si  n). 


Binomial  coefficients  may  be  computed  recursively,  because 


(11) 

Formula  (8)  also  yields 

(12) 


— m 

k 


+ 


a + 1 
k \ J V/c+l 


= (-1)* 


m + k — 1 
k 


There  are  numerous  further  relations;  we  mention  two  important  ones, 

n—1 


(13) 

and 

(14) 


2 

s= 0 


k + s\  ( n + k 
k + 1 


2 

fc=0 


kj  \r  — k 


p + q 


( k '=  0,  integer). 


( k is  0,  integer) 
(m  > 0). 


(k  §S  0,  n g 1 , 
both  integer) 


(r  0,  integer). 


F RFQ  BL  E-M=S  E^Z=2:4^~4 


Note  the  large  numbers  in  the  answers  to  some  of  these 
problems,  which  would  make  counting  cases  hopeless! 

1.  In  how  many  ways  can  a company  assign  10  drivers  to 
n buses,  one  driver  to  each  bus  and  conversely? 

2.  List  (a)  all  permutations,  (b)  all  combinations  without 
repetitions,  (c)  all  combinations  with  repetitions,  of  5 
letters  a,  e,  i,  o , u taken  2 at  a time. 

3.  If  a box  contains  4 rubber  gaskets  and  2 plastic  gaskets, 
what  is  the  probability  of  drawing  (a)  first  the  plastic 
and  then  the  rubber  gaskets,  (b)  first  the  rubber  and 
then  the  plastic  ones?  Do  this  by  using  a theorem  and 
checking  it  by  multiplying  probabilities. 

4.  An  urn  contains  2 green,  3 yellow,  and  5 red  balls.  We 
draw  1 ball  at  random  and  put  it  aside.  Then  we  draw 
the  next  ball,  and  so  on.  Find  the  probability  of  drawing 


at  first  the  2 green  balls,  then  the  3 yellow  ones,  and 
finally  the  red  ones. 

5.  In  how  many  different  ways  can  we  select  a committee 
consisting  of  3 engineers,  2 physicists,  and  2 computer 
scientists  from  10  engineers,  5 physicists,  and  6 
computer  scientists?  First  guess. 

6.  How  many  different  samples  of  4 objects  can  we  draw 
from  a lot  of  50? 

7.  Of  a lot  of  10  items,  2 are  defective,  (a)  Find  the 
number  of  different  samples  of  4.  Find  the  number  of 
samples  of  4 containing  (b)  no  defectives,  (c)  1 
defective,  (d)  2 defectives. 

8.  Determine  the  number  of  different  bridge  hands.  (A 
bridge  hand  consists  of  13  cards  selected  from  a full 
deck  of  52  cards.) 
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9.  In  how  many  different  ways  can  6 people  be  seated  at 
a round  table? 

10.  If  a cage  contains  100  mice,  3 of  which  are  male,  what 
is  the  probability  that  the  3 male  mice  will  be  included 
if  10  mice  are  randomly  selected? 

11.  How  many  automobile  registrations  may  the  police 
have  to  check  in  a hit-and-run  accident  if  a witness 
reports  KDP7  and  cannot  remember  the  last  two  digits 
on  the  license  plate  but  is  certain  that  all  three  digits 
were  different? 

12.  If  3 suspects  who  committed  a burglary  and  6 innocent 
persons  are  lined  up,  what  is  the  probability  that  a 
witness  who  is  not  sure  and  has  to  pick  three  persons 
will  pick  the  three  suspects  by  chance?  That  the  witness 
picks  3 innocent  persons  by  chance? 

13.  CAS  PROJECT.  Stirling  formula,  (a)  Using  (7), 
compute  approximate  values  of  n\  for  n = 1,  • ■ ■ , 20. 

(b)  Determine  the  relative  error  in  (a).  Find  an 
empirical  formula  for  that  relative  error. 

(c)  An  upper  bound  for  that  relative  error  is 
gi/i2 n — 1 . Try  to  relate  your  empirical  formula  to  this. 

(d)  Search  through  the  literature  for  further  information 
on  Stirling’s  formula.  Write  a short  eassy  about  your 

24.5  Random  Variables. 

Probability  Distributions 

In  Sec.  24.1  we  considered  frequency  distributions  of  data.  These  distributions  show  the 
absolute  or  relative  frequency  of  the  data  values.  Similarly,  a probability  distribution 
or,  briefly,  a distribution,  shows  the  probabilities  of  events  in  an  experiment.  The  quantity 
that  we  observe  in  an  experiment  will  be  denoted  by  X and  called  a random  variable 
(or  stochastic  variable)  because  the  value  it  will  assume  in  the  next  trial  depends  on 
chance,  on  randomness — if  you  roll  a die,  you  get  one  of  the  numbers  from  1 to  6,  but 
you  don’t  know  which  one  will  show  up  next.  Thus  X = Number  a die  turns  up  is  a 
random  variable.  So  is  A = Elasticity  of  rubber  (elongation  at  break).  (“Stochastic”  means 
related  to  chance.) 

If  we  count  (cars  on  a road,  defective  screws  in  a production,  tosses  until  a die  shows 
the  first  Six),  we  have  a discrete  random  variable  and  distribution.  If  we  measure 
(electric  voltage,  rainfall,  hardness  of  steel),  we  have  a continuous  random  variable  and 
distribution.  Precise  definitions  follow.  In  both  cases  the  distribution  of  X is  determined 

by  the  distribution  function 

(1)  F(x)  = P(X  g x); 

this  is  the  probability  that  in  a trial,  X will  assume  any  value  not  exceeding  x. 

CAUTION!  The  terminology  is  not  uniform.  F(x)  is  sometimes  also  called  the 

cumulative  distribution  function. 


findings,  arranged  in  logical  order  and  illustrated  with 
numeric  examples. 

14.  TEAM  PROJECT.  Permutations,  Combinations. 

(a)  Prove  Theorem  2. 

(b)  Prove  the  last  statement  of  Theorem  3. 

(c)  Derive  (11)  from  (8). 

(d)  By  the  binomial  theorem, 


(a  + b)n  = 2 \".)aKbT 

k=0  ' ' 


so  that  akbn  k has  the  coefficient  (Tj.  Can  you 

conclude  this  from  Theorem  3 or  is  this  a mere 
coincidence? 

(e)  Prove  (14)  by  using  the  binomial  theorem. 

(f)  Collect  further  formulas  for  binomial  coefficients 
from  the  literature  and  illustrate  them  numerically. 

15.  Birthday  problem.  What  is  the  probability  that  in  a 
group  of  20  people  (that  includes  no  twins)  at  least 
two  have  the  same  birthday,  if  we  assume  that  the 
probability  of  having  birthday  on  a given  day  is  1/365 
for  every  day.  First  guess.  Hint.  Consider  the  com- 
plementary event. 
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For  (1)  to  make  sense  in  both  the  discrete  and  the  continuous  case  we  formulate  con- 
ditions as  follows. 


DEFINITION 


Random  Variable 

A random  variable  A is  a function  defined  on  the  sample  space  S of  an  experiment. 
Its  values  are  real  numbers.  For  every  number  a the  probability 

P(X  = a) 

with  which  X assumes  a is  defined.  Similarly,  for  any  interval  I the  probability 

P(X  G /) 

with  which  X assumes  any  value  in  I is  defined. 


Although  this  definition  is  very  general,  in  practice  only  a very  small  number  of  distributions 
will  occur  over  and  over  again  in  applications. 

From  (1)  we  obtain  the  fundamental  formula  for  the  probability  corresponding  to  an 
interval  a < x g b, 

(2)  P(a  < A g h)  = F(b)  - F(d). 

This  follows  because  X g a ( “X  assumes  any  value  not  exceeding  a ”)  and  a < X = b 
( “X  assumes  any  value  in  the  interval  a < x g b”)  are  mutually  exclusive  events,  so  that 
by  (1)  and  Axiom  3 of  Definition  2 in  Sec.  24.3 

F(b)  = P(X  g b)  = P(X  g a)  + P(a  < X g b) 

= F(a)  + P(a  <Igi?) 

and  subtraction  of  F{a)  on  both  sides  gives  (2). 

Discrete  Random  Variables  and  Distributions 

By  definition,  a random  variable  X and  its  distribution  are  discrete  if  X assumes  only  finitely 
many  or  at  most  countably  many  values  x±,  x2,  V3,  ■ • • , called  the  possible  values  of  X, 
with  positive  probabilities  pi  = P(X  = x\),  p2  = P(X  = x2),  p2  = P(X  = x2),  • • ■ , 
whereas  the  probability  P(X  G I)  is  zero  for  any  interval  I containing  no  possible  value. 

Clearly,  the  discrete  distribution  of  X is  also  determined  by  the  probability  function 
f(x ) of  X,  defined  by 


(3) 


fix) 


Pj  if  x = Xj 
. 0 otherwise 


U=  Ij  2,  • ■ ■ ), 


From  this  we  get  the  values  of  the  distribution  function  F(x)  by  taking  sums, 

F(x)  = 2 f(Xj)  = 2 Pj 


(4) 
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EXAMPLE  1 


EXAMPLE  2 


where  for  any  given  x we  sum  all  the  probabilities  pj  for  which  Xj  is  smaller  than  or  equal 
to  that  of  x.  This  is  a step  function  with  upward  jumps  of  size  p,j  at  the  possible  values 
xj  of  X and  constant  in  between. 


Probability  Function  and  Distribution  Function 

Figure  513  shows  the  probability  function /(a)  and  the  distribution  function  F(x)  of  the  discrete  random  variable 

X = Number  a fair  die  turns  up. 

X has  the  possible  values  x = 1,  2,  3,  4,  5,  6 with  probability  1/6  each.  At  these  x the  distribution  function 
has  upward  jumps  of  magnitude  1/6.  Hence  from  the  graph  of  f(x)  we  can  construct  the  graph  of  F(x)  and 
conversely. 

In  Figure  513  (and  the  next  one)  at  each  jump  the  fat  dot  indicates  the  function  value  at  the  jump! 


f(x) 

m 

y6 

MINI 

y6 

, . 1 1 1 1 1 

Fig.  513.  Probability  function  f(x) 
and  distribution  function  F(x)  of  the 
random  variable  X = Number 
obtained  in  tossing  a fair  die  once 


Fig.  514,  Probability  function  f(x)  and 
distribution  function  F(x)  of  the  random 
variable  X = Sum  of  the  two  numbers 
obtained  in  tossing  two  fair  dice  once 


Probability  Function  and  Distribution  Function 

The  random  variable  X = Sum  of  the  two  numbers  two  fair  dice  turn  up  is  discrete  and  has  the  possible  values 
2 (=  1 + 1),  3,  4,  • • • , 12  (=  6 + 6).  There  are  6 • 6 = 36  equally  likely  outcomes  (1,  1)  (1,  2),  • • • , (6,  6), 
where  the  first  number  is  that  shown  on  the  first  die  and  the  second  number  that  on  the  other  die.  Each  such 
outcome  has  probability  1/36.  Now  X = 2 occurs  in  the  case  of  the  outcome  (1,  1);  X = 3 in  the  case  of  the 
two  outcomes  (1,  2)  and  (2,  1);  X = 4 in  the  case  of  the  three  outcomes  (1,  3),  (2,  2),  (3,  1);  and  so  on.  Hence 
f{x)  = P{X  = x)  and  F{x)  = P(X  = x)  have  the  values 


X 

2 

3 

4 

5 

6 

7 

8 

9 

10 

11 

12 

fix) 

1/36 

2/36 

3/36 

4/36 

5/36 

6/36 

5/36 

4/36 

3/36 

2/36 

1/36 

F{X) 

1/36 

3/36 

6/36 

10/36 

15/36 

21/36 

26/36 

30/36 

33/36 

35/36 

36/36 

Figure  514  shows  a bar  chart  of  this  function  and  the  graph  of  the  distribution  function,  which  is  again  a step 
function,  with  jumps  (of  different  height!)  at  the  possible  values  of  X. 


1032 


CHAP.  24  Data  Analysis.  Probability  Theory 


EXAMPLE  3 


EXAMPLE  4 


Two  useful  formulas  for  discrete  distributions  are  readily  obtained  as  follows.  For  the 
probability  corresponding  to  intervals  we  have  from  (2)  and  (4) 


^ P(a  < X b)  = F(b ) — F{a)  = ^ pj  ( X discrete). 

a <Xj^b 

This  is  the  sum  of  all  probabilities  pj  for  which  Xj  satisfies  a < xj  t=k  b.  (Be  careful  about 
< and  =§  ! ) From  this  and  /-’(.S')  = 1 (Sec.  24.3)  we  obtain  the  following  formula. 


(6) 


^ Pj  = 1 (sum  of  all  probabilities). 

j 


Illustration  of  Formula  (5) 

In  Example  2,  compute  the  probability  of  a sum  of  at  least  4 and  at  most  8. 

Solution.  P{ 3 < X £ 8)  = At 8)  - A(3)  = f§  - ^ = §§. 

Waiting  Time  Problem.  Countably  Infinite  Sample  Space 

In  tossing  a fair  coin,  let  X = Number  of  trials  until  the  first  head  appears.  Then,  by  independence  of  events 
(Sec.  24.3), 

P(X=  1)  = P(H ) =J  (H  = Head) 

P(X  = 2)  = P(TH)  =|-|  =\  (T  = Tail) 

P(X=3)  = P(TTH)  = l-l-l  = \,  etc. 

and  in  general  P(X  = n)  = (|  )n,  n = 1,  2,  • • • . Also,  (6)  can  be  confirmed  by  the  sum  formula  for  the  geometric 
series. 


Continuous  Random  Variables  and  Distributions 

Discrete  random  variables  appear  in  experiments  in  which  we  count  (defectives  in  a 
production,  days  of  sunshine  in  Chicago,  customers  standing  in  a line,  etc.).  Continuous 
random  variables  appear  in  experiments  in  which  we  measure  (lengths  of  screws,  voltage 
in  a power  line,  Brinell  hardness  of  steel,  etc.).  By  definition,  a random  variable  X and 
its  distribution  are  of  continuous  type  or,  briefly,  continuous,  if  its  distribution  function 
F(x ) [defined  in  (1)]  can  be  given  by  an  integral 


(7) 


F(x) 


rX 

f(v)  dv 
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EXAMPLE  5 


(we  write  v because  x is  needed  as  the  upper  limit  of  the  integral)  whose  integrand  f(x), 
called  the  density  of  the  distribution,  is  nonnegative,  and  is  continuous,  perhaps  except 
for  finitely  many  x-values.  Differentiation  gives  the  relation  of  /to  F as 

(8)  m = F\x) 


for  every  x at  which  /(x)  is  continuous. 

From  (2)  and  (7)  we  obtain  the  very  important  formula  for  the  probability  corresponding 
to  an  interval: 


(9) 


P(a<X^b)  = F(b ) - F(a) 


rb 

fiv)  dv. 


This  is  the  analog  of  (5). 

From  (7)  and  P(S)  = 1 (Sec.  24.3)  we  also  have  the  analog  of  (6): 


(10) 


f(v)  dv  = 1. 


Continuous  random  variables  are  simpler  than  discrete  ones  with  respect  to  intervals. 
Indeed,  in  the  continuous  case  the  four  probabilities  corresponding  to  a < X Si  b, 
a < X < b,  a X < b,  and  a Si  X b with  any  fixed  a and  b ( > a)  are  all  the  same. 
Can  you  see  why?  ( Answer . This  probability  is  the  area  under  the  density  curve,  as  in 
Fig.  515,  and  does  not  change  by  adding  or  subtracting  a single  point  in  the  interval  of 
integration.)  This  is  different  from  the  discrete  case!  (Explain.) 

The  next  example  illustrates  notations  and  typical  applications  of  our  present  formulas. 


Curve  of  density 


Fig.  515.  Example  illustrating  formula  (9) 


Continuous  Distribution 

Let  X have  the  density  function  /( x)  — 0.75(1  — x 2)  if  — 1 ^ x 1 and  zero  otherwise.  Find  the  distribution 
function.  Find  the  probabilities  P(— \ ^ X |)  and  P{\  ^ X 2).  Find  x such  that  P(X  ^ x)  = 0.95. 

Solution.  From  (7)  we  obtain  F(x)  = 0 if  x ^ —1, 


F(x)  = 0.75  [ (1  - v2)  dv  = 0.5  + 0.75x  - 0.25*3  if  -1  < * ^ 1, 

-'-l 

and  F{x)  = 1 if  x > 1 . From  this  and  (9)  we  get 

(1/2 

P(-l  sxsl)  = F(l)  - F(-\)  = 0.75  (1  - v2)  dv  = 68.75% 

-'-1/2 
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(because  P(— | S X S g)  = P(— g <XSg)  for  a continuous  distribution)  and 

P(jSXS  2)  = F(2)  - F(|)  = 0.75  [ (1  - v2)  dv  = 31.64%. 

i/4 

(Note  that  the  upper  limit  of  integration  is  1,  not  2.  Why?)  Finally, 

P(X  ^x)  = F(x)  = 0.5  + 0.75*  - 0.25*3  = 0.95. 

Algebraic  simplification  gives  3x  — x3  = 1.8.  A solution  is  x = 0.73,  approximately. 

Sketch /(x)  and  mark x = — and  0.73,  so  that  you  can  see  the  results  (the  probabilities)  as  areas  under 
the  curve.  Sketch  also  F(x). 

Further  examples  of  continuous  distributions  are  included  in  the  next  problem  set  and  in 
later  sections. 


P R Q B FE=M~ST  T~2  4^ 


1.  Graph  the  probability  function  /(x)  = kx2  (x  = 1,  2,  3, 
4,  5 ; k suitable)  and  the  distribution  function. 

2.  Graph  the  density  function  /( x)  = kx2  (0  S x S 5; 
k suitable)  and  the  distribution  function. 

3.  Uniform  distribution.  Graph/and  F when  the  density 
of  X is  f(x)  = k — const  if  —2  SiS2  and  0 else- 
where. Find  P(0SXS  2). 

4.  In  Prob.  3 find  c and  c such  that  P(—c<X<c)  = 
95%  and  P(0  < X < c)  = 95%. 

5.  Graph  / and  F when  /(— 2)  =/( 2)  = §,  /(  — 1)  = 
/(l)  = 1-  Can /have  further  positive  values? 

6.  A box  contains  4 right-handed  and  6 left-handed 

screws.  Two  screws  are  drawn  at  random  without 
replacement.  Let  X be  the  number  of  left-handed 
screws  drawn.  Find  the  probabilities  P{X  = 0), 
F(X=1),  P(X  = 2),  P(1  < A < 2),  P(XS1), 

P(X  £ 1),  P(X  > 1),  and  P(0.5  < X < 10). 

7.  Let  X be  the  number  of  years  before  a certain  kind  of 
pump  needs  replacement.  Let  X have  the  probability 
function  /(x)  = kx3,  x = 0,  1,  2,  3,  4.  Find  k.  Sketch/ 
and  F. 

8.  Graph  the  distribution  function  F(x)  = 1 — e~3x  if 
x > 0,  F( x)  = 0 if  x S 0,  and  the  density /(x).  Find  x 
such  that  F(x)  = 0.9. 

9.  Let  X [millimeters]  be  the  thickness  of  washers. 
Assume  that  X has  the  density  /(x)  = kx  if 
0.9  < x < 1.1  and  0 otherwise.  Find  k.  What  is  the 
probability  that  a washer  will  have  thickness  between 
0.95  mm  and  1.05  mm? 


10.  If  the  diameter  X of  axles  has  the  density  f(x)  = k if 
1 19.9  S x S 120.1  and  0 otherwise,  how  many 
defectives  will  a lot  of  500  axles  approximately  contain 
if  defectives  are  axles  slimmer  than  119.91  or  thicker 
than  120.09? 

11.  Find  the  probability  that  none  of  three  bulbs  in  a traffic 
signal  will  have  to  be  replaced  during  the  first  1500 
hours  of  operation  if  the  lifetime  X of  a bulb  is  a random 
variable  with  the  density /(x)  = 6[0.25  — (x  — 1.5)2] 
when  1 S x S 2 and/(x)  = 0 otherwise,  where  x is 
measured  in  multiples  of  1000  hours. 

12  Let  X be  the  ratio  of  sales  to  profits  of  some  company. 
Assume  that  X has  the  distribution  function  F(x)  = 0 if 
x < 2,  F(x)  = (x2  - 4)/5  if  2 S x < 3,  F(x)  = 1 if 
x £ 3.  Find  and  sketch  the  density.  What  is  the  probability 
that  X is  between  2.5  (40%  profit)  and  5 (20%  profit)? 

13.  Suppose  that  in  an  automatic  process  of  filling  oil 
cans,  the  content  of  a can  (in  gallons)  is  Y = 100  + X, 
where  X is  a random  variable  with  density 
/(x)  = 1 — |x|  when  |x|  S 1 and  0 when  |x|  > 1. 
Sketch  /(x)  and  F(x).  In  a lot  of  1000  cans,  about  how 
many  will  contain  100  gallons  or  more?  What  is  the 
probability  that  a can  will  contain  less  than  99.5 
gallons?  Less  than  99  gallons? 

14.  Find  the  probability  function  of  X = Number  of  times 
a fair  die  is  rolled  until  the  first  Six  appears  and  show 
that  it  satisfies  (6). 

15.  Let  X be  a random  variable  that  can  assume  every  real 
value.  What  are  the  complements  of  the  events  X S b, 
X<b,  XSc,  X>c,  bSXSc,  b<X£c? 
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24.6  Mean  and  Variance  of  a Distribution 

2 

The  mean  (jl  and  variance  cr  of  a random  variable  X and  of  its  distribution  are  the  theoretical 
counterparts  of  the  mean  .r  and  variance  s of  a frequency  distribution  in  Sec.  24.1  and 
serve  a similar  purpose.  Indeed,  the  mean  characterizes  the  central  location  and  the  variance 
the  spread  (the  variability)  of  the  distribution.  The  mean  /jl  (mu)  is  defined  by 


(a)  M = 2 xjfixj) 


(1) 


(b)  jx  = 


xf(x)  dx 


and  the  variance  cr2  (sigma  square)  by 


(a)  a2  = 2 (xj  - ixffixj) 


(2) 


(b)  cr2  = 


(x  ~ IX)  f{x)  dx 


(Discrete  distribution) 


(Continuous  distribution) 


(Discrete  distribution) 


(Continuous  distribution). 


cr  (the  positive  square  root  of  cr2)  is  called  the  standard  deviation  of  X and  its  distribution, 
/is  the  probability  function  or  the  density,  respectively,  in  (a)  and  (b). 

The  mean  /j.  is  also  denoted  by  E(X)  and  is  called  the  expectation  of  X because  it  gives 
the  average  value  of  X to  be  expected  in  many  trials.  Quantities  such  as  /jl  and  cr2  that 
measure  certain  properties  of  a distribution  are  called  parameters,  jx  and  cr2  are  the  two 
most  important  ones.  From  (2)  we  see  that 

(3)  o-2  > 0 

(except  for  a discrete  “distribution”  with  only  one  possible  value,  so  that  cr2  = 0).  We 
assume  that  i±  and  cr2  exist  (are  finite),  as  is  the  case  for  practically  all  distributions  that 
are  useful  in  applications. 


EXAMPLE  Mean  and  Variance 

The  random  variable  X = Number  of  heads  in  a single  toss  of  a fair  coin  has  the  possible  values  X = 0 and 
X = 1 with  probabilities  P(X  = 0)  = | and  P(X  = 1)  = From  (la)  we  thus  obtain  the  mean 
/jl  = 0*2  + 1 * 2 = 2 » and  (2a)  yields  the  variance 

cr  — (0  2)  2/  2 ~ 4- 

EXAMPLE  2 Uniform  Distribution.  Variance  Measures  Spread 

The  distribution  with  the  density 


fix)  = 


if 


a < x < b 
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and/  = 0 otherwise  is  called  the  uniform  distribution  on  the  interval  a < x < b.  From  (lb)  (or  from  Theorem  1, 
below)  we  find  that  ju,  = (a  + b)/ 2,  and  (2b)  yields  the  variance 


Fig.  516.  Uniform  distributions  having  the  same  mean  (0.5)  but  different  variances  a2 

Symmetry.  We  can  obtain  the  mean  p without  calculation  if  a distribution  is  symmetric. 
Indeed,  you  may  prove 


THEOREM  1 


Mean  of  a Symmetric  Distribution 

If  a distribution  is  symmetric  with  respect  to  x = c,  that  is,  f(c  — x)  = f(c  + x), 
then  p = c.  (Examples  1 and  2 illustrate  this.) 


Transformation  of  Mean  and  Variance 

Given  a random  variable  X with  mean  p and  variance  cr2,  we  want  to  calculate  the  mean 
and  variance  of  X*  = a\  + a2X,  where  a1  and  a2  are  given  constants.  This  problem  is 
important  in  statistics,  where  it  often  appears. 


THEOREM  2 


Transformation  of  Mean  and  Variance 

(a)  If  a random  variable  X has  mean  fi  and  variance  cr2,  then  the  random 
variable 

(4)  X*  = ay  + a2X  (a2  > 0) 

has  the  mean  pi*  and  variance  a*2,  where 

(5)  pi*  = fly  + a2p  and  cr*2  = flfcr2. 
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PROOF 


(b)  In  particular,  the  standardized  random  variable  Z corresponding  to  X, 
given  by 


(6) 


Z = 


X - p 
u 


has  the  mean  0 and  the  variance  1. 


We  prove  (5)  for  a continuous  distribution.  To  a small  interval  I of  length  Ax  on  the 
x-axis  there  corresponds  the  probability  fix) Ax  [approximately;  the  area  of  a rectangle 
of  base  Ax  and  height  fix)].  Then  the  probability  fix) Ax  must  equal  that  for  the 
corresponding  interval  on  the  x*-axis,  that  is,  /*(x*)Ax*,  where/*  is  the  density  of  X * 
and  Ax*  is  the  length  of  the  interval  on  the  x*-axis  corresponding  to  I.  Hence  for 
differentials  we  have  f*(x*)  dx*  = fix)  dx.  Also,  x*  = a\  + a2x  by  (4),  so  that  (lb) 
applied  to  A*  gives 


p* 


x*/*(x*)  dx* 


(a  i + a2x)fix)  dx 


oc 


r 00 


— 


fix)  dx  + a2 

— CO 


xf{x)  dx. 

— CO 


On  the  right  the  first  integral  equals  1,  by  (10)  in  Sec.  24.5.  The  second  intergral  is  p. 
This  proves  (5)  for  p*.  It  implies 

x*  — p*  = (a  i + 02x)  — (fli  + a2p)  = a2(x  — p). 

From  this  and  (2)  applied  to  X*,  again  using  f*(x*)  dx*  = fix)  dx,  we  obtain  the  second 
formula  in  (5), 


cr 


*2  _ 


CO  CO 

(x*  — p*)2f*(x*)  dx*  = a 2 (x  — pffix)  dx  = a^cr2. 


For  a discrete  distribution  the  proof  of  (5)  is  similar. 

Choosing  a i = —p/cr  and  a2  = 1/cr  we  obtain  (6)  from  (4),  writing  X*  = Z.  For  these 
fli,  a2  formula  (5)  gives  p*  = 0 and  cr*2  = 1,  as  claimed  in  (b). 


Expectation,  Moments 

Recall  that  (1)  defines  the  expectation  (the  mean)  of  X,  the  value  of  X to  be  expected  on 
the  average,  written  p = E(X ).  More  generally,  if  g(x)  is  nonconstant  and  continuous  for 
all  x,  then  g(X)  is  a random  variable.  Hence  its  mathematical  expectation  or,  briefly,  its 
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expectation  E(g(X ))  is  the  value  of  g(X ) to  be  expected  on  the  average,  defined  [similarly 
to  (1)]  by 


(7) 


E(g(X))  = 2 g{Xj)f(Xj)  or  E{gQ Q) 

3 


g(x)fix)  dx. 


In  the  first  formula,  / is  the  probability  function  of  the  discrete  random  variable  X.  In  the 
second  formula,  / is  the  density  of  the  continuous  random  variable  X.  Important  special 
cases  are  the  &th  moment  of  X (where  k = 1,  2,  • • •) 


(8)  E(Xk ) = 2 4f(xj)  or 

o 

and  the  fcth  central  moment  of  X{k  = 1,  2,  • • •) 


xj(x)  dx 


(9)  E([X  - ii]k)  = 2 (xj  ~ V)kf(xj)  or 

3 

This  includes  the  first  moment,  the  mean  of  X 


r 00 

(x  — /Ji)kf(x)  dx. 


(10) 


H = E(X) 


[(8)  with  k = 1], 


It  also  includes  the  second  central  moment,  the  variance  of  X 

(11)  o-2  = E([X  - /if)  [(9)  with  k = 2], 


For  later  use  you  may  prove 

(12)  E{\)  = 1. 


PROBIFM~StTT47S 


MEAN,  VARIANCE 

Find  the  mean  and  variance  of  the  random  variable  X with 
probability  function  or  density  fix). 

1.  f(x)  — kx  (0  S x S 2,  k suitable) 

2.  X = Number  a fair  die  turns  up 

3.  Uniform  distribution  on  [0,  277] 

4.  Y = V3(X  — fi)/ tt  with  X as  in  Prob.  3 

5.  fix)  = 4e~4xixS  0) 

6.  fix)  = k(\  — x2)  if  —1  S x S 1 and  0 otherwise 

7.  fix)  = Ce~x/2  ix  = 0) 

8.  X = Number  of  times  a fair  coin  is  flipped  until  the 
first  Head  appears.  (Calculate  p.  only.) 

9.  If  the  diameter  X [cm]  of  certain  bolts  has  the  density 
fix)  = kix  - 0.9)(1.1  - x)  for  0.9  < jc  < 1.1  and  0 
for  other  x,  what  are  k,  p,  and  cr2?  Sketch  fix). 


10.  If,  in  Prob.  9,  a defective  bolt  is  one  that  deviates  from 
1.00  cm  by  more  than  0.06  cm,  what  percentage  of 
defectives  should  we  expect? 

11.  For  what  choice  of  the  maximum  possible  deviation 
from  1.00  cm  shall  we  obtain  10%  defectives  in  Probs.  9 
and  10? 

12.  What  total  sum  can  you  expect  in  rolling  a fair  die 
20  times?  Do  the  experiment.  Repeat  it  a number  of 
times  and  record  how  the  sum  varies. 

13.  What  is  the  expected  daily  profit  if  a store  sells  X air 
conditioners  per  day  with  probability  /(10)  = 0.1, 
fill)  = 0.3,  /(12)  = 0.4,  /(13)  = 0.2  and  the  profit 
per  conditioner  is  $55? 

14.  Find  the  expectation  of  giX ) = X2,  where  X is  uniformly 
distributed  on  the  interval  — 1 SrS  1 . 
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15.  A small  filling  station  is  supplied  with  gasoline  every 
Saturday  afternoon.  Assume  that  its  volume  X of  sales 
in  ten  thousands  of  gallons  has  the  probability  density 
/( x)  = 6x(l  - x)  if  0 S x S 1 and  0 otherwise. 
Determine  the  mean,  the  variance,  and  the  standardized 
variable. 

16.  What  capacity  must  the  tank  in  Prob.  15  have  in  order 
that  the  probability  that  the  tank  will  be  emptied  in  a 
given  week  be  5%? 

17.  James  rolls  2 fair  dice,  and  Harry  pays  k cents  to  James, 
where  k is  the  product  of  the  two  faces  that  show  on 
the  dice.  How  much  should  James  pay  to  Harry  for 
each  game  to  make  the  game  fair? 

18.  What  is  the  mean  life  of  a lightbulb  whose  life  X [hours] 
has  the  density /(x)  = O.OOle-0'001*  (x  £ 0)? 

19.  Let  X be  discrete  with  probability  function/(0)  =/( 3)  = 
g,  /(l)  = /( 2)  = §.  Find  the  expectation  of  X3. 

20.  TEAM  PROJECT.  Means,  Variances,  Expectations, 
(a)  Show  that  E(X  - p)  = 0,  a2  = E(X2)  - p2. 


(b)  Prove  (10)-(12). 

(c)  Find  all  the  moments  of  the  uniform  distribution 
on  an  interval  a £ x fi  b. 

(d)  The  skewness  y of  a random  variable  X is  defined 
by 

(13)  y = ^£([A-/r]3). 

a 

Show  that  for  a symmetric  distribution  (whose  third 
central  moment  exists)  the  skewness  is  zero. 

(e)  Find  the  skewness  of  the  distribution  with  density 
/(x)  = xe~x  when  x > 0 and  /(x)  = 0 otherwise. 
Sketch /(x). 

(f)  Calculate  the  skewness  of  a few  simple  discrete 
distributions  of  your  own  choice. 

(g)  Find  a nonsymmetric  discrete  distribution  with 
3 possible  values,  mean  0,  and  skewness  0. 


Binomial,  Poisson,  and  Hypergeometric 
Distributions 


These  are  the  three  most  important  discrete  distributions,  with  numerous  applications. 

Binomial  Distribution 

The  binomial  distribution  occurs  in  games  of  chance  (rolling  a die,  see  below,  etc.), 
quality  inspection  (e.g.,  counting  of  the  number  of  defectives),  opinion  polls  (counting 
number  of  employees  favoring  certain  schedule  changes,  etc.),  medicine  (e.g.,  recording 
the  number  of  patients  who  recovered  on  a new  medication),  and  so  on.  The  conditions 
of  its  occurrence  are  as  follows. 

We  are  interested  in  the  number  of  times  an  event  A occurs  in  n independent  trials.  In 
each  trial  the  event  A has  the  same  probability  P(A)  = p.  Then  in  a trial,  A will  not  occur 
with  probability  q = 1 — p.  In  n trials  the  random  variable  that  interests  us  is 

X = Number  of  times  the  event  A occurs  in  n trials. 

X can  assume  the  values  0,  1,  ■■■  ,n,  and  we  want  to  determine  the  corresponding 
probabilities.  Now  X = x means  that  A occurs  in  x trials  and  in  n — x trials  it  does  not 
occur.  This  may  look  as  follows. 

(1)  A A • • ■ A B B -B. 

x times  n — x times 

Here  B = Ac  is  the  complement  of  A,  meaning  that  A does  not  occur  (Sec.  24.2).  We  now 
use  the  assumption  that  the  trials  are  independent,  that  is,  they  do  not  influence  each  other. 
Hence  (1)  has  the  probability  (see  Sec.  24.3  on  independent  events) 
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(1*)  pp  ■ ■ ■ p • qq---q  = pxqn  x. 

x times  n — x times 

Now  (1)  is  just  one  order  of  arranging  x A’s  and  n — x B’s.  We  now  use  Theorem  1(b) 
in  Sec.  24.4,  which  gives  the  number  of  permutations  of  n things  (the  n outcomes  of  the 
n trials)  consisting  of  2 classes,  class  1 containing  the  n1  = x A’s  and  class  2 containing 
the  n — rii  = n — x B’s.  This  number  is 

n\  —(n\ 
xl(n  — x)\  \xj 

Accordingly,  (1*),  multiplied  by  this  binomial  coefficient,  gives  the  probability  P(X  = x) 
of  X = x,  that  is,  of  obtaining  A precisely  x times  in  n trials.  Hence  X has  the  probability 
function 

(2)  m=(n)pxqn-x  (x  = 0,1,  -■-,«) 

and  f{x)  = 0 otherwise.  The  distribution  of  X with  probability  function  (2)  is  called  the 
binomial  distribution  or  Bernoulli  distribution.  The  occurrence  of  A is  called  success 
(regardless  of  what  it  actually  is;  it  may  mean  that  you  miss  your  plane  or  lose  your  watch) 
and  the  nonoccurrence  of  A is  called  failure.  Figure  517  shows  typical  examples.  Numeric 
values  can  be  obtained  from  Table  A5  in  App.  5 or  from  your  CAS. 

The  mean  of  the  binomial  distribution  is  (see  Team  Project  16) 

(3)  p.  = np 

and  the  variance  is  (see  Team  Project  16) 

(4)  cr2  = npq. 

For  the  symmetric  case  of  equal  chance  of  success  and  failure  (p  = q = |)  this  gives  the 
mean  n/ 2,  the  variance  n/A,  and  the  probability  function 

(2*)  f{x)  = (")(!)  (jc  = 0,  !,■■•,«). 


0.5  - 


o, 


o 5 

p = 0.1 


o 5 

p = 0.2 


0 5 

p = 0.5 


0 5 

p = 0.8 


0 5 

p = 0.9 


Fig.  517.  Probability  function  (2)  of  the  binomial  distribution  for  n = 5 and  various  values  of  p 
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EXAMPLE  1 


EXAMPLE  2 


Binomial  Distribution 

Compute  the  probability  of  obtaining  at  least  two  “Six"  in  rolling  a fair  die  4 times. 

Solution,  p = P(A)  = PC' Six")  = q = §,  n = 4.  The  event  “At  least  two  ‘Six’"  occurs  if  we  obtain  2 or 
3 or  4 “Six.”  Hence  the  answer  is 


P=/(2)+/(3)+/(4)  = 


1 171 

= — (6  • 25  + 4 • 5 + 1)  = 

64  1296 


13.2%. 


Poisson  Distribution 

The  discrete  distribution  with  infinitely  many  possible  values  and  probability  function 

(5)  f(x)  = ^-e-»  (jc  = 0,  1,  ■ ■ ■ ) 

jc! 

is  called  the  Poisson  distribution,  named  after  S.  D.  Poisson  (Sec.  18.5).  Figure  518 
shows  (5)  for  some  values  of  p.  It  can  be  proved  that  this  distribution  is  obtained  as  a 
limiting  case  of  the  binomial  distribution,  if  we  let  p—> 0 and  n —*  00  so  that  the  mean 
p = np  approaches  a finite  value.  (For  instance,  p = np  may  be  kept  constant.)  The 
Poisson  distribution  has  the  mean  p and  the  variance  (see  Team  Project  16) 

(6)  o-2  = p. 

Figure  518  gives  the  impression  that,  with  increasing  mean,  the  spread  of  the  distribution 
increases,  thereby  illustrating  formula  (6),  and  that  the  distribution  becomes  more  and 
more  (approximately)  symmetric. 


p = 0.5  p=l  p=2  p = 5 

Fig.  518.  Probability  function  (5)  of  the  Poisson  distribution  for  various  values  of  p 

Poisson  Distribution 

If  the  probability  of  producing  a defective  screw  is  p = 0.01,  what  is  the  probability  that  a lot  of  100  screws 
will  contain  more  than  2 defectives? 

Solution.  The  complementary  event  is  Ac:  Not  more  than  2 defectives.  For  its  probability  we  get,  from  the 
binomial  distribution  with  mean  /i  np  1 , the  value  [see  (2)] 

P(AC)  = 0.99100  + 0.01  • 0.99"  + 0.012  • 0.9998. 
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EXAMPLE  3 


Since  p is  very  small,  we  can  approximate  this  by  the  much  more  convenient  Poisson  distribution  with  mean 
p = np  = 100  • 0.01  = 1,  obtaining  [see  (5)] 


P(AC)  » e-1  (1  + 1 + |) 

= 91.97%. 

Thus  P(A)  = 8.03%.  Show  that  the  binomial  distribution  gives  P{A ) = 7.94%,  so  that  the  Poisson  approximation 
is  quite  good. 

Parking  Problems.  Poisson  Distribution 

If  on  the  average,  2 cars  enter  a certain  parking  lot  per  minute,  what  is  the  probability  that  during  any  given 
minute  4 or  more  cars  will  enter  the  lot? 

Solution.  To  understand  that  the  Poisson  distribution  is  a model  of  the  situation,  we  imagine  the  minute  to 
be  divided  into  very  many  short  time  intervals,  let  p be  the  (constant)  probability  that  a car  will  enter  the  lot 
during  any  such  short  interval,  and  assume  independence  of  the  events  that  happen  during  those  intervals.  Then 
we  are  dealing  with  a binomial  distribution  with  very  large  n and  very  small  p,  which  we  can  approximate  by 
the  Poisson  distribution  with 


p = np  = 2, 


because  2 cars  enter  on  the  average.  The  complementary  event  of  the  event  “4  cars  or  more  during  a given 
minute”  is  “3  cars  or  fewer  enter  the  lot ” and  has  the  probability 


2i  2^  2^ 

/(0)  + /( 1)  + /( 2)  + /( 3)  = e-z  I — — I 1 H — 

\0!  1!  2!  3! 


= 0.857. 


Answer:  14.3%.  (Why  did  we  consider  that  complement?) 


Sampling  with  Replacement 

This  means  that  we  draw  things  from  a given  set  one  by  one,  and  after  each  trial  we 
replace  the  thing  drawn  (put  it  back  to  the  given  set  and  mix)  before  we  draw  the  next 
thing.  This  guarantees  independence  of  trials  and  leads  to  the  binomial  distribution. 
Indeed,  if  a box  contains  N things,  for  example,  screws,  M of  which  are  defective,  the 
probability  of  drawing  a defective  screw  in  a trial  is  p = M/N.  Hence  the  probability  of 
drawing  a nondefective  screw  is  <7  = 1 — p = l — M/N,  and  (2)  gives  the  probability  of 
drawing  x defectives  in  n trials  in  the  form 


(7) 


(x  = 0,  1,  • • • , n). 


Sampling  without  Replacement. 

Hypergeometric  Distribution 

Sampling  without  replacement  means  that  we  return  no  screw  to  the  box.  Then  we  no 
longer  have  independence  of  trials  (why?),  and  instead  of  (7)  the  probability  of  drawing 
x defectives  in  n trials  is 
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EXAMPLE  4 


(8) 


(x  = 0,  1,  • • • , n). 


The  distribution  with  this  probability  function  is  called  the  hypergeometric  distribution 
(because  its  moment  generating  function  (see  Team  Project  16)  can  be  expressed  by  the 
hypergeometric  function  defined  in  Sec.  5.4,  a fact  that  we  shall  not  use). 


Derivation  of  (8).  By  (4a)  in  Sec.  24.4  there  are 


(a) 

(b) 

(c) 


different  ways  of  picking  n things  from  N, 
different  ways  of  picking  x defectives  from  M, 

different  ways  of  picking  n — x nondefectives  from  N — M, 


n 
M 

x 

N - M 
n — x 


and  each  way  in  (b)  combined  with  each  way  in  (c)  gives  the  total  number  of  mutually 
exclusive  ways  of  obtaining  x defectives  in  n drawings  without  replacement.  Since  (a)  is 
the  total  number  of  outcomes  and  we  draw  at  random,  each  such  way  has  the  probability 


From  this,  (8)  follows. 


The  hypergeometric  distribution  has  the  mean  (Team  Project  16) 


(9) 


fi  = n 


M 

N 


and  the  variance 


(10) 


2 nM(N  - M)(N  - n) 

a = 9 

N2(N  - 1) 


Sampling  with  and  without  Replacement 

We  want  to  draw  random  samples  of  two  gaskets  from  a box  containing  10  gaskets,  three  of  which  are  defective. 
Find  the  probability  function  of  the  random  variable  X = Number  of  defectives  in  the  sample. 

Solution.  We  have  N = 10,  M = 3,  iV  — M = 7,  n = 2.  For  sampling  with  replacement,  (7)  yields 
m = Q (-jQ  , m = 0.49,  /( 1)  = 0.42,  /(2)  = 0.09. 


For  sampling  without  replacement  we  have  to  use  (8),  finding 
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IfN,  M,  and  N — M are  large  compared  with  n,  then  it  does  not  matter  too  much  whether 
we  sample  with  or  without  replacement,  and  in  this  case  the  hypergeometric  distribution 
may  be  approximated  by  the  binomial  distribution  (with  p = M/N),  which  is  somewhat 
simpler. 

Hence,  in  sampling  from  an  indefinitely  large  population  (“infinite  population”),  we 
may  use  the  binomial  distribution,  regardless  of  whether  we  sample  with  or  without 
replacement. 


PRO  B L EM  S FT14  7 


1.  Mark  the  positions  of  p in  Fig.  517.  Comment. 

2.  Graph  (2)  for  n = 8 as  in  Fig.  517  and  compare  with 
Fig.  517. 

3.  In  Example  3,  if  5 cars  enter  the  lot  on  the  average, 
what  is  the  probability  that  during  any  given  minute  6 
or  more  cars  will  enter?  First  guess.  Compare  with 
Example  3. 

4.  How  do  the  probabilities  in  Example  4 of  the  text 
change  if  you  double  the  numbers:  drawing  4 gaskets 
from  20,  6 of  which  are  defective?  First  guess. 

5.  Five  fair  coins  are  tossed  simultaneously.  Find  the 
probability  function  of  the  random  variable  X = Number 
of  heads  and  compute  the  probabilities  of  obtaining  no 
heads,  precisely  1 head,  at  least  1 head,  not  more  than 
4 heads. 

6.  Suppose  that  4%  of  steel  rods  made  by  a machine  are 
defective,  the  defectives  occurring  at  random  during 
production.  If  the  rods  are  packaged  100  per  box,  what 
is  the  Poisson  approximation  of  the  probability  that  a 
given  box  will  contain  x = 0,  1,  ■ ■ ■ , 5 defectives? 

7.  Let  X be  the  number  of  cars  per  minute  passing  a certain 
point  of  some  road  between  8 a.m.  and  10  a.m.  on  a 
Sunday.  Assume  that  X has  a Poisson  distribution  with 
mean  5.  Find  the  probability  of  observing  4 or  fewer 
cars  during  any  given  minute. 

8.  Suppose  that  a telephone  switchboard  of  some 
company  on  the  average  handles  300  calls  per  hour, 
and  that  the  board  can  make  at  most  10  connections 
per  minute.  Using  the  Poisson  distribution,  estimate  the 
probability  that  the  board  will  be  overtaxed  during  a 
given  minute.  (Use  Table  A6  in  App.  5 or  your  CAS.) 

9.  Rutherford-Geiger  experiments.  In  1910,  E. 
Rutherford  and  H.  Geiger  showed  experimentally  that 
the  number  of  alpha  particles  emitted  per  second  in  a 
radioactive  process  is  a random  variable  X having  a 
Poisson  distribution.  If  X has  mean  0.5,  what  is  the 
probability  of  observing  two  or  more  particles  during 
any  given  second? 

10.  Let  p = 2%  be  the  probability  that  a certain  type  of 
lightbulb  will  fail  in  a 24-hour  test.  Find  the  probability 


that  a sign  consisting  of  15  such  bulbs  will  bum  24 
hours  with  no  bulb  failures. 

11.  Guess  how  much  less  the  probability  in  Prob.  10  would 
be  if  the  sign  consisted  of  100  bulbs.  Then  calculate. 

12.  Suppose  that  a certain  type  of  magnetic  tape  contains, 
on  the  average,  2 defects  per  100  meters.  What  is  the 
probability  that  a roll  of  tape  300  meters  long  will 
contain  (a)  x defects,  (b)  no  defects? 

13.  Suppose  that  a test  for  extrasensory  perception  consists 
of  naming  (in  any  order)  3 cards  randomly  drawn  from 
a deck  of  13  cards.  Find  the  probability  that  by  chance 
alone,  the  person  will  correctly  name  (a)  no  cards,  (b)  1 
card,  (c)  2 cards,  (d)  3 cards. 

14.  If  a ticket  office  can  serve  at  most  4 customers  per 
minute  and  the  average  number  of  customers  is  120  per 
hour,  what  is  the  probability  that  during  a given  minute 
customers  will  have  to  wait?  (Use  the  Poisson 
distribution,  Table  6 in  Appendix  5.) 

15.  Suppose  that  in  the  production  of  60-ohm  radio 
resistors,  nondefective  items  are  those  that  have  a 
resistance  between  58  and  62  ohms  and  the  probability 
of  a resistor’s  being  defective  is  0.1%.  The  resistors 
are  sold  in  lots  of  200,  with  the  guarantee  that  all 
resistors  are  nondefective.  What  is  the  probability  that 
a given  lot  will  violate  this  guarantee?  (Use  the  Poisson 
distribution.) 

16.  TEAM  PROJECT.  Moment  Generating  Function. 

The  moment  generating  function  G(t)  is  defined  by 

G(J)  = E{etxf  = 2 etXif(xj) 

3 

or 

G{t)  = E{etx)  = | etxf(x)  dx 

where  A is  a discrete  or  continuous  random  variable, 
respectively. 

(a)  Assuming  that  termwise  differentiation  and  differ- 
entiation under  the  integral  sign  are  permissible,  show 
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that  E(Xk)  = Gcfc)( 0),  where  Gik)  = dkG/dtk,  in 
particular,  /x  = G/O). 

(b)  Show  that  the  binomial  distribution  has  the 
moment  generating  function 


g(o  = 2 et: 

x=0 


x n—x 

p q 


= 2 

x=0 

= (pe*  + q)n. 


(p/fq71-* 


(c)  Using  (b),  prove  (3). 

(d)  Prove  (4). 

(e)  Show  that  the  Poisson  distribution  has  the  moment 
generating  function  G(t)  = e~>Le^e  and  prove  (6). 


(f ) Prove  x 


= M 


M - 1 
x — 1 


Using  this,  prove  (9). 

17.  Multinomial  distribution.  Suppose  a trial  can  result 
in  precisely  one  of  k mutually  exclusive  events 


Ai,  ■ ■ ■ , A;,  with  probabilities  pi,  ■ ■ • , p k.  respectively, 
where  p\  + ■ ■ ■ + pk  = 1.  Suppose  that  n independent 
trials  are  performed.  Show  that  the  probability  of 
getting  x^Afs,  ■ ■ ■ , xk  Ak’s  is 

fix  i,  ■ ■ • , xk)  = — - pi1  ■■■Pkk 

x\---xk\ 

where  0 S Xj  S n,  j = 1,  ■ ■ • , k,  and  x\  + ■ ■ ■ + 
xk  = n.  The  distribution  having  this  probability 
function  is  called  the  multinomial  distribution. 

18.  A process  of  manufacturing  screws  is  checked  every 
hour  by  inspecting  n screws  selected  at  random  from 
that  hour’s  production.  If  one  or  more  screws  are 
defective,  the  process  is  halted  and  carefully  examined. 
How  large  should  n be  if  the  manufacturer  wants  the 
probability  to  be  about  95%  that  the  process  will  be 
halted  when  10%  of  the  screws  being  produced  are 
defective?  (Assume  independence  of  the  quality  of  any 
screw  from  that  of  the  other  screws.) 


24.8  Normal  Distribution 


Turning  from  discrete  to  continuous  distributions,  in  this  section  we  discuss  the  normal 
distribution.  This  is  the  most  important  continuous  distribution  because  in  applications  many 
random  variables  are  normal  random  variables  (that  is,  they  have  a normal  distribution) 
or  they  are  approximately  normal  or  can  be  transformed  into  normal  random  variables  in  a 
relatively  simple  fashion.  Furthermore,  the  normal  distribution  is  a useful  approximation  of 
more  complicated  distributions,  and  it  also  occurs  in  the  proofs  of  various  statistical  tests. 


The  normal  distribution  or  Gauss  distribution  is  defined  as  the  distribution  with  the 
density 


(1) 


fix ) 


1 

crV277 


exp 


1 ( x — n 

2 \ cr 


(cr  > 0) 


where  exp  is  the  exponential  function  with  base  e = 2.718  ■ ■ • . This  is  simpler  than  it  may 
at  first  look. /(x)  has  these  features  (see  also  Fig.  519). 

1.  pi  is  the  mean  and  cr  the  standard  deviation. 

2.  1 /(cr  V27t)  is  a constant  factor  that  makes  the  area  under  the  curve  of/(x)  from  — °° 
to  oo  equal  to  1,  as  it  must  be  by  (10),  Sec.  24.5. 

3.  The  curve  of  f(x ) is  symmetric  with  respect  to  x = pt  because  the  exponent  is 
quadratic.  Hence  for  pt  = 0 it  is  symmetric  with  respect  to  the  y-axisx  = 0 (Fig.  519, 
“bell-shaped  curves”). 

4.  The  exponential  function  in  (1)  goes  to  zero  very  fast — the  faster  the  smaller  the 
standard  deviation  cr  is,  as  it  should  be  (Fig.  519). 
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Fig.  519.  Density  (1)  of  the  normal  distribution  with  /x  = 0 for  various  values  of  a 


Distribution  Function  F(x) 

From  (7)  in  Sec.  24.5  and  (1)  we  see  that  the  normal  distribution  has  the  distribution 
function 


(2) 


F(x) 


1 

crV27T 


dv. 


Here  we  needed  x as  the  upper  limit  of  integration  and  wrote  v (instead  of  x)  in  the  integrand. 

For  the  corresponding  standardized  normal  distribution  with  mean  0 and  standard 
deviation  1 we  denote  F(x)  by  d>(z).  Then  we  simply  have  from  (2) 


(3) 


4>(z) 


1 

V277 


e~u2/2du. 


This  integral  cannot  be  integrated  by  one  of  the  methods  of  calculus.  But  this  is  no  serious 
handicap  because  its  values  can  be  obtained  from  Table  A7  in  App.  5 or  from  your  CAS. 
These  values  are  needed  in  working  with  the  normal  distribution.  The  curve  of  'b(z)  is 
.S'-shaped.  It  increases  monotone  (why?)  from  0 to  1 and  intersects  the  vertical  axis  at  \ 
(why?),  as  shown  in  Fig.  520. 

Relation  Between  F(x)  and  <I>(z).  Although  your  CAS  will  give  you  values  of  F(x)  in 
(2)  with  any  /x  and  cr  directly,  it  is  important  to  comprehend  that  and  why  any  such  an 
Fix)  can  be  expressed  in  terms  of  the  tabulated  standard  <f>(z),  as  follows. 


1.0 

4>(x) 

0.8 

0.6 

0,4 

0.2 

i 

1 1 1 

-3  -2  -1  0 

1 2 3 x 

Fig.  520.  Distribution  function  <F(z)  of  the  normal  distribution  with  mean  0 and  variance  1 
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THEOREM  1 


PROOF 


THEOREM  2 


PROOF 


Use  of  the  Normal  Table  A7  in  App.  5 

The  distribution  function  F(x ) of  the  normal  distribution  with  any  /jl  and  cr  [see  (2)] 
is  related  to  the  standardized  distribution  function  <J>(z)  in  (3)  by  the  formula 


(4) 


Fix)  = <FI 


Comparing  (2)  and  (3)  we  see  that  we  should  set 

v ~ ix  x — ix 

u = . Then  v = x gives  u = 

cr  er- 

as the  new  upper  limit  of  integration.  Also  u — /x  = cru,  thus  dv  = c r du.  Together,  since 
cr  drops  out, 


Fix) 


1 

crV27T 


r(x-/x)/a 


"2/  2 (T  du  = d> 


Probabilities  corresponding  to  intervals  will  be  needed  quite  frequently  in  statistics  in 
Chap.  25.  These  are  obtained  as  follows. 


Normal  Probabilities  for  Intervals 

The  probability  that  a normal  random  variable  X with  mean  /x  and  standard 
deviation  cr  assume  any  value  in  an  interval  a < x ts  b is 

fb  — ix\  ( a — ix 

(5)  P(a  < I = i)  = F(b)  - F(a)  = ^ 1 - 4>( 


Formula  (2)  in  Sec.  24.5  gives  the  first  equality  in  (5),  and  (4)  in  this  section  gives  the 
second  equality. 


Numeric  Values 

In  practical  work  with  the  normal  distribution  it  is  good  to  remember  that  about  § of  all  values 
of  X to  be  observed  will  lie  between  j±  ± <r.  about  95  % between  /x  ± 2 cr,  and  practically  all 
between  the  three-sigma  limits  jx  ± 3cr.  More  precisely,  by  Table  A7  in  App.  5, 


(a) 

PQx 

(6) 

(b) 

Piix 

(c) 

Pip 

— cr<A^jU,  + cr)~68% 

— 2cr  < X g p + 2 cr)  ~ 95.5% 

— 3o-  < Z g pt  + 3a)  = 99.7%. 


Formulas  (6a)  and  (6b)  are  illustrated  in  Fig.  521. 
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EXAMPLE  1 


EXAMPLE  2 


The  formulas  in  (6)  show  that  a value  deviating  from  j±  by  more  than  cr,  2 cr,  or  3cr  will 
occur  in  one  of  about  3,  20,  and  300  trials,  respectively. 


95.5% 

/°\ 

2.25%  / \ 2.25% 

Ai_ ± 

li-2a  n fi  + 2a 

(b) 


Fig.  521.  Illustration  of  formula  (6) 


In  tests  (Chap.  25)  we  shall  ask,  conversely,  for  the  intervals  that  correspond  to  certain 
given  probabilities;  practically  most  important  are  the  probabilities  of  95%,  99%,  and 
99.9%.  For  these,  Table  A8  in  App.  5 gives  the  answers  p,  ± 2cr,  /r  ± 2.6<x,  and 
jji  ± 3.3cr,  respectively.  More  precisely, 


(7) 


(a)  P(fju  — 1.96cr  < X ^ /i  + 1.96cr)  = 95% 

(b)  P((jl  - 2.58(7  <Ig/i  + 2.58(7)  = 99% 

(c)  P(ji  - 3.29(7  <IS/if  3.29(7)  = 99.9% 


Working  with  the  Normal  Tables  A7  and  A8  in  App.  5 

There  are  two  normal  tables  in  App.  5,  Tables  A7  and  A8.  If  you  want  probabilities,  use 
Table  A7.  If  probabilities  are  given  and  corresponding  intervals  or  x- values  are  wanted, 
use  Table  A8.  The  following  examples  are  typical.  Do  them  with  care,  verifying  all  values, 
and  don’t  just  regard  them  as  dull  exercises  for  your  software.  Make  sketches  of  the  density 
to  see  whether  the  results  look  reasonable. 

Reading  Entries  from  Table  A7 

If  X is  standardized  normal  (so  that  /jl  = 0,  <r  = 1),  then 
P(X  £ 2.44)  = 0.9927  = 99j  % 

P(X£  -1.16)  = 1 - *>(1.16)  = 1 - 0.8770  = 0.1230  = 12.3% 

P(X  a 1)  = 1 - P(X  S 1)  = 1 - 0.8413  = 0.1587)  by  (7),  Sec.  24.3 
P(1.0  £ X £ 1.8)  = *>(1.8)  - <*>(1.0)  = 0.9641  - 0.8413  = 0.1228. 

Probabilities  for  Given  Intervals,  Table  A7 

Let  X be  normal  with  mean  0.8  and  variance  4 (so  that  <r  = 2).  Then  by  (4)  and  (5) 

( 2.44  - 0.80  \ 

P(IS  2.44)  = F(2.44)  = 4>l J = <*>(0.82)  = 0.7939  = 80% 


or,  if  you  like  it  better,  (similarly  in  the  other  cases) 

(X  - 0.80  2.44  - 0.80 


P(X  S 2.44)  = P 


P(X  s 1)  = 1 - P(X  s 1)  = 1 - 0>l 


1 - 0.1 
2 


= P(Z  S 0.82)  = 0.7939 
j = 1 - 0.5398  = 0.4602 


P(1.0SI£  1.8)  = <*>(0.5)  - 4>(0.1)  = 0.6915  - 0.5398  = 0.1517. 
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EXAMPLE  3 


EXAMPLE  4 


Unknown  Values  c for  Given  Probabilities,  Table  A8 

Let  X be  normal  with  mean  5 and  variance  0.04  (hence  standard  deviation  0.2).  Find  c or  k corresponding  to 
the  given  probability 


P(X  S c)  = 95%, 


c — 5 

= 1.645,  c = 5.329 

0.2 


P(5  - k £ X £ 5 + k)  = 90%,  5 + k = 5.329  (as  before;  why?) 


P(X  a c)  = 1 %,  thus  P(X  S c)  = 99%, 


c - 5 
0.2 


2.326, 


c = 5.465. 


Defectives 

In  a production  of  iron  rods  let  the  diameter  X be  normally  distributed  with  mean  2 in.  and  standard  deviation 
0.008  in. 

(a)  What  percentage  of  defectives  can  we  expect  if  we  set  the  tolerance  limits  at  2 ± 0.02  in.? 

(b)  How  should  we  set  the  tolerance  limits  to  allow  for  4%  defectives? 

Solution,  (a)  1 1 % because  from  (5)  and  Table  A7  we  obtain  for  the  complementary  event  the  probability 


P(1.98  & XS  2.02) 


/ 2.02  — 2.00  \ / 1.98 -2.00 

V 0.008  ) V 0.008 

4>(2.5)  - <T>(— 2.5) 

0.9938  - (1  - 0.9938) 

0.9876 


= 98|%. 


(b)  2 ± 0.0164  because,  for  the  complementary  event,  we  have 

0.96  = P(2  - c S X £ 2 + c) 


or 


0.98  = P(IS2  + c) 


so  that  Table  A8  gives 


0.98  = <!> 


2 + c - 2 
0.008 
2 + c - 2 
0.008 


= 2.054,  c = 0.0164. 


Normal  Approximation  of  the  Binomial  Distribution 

The  probability  function  of  the  binomial  distribution  is  (Sec.  24.7) 

(8)  fix)  = (^X^jpxqn~x  (x  = 0,  1,  • • • , n). 

If  n is  large,  the  binomial  coefficients  and  powers  become  very  inconvenient.  It  is  of  great 
practical  (and  theoretical)  importance  that,  in  this  case,  the  normal  distribution  provides 
a good  approximation  of  the  binomial  distribution,  according  to  the  following  theorem, 
one  of  the  most  important  theorems  in  all  probability  theory. 
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Limit  Theorem  of  De  Moivre  and  Laplace 

For  large  n, 

(9)  ) (*  = o,i, 

Here  f is  given  by  (8).  The  function 


(10) 


f*(x)  = 


V27 T'Vnpq 


-z1 2 3 4 5 6 7/2 


Z = 


x — np 
\fnpq 


is  the  density  of  the  normal  distribution  with  mean  p = np  and  variance  a2  = npq 
(the  mean  and  variance  of  the  binomial  distribution).  The  symbol  ~ (read 
asymptotically  equal)  means  that  the  ratio  of  both  sides  approaches  l as  n approaches 

oo.  Furthermore,  for  any  nonnegative  integers  a and  b (>  a). 


(11) 


P(a  = I = 6)  = ^ 


a — np  — 0.5 


\/npq 


,pxqn-x 


P = 


~ <J>(/3)  - <J>(a), 

b — np  + 0.5 
\/npq 


A proof  of  this  theorem  can  be  found  in  [G3]  listed  in  App.  1 . The  proof  shows  that  the  term 
0.5  in  a and  (3  is  a correction  caused  by  the  change  from  a discrete  to  a continuous  distribution. 


^ff^B^W=S:ET~-2-4=8 


1.  Let  X be  normal  with  mean  10  and  variance  4.  Find 
P(X  > 12),  P(X  < 10),  P(X  < 11),  P(9  < X < 13). 

2.  Let  X be  normal  with  mean  105  and  variance  25.  Find 
P(  XS  112.5),  P(x  > 100),  T(110.5  <X<  111.25). 

3.  Let  X be  normal  with  mean  50  and  variance  9. 
Determine  c such  that  P(X  < c)  = 5%,  P(X  > c)  = 
1%,  P(  50  — c<A'<50  + c)  = 50%. 

4.  Let  X be  normal  with  mean  3.6  and  variance  0.01.  Find 
c such  that  P(X  S c)  = 50%,  P(X  > c)  = 10%, 
P(-c  < X - 3.6  S c)  = 99.9%. 

5.  If  the  lifetime  X of  a certain  kind  of  automobile  battery 
is  normally  distributed  with  a mean  of  5 years  and  a 
standard  deviation  of  1 year,  and  the  manufacturer  wishes 
to  guarantee  the  battery  for  4 years,  what  percentage  of 
the  batteries  will  he  have  to  replace  under  the  guarantee? 

6.  If  the  standard  deviation  in  Prob.  5 were  smaller,  would 
that  percentage  be  larger  or  smaller? 

7.  A manufacturer  knows  from  experience  that  the 
resistance  of  resistors  he  produces  is  normal  with  mean 


p = 150  Tt  and  standard  deviation  a = 5 fi.  What 
percentage  of  the  resistors  will  have  resistance  between 
148  D and  152  D?  Between  140  fl  and  160  fi? 

8.  The  breaking  strength  X [kg]  of  a certain  type  of  plastic 
block  is  normally  distributed  with  a mean  of  1500  kg 
and  a standard  deviation  of  50  kg.  What  is  the  maximum 
load  such  that  we  can  expect  no  more  than  5%  of  the 
blocks  to  break? 

9.  If  the  mathematics  scores  of  the  SAT  college  entrance 
exams  are  normal  with  mean  480  and  standard  deviation 
100  (these  are  about  the  actual  values  over  the  past 
years)  and  if  some  college  sets  500  as  the  minimum 
score  for  new  students,  what  percent  of  students  would 
not  reach  that  score? 

10.  A producer  sells  electric  bulbs  in  cartons  of  1000  bulbs. 
Using  (11),  find  the  probability  that  any  given  carton 
contains  not  more  than  1 % defective  bulbs,  assuming 
the  production  process  to  be  a Bernoulli  experiment 
with  p = 1%(=  probability  that  any  given  bulb  will  be 
defective).  First  guess.  Then  calculate. 
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11.  If  sick-leave  time  X used  by  employees  of  a company 
in  one  month  is  (very  roughly)  normal  with  mean  1000 
hours  and  standard  deviation  100  hours,  how  much 
time  t should  be  budgeted  for  sick  leave  during  the  next 
month  if  t is  to  be  exceeded  with  probability  of  only 
20  %? 

12.  If  the  monthly  machine  repair  and  maintenance  cost  X 
in  a certain  factory  is  known  to  be  normal  with  mean 
$12,000  and  standard  deviation  $2000,  what  is  the 
probability  that  the  repair  cost  for  the  next  month  will 
exceed  the  budgeted  amount  of  $15,000? 

13.  If  the  resistance  X of  certain  wires  in  an  electrical 
network  is  normal  with  mean  0.01  12  and  standard 
deviation  0.001  12,  how  many  of  1000  wires  will  meet 
the  specification  that  they  have  resistance  between 
0.009  and  0.011  12? 

14.  TEAM  PROJECT.  Normal  Distribution,  (a)  Derive 
the  formulas  in  (6)  and  (7)  from  the  appropriate  normal 
table. 

(b)  Show  that  <F(— z)  = 1 — <F(z).  Give  an  example. 

(c)  Find  the  points  of  inflection  of  the  curve  of  (1). 

(d)  Considering  cf>2( oo)  and  introducing  polar  coordi- 
nates in  the  double  integral  (a  standard  trick  worth 
remembering),  prove 


1 f ^ 2 

(12)  4>(oo)  = e~u2/2du  = 1. 

V2tt  )_x 

(e)  Show  that  cr  in  ( 1)  is  indeed  the  standard  deviation 
of  the  normal  distribution.  [Use  (12).] 

(f ) Bernoulli’s  law  of  large  numbers.  In  an  experiment 
let  an  event  A have  probability  p (0  < p < 1),  and  let  X 
be  the  number  of  times  A happens  in  n independent  trials. 
Show  that  for  any  given  e > 0, 


(g)  Transformation.  If  X is  normal  with  mean  /j.  and 
variance  cr2,  show  that  A*  = CiX  + c2  (ci  > 0)  is 

normal  with  mean  pi*  = cy/x  + c2  and  variance 

*2  2 2 
c t * = cfcr  . 

15.  WRITING  PROJECT.  Use  of  Tables,  Use  of  CAS. 

Give  a systematic  discussion  of  the  use  of  Tables  A7  and 
A8  for  obtaining  P(X  < b),  P(X  > a),  P(a  < X < b ), 
P(X  < c)  = k,  P(X  > c)  = k,  as  well  as  P(/x  — c < 
X < pi  + c)  = k\  include  simple  examples.  If  you  have 
a CAS,  describe  to  what  extent  it  makes  the  use  of  those 
tables  superfluous;  give  examples. 


24. $ Distributions  of  Several  Random  Variables 

Distributions  of  two  or  more  random  variables  are  of  interest  for  two  reasons: 

1.  They  occur  in  experiments  in  which  we  observe  several  random  variables,  for 
example,  carbon  content  X and  hardness  Y of  steel,  amount  of  fertilizer  X and  yield  of 
corn  Y,  height  Ai,  weight  A2,  and  blood  pressure  A3  of  persons,  and  so  on. 

2.  They  will  be  needed  in  the  mathematical  justification  of  the  methods  of  statistics  in 
Chap.  25. 

In  this  section  we  consider  two  random  variables  X and  Y or,  as  we  also  say,  a two- 
dimensional  random  variable  (X,  Y).  For  (A,  Y)  the  outcome  of  a trial  is  a pair  of  numbers 
A = x,  Y = y,  briefly  (A,  Y)  = (x,  y),  which  we  can  plot  as  a point  in  the  AT-plane. 

The  two-dimensional  probability  distribution  of  the  random  variable  (A,  Y)  is  given 
by  the  distribution  function 

(1)  F(x,  y ) = PiX  g x,  Y g y). 


This  is  the  probability  that  in  a trial,  A will  assume  any  value  not  greater  than  x and  in 
the  same  trial,  Y will  assume  any  value  not  greater  than  y.  This  corresponds  to  the  blue 
region  in  Fig.  522,  which  extends  to  — °°  to  the  left  and  below.  F(x,y ) determines  the 
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Y 

(x,y) 

X 

Fig.  522.  Formula  (1) 


probability  distribution  uniquely,  because  in  analogy  to  formula  (2)  in  Sec.  24.5,  that  is, 
P(a  < X Si  b)  = F(b)  — F(a),  we  now  have  for  a rectangle  (see  Prob.  16) 

(2)  />(«!  < X ^ bi,  a2<Y^b2)  = F(Jb  i,  b2 ) - F(a1;  b2)  - F{bx,  a2)  + F(ax,  a2). 

As  before,  in  the  two-dimensional  case  we  shall  also  have  discrete  and  continuous 
random  variables  and  distributions. 


Discrete  Two-Dimensional  Distributions 

In  analogy  to  the  case  of  a single  random  variable  (Sec.  24.5),  we  call  ( X , Y)  and  its 
distribution  discrete  if  ( X , Y ) can  assume  only  finitely  many  or  at  most  countably  infinitely 
many  pairs  of  values  (x1;  y i ) , (x2,  y2),  ■ ■ ■ with  positive  probabilities,  whereas  the  probability 
for  any  domain  containing  none  of  those  values  of  (X,  Y)  is  zero. 

Let  Oq,  \’j)  be  any  of  those  pairs  and  let  P(X  = Xj,  Y = yj)  = p.tj  (where  we  admit  that 
Pij  may  be  0 for  certain  pairs  of  subscripts  i,j).  Then  we  define  the  probability  function 
f(x,  y)  of  (X,  T)  by 

(3)  f{x,  y)  = p^  if  x = Xj,  y = yj  and  fix,  y)  = 0 otherwise; 

here,  i = 1,  2,  • ■ ■ and  / = 1,  2,  • • • independently.  In  analogy  to  (4),  Sec.  24.5,  we  now  have 
for  the  distribution  function  the  formula 


(4)  Fix,  y)  = 2 2 /(*i>  Yj)- 

XjSx  yjSy 

Instead  of  (6)  in  Sec.  24.5  we  now  have  the  condition 

(5)  'Z'Zf(xi,yj)  = l. 

i 3 

Two-Dimensional  Discrete  Distribution 

If  we  simultaneously  toss  a dime  and  a nickel  and  consider 

X = Number  of  heads  the  dime  turns  up, 

Y = Number  of  heads  the  nickel  turns  up, 
then  X and  Y can  have  the  values  0 or  1 , and  the  probability  function  is 

/(0,  0)  =/(  1,  0)  =/( 0,  1)  =/(l,  1)  = i fix,y ) = 0 otherwise. 
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EXAMPLE  2 


Y 


I l 

“l  Z 

Fig.  523.  Notion  of  a two-dimensional  distribution 

Continuous  Two-Dimensional  Distributions 

In  analogy  to  the  case  of  a single  random  variable  (Sec.  24.5)  we  call  ( X , Y ) and  its 
distribution  continuous  if  the  corresponding  distribution  function  F(x,  y ) can  be  given  by 
a double  integral 


(6) 


F(x,y) 


rX 

f(x*,  y*)  dx*  dy* 


whose  integrand  f called  the  density  of  (X,  Y ),  is  nonnegative  everywhere,  and  is 
continuous,  possibly  except  on  finitely  many  curves. 

From  (6)  we  obtain  the  probability  that  (X,  Y ) assume  any  value  in  a rectangle  (Fig.  523) 
given  by  the  formula 


(7) 


F(a  i <XSb!,  a2<  fg  b2) 


f>2  rb  i 

fix,  y)  dx  dy. 
J 

a2  «i 


Two-Dimensional  Uniform  Distribution  in  a Rectangle 

Let  R be  the  rectangle  ol\  < x = ft,  a2<y  = p2-  The  density  (see  Fig.  524) 

(8)  f{x,  y)  = 1 jk  if  (x,  y)  is  in  R,  f(x,y)  = 0 otherwise 


defines  the  so-called  uniform  distribution  in  the  rectangle  R;  here  k = (/3i  — ffiXfe  — “2)  is  the  area  of  R. 
The  distribution  function  is  shown  in  Fig.  525. 


Fig.  524.  Density  function  (8)  of  the 
uniform  distribution 


Fig.  525.  Distribution  function  of  the 
uniform  distribution  defined  by  (8) 


Marginal  Distributions  of  a Discrete  Distribution 

This  is  a rather  natural  idea,  without  counterpart  for  a single  random  variable.  It  amounts 
to  being  interested  only  in  one  of  the  two  variables  in  (X,  Y),  say,  X,  and  asking  for  its 
distribution,  called  the  marginal  distribution  of  X in  ( X , Y).  So  we  ask  for  the  probability 
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EXAMPLE  3 


P(X  = x,  Y arbitrary).  Since  ( X , Y ) is  discrete,  so  is  X.  We  get  its  probability  function, 
call  it/iXx),  from  the  probability  function  fix,  y)  of  (X,  Y)  by  summing  over  y: 


(9)  /i(x)  = P(X  = x,Y  arbitrary)  = ^ fix,  y) 

v 


where  we  sum  all  the  values  of  f(x,  y)  that  are  not  0 for  that  x. 

From  (9)  we  see  that  the  distribution  function  of  the  marginal  distribution  of  X is 

(10)  F\(x)  = P(X  Si  x,  Y arbitrary)  = 2 /l  (■**)■ 

X*^X 

Similarly,  the  probability  function 

(11)  /2(y)  = P(X  arbitrary,  Y S y)  = ^ /(•*,  >9 

X 


determines  the  marginal  distribution  of  V in  (X,  Y).  Here  we  sum  all  the  values  of  f(x,  y)  that 
are  not  zero  for  the  corresponding  v.  The  distribution  function  of  this  marginal  distribution  is 

(12)  F2(y)  = P(X  arbitrary,  T = y)  = 2 My*)- 

y*=y 


Marginal  Distributions  of  a Discrete  Two-Dimensional  Random  Variable 

In  drawing  3 cards  with  replacement  from  a bridge  deck  let  us  consider 

(X,  F),  X = Number  of  queens,  Y = Number  of  kings  or  aces. 

The  deck  has  52  cards.  These  include  4 queens,  4 kings,  and  4 aces.  Hence  in  a single  trial  a queen  has  probability 
and  a king  or  ace  ^ This  gives  the  probability  function  of  ( X , F), 

V / , ^ \3-x-y 


f(*,y ) = 


3! 


x\y\0  - x - 


y ) 


(x  + y S 3) 


and  f(x,  y)  = 0 otherwise.  Table  24. 1 shows  in  the  center  the  values  of  f(x,  y)  and  on  the  right  and  lower  margins 
the  values  of  the  probability  functions  (x)  and  fjx')  of  the  marginal  distributions  of  X and  Y,  respectively. 


e 24.1  Values  of  the  Probability  Functions  f(x,  y),  /,(*),  f2(y)  in  Drawing 
Three  Cards  with  Replacement  from  a Bridge  Deck,  where  X is  the  Number 
of  Queens  Drawn  and  Y is  the  Number  of  Kings  or  Aces  Drawn 


y 

X 

0 

l 

2 

3 

fl(x) 

0 

1000 

600 

120 

8 

1728 

2197 

2197 

2197 

2197 

2197 

l 

300 

120 

12 

0 

432 

2197 

2197 

2197 

2197 

2 

30 

6 

0 

0 

36 

2197 

2197 

2197 

3 

1 

2197 

0 

0 

0 

1 

2197 

fz(.y) 

1331 

726 

132 

8 

2197 

2197 

2197 

2197 
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EXAMPLE  4 


Marginal  Distributions  of  a Continuous  Distribution 

This  is  conceptually  the  same  as  for  discrete  distributions,  with  probability  functions  and 
sums  replaced  by  densities  and  integrals.  For  a continuous  random  variable  ( X , Y)  with 
density  /( x,  y ) we  now  have  the  marginal  distribution  of  X in  (X,  Y ),  defined  by  the 
distribution  function 


(13) 


Fi(x)  = P(X  g x,  -oo  < Y < °°) 


rX 

/iQt*)  dx* 

— cc 


with  the  density  /i  of  X obtained  from  fix,  y)  by  integration  over  y. 


(14) 


fi(x) 


fix,  y)  dy. 


Interchanging  the  roles  of  X and  Y,  we  obtain  the  marginal  distribution  of  Y in  ( X , Y) 
with  the  distribution  function 


(15) 


F2(y)  = P(-°°  < X < °°,  Y g y) 


rV 

f2(y*)  dy* 


and  density 


(16) 


/2OO 


fix,  y)  dx. 


Independence  of  Random  Variables 

X and  Y in  a (discrete  or  continuous)  random  variable  (X,  Y)  are  said  to  be  independent  if 

(17)  F(x,  y)  = f'i(x)/-2(y) 

holds  for  all  (jc,  y).  Otherwise  these  random  variables  are  said  to  be  dependent.  These 
definitions  are  suggested  by  the  corresponding  definitions  for  events  in  Sec.  24.3. 
Necessary  and  sufficient  for  independence  is 

(18)  fix,  y)  = /i(x)/2(y) 

for  all  x and  y.  Here  the/’s  are  the  above  probability  functions  if  (Z,  Y)  is  discrete  or 
those  densities  if  (X,  Y)  is  continuous.  (See  Prob.  20.) 


Independence  and  Dependence 

In  tossing  a dime  and  a nickel,  X = Number  of  heads  on  the  dime,  Y = Number  of  heads  on  the  nickel  may 
assume  the  values  0 or  1 and  are  independent.  The  random  variables  in  Table  24.1  are  dependent. 
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Extension  of  Independence  to  n -Dimensional  Random  Variables.  This  will  be  needed 
throughout  Chap.  25.  The  distribution  of  such  a random  variable  X = (X1;  ■■■  ,Xn)  is 
determined  by  a distribution  function  of  the  form 

i(x  ] . , %n)  Pf^i  — * * ' » = Xn). 

The  random  variables  Xi,  ■ ■ ■ , Xn  are  said  to  be  independent  if 

(19)  F(xu  ■ ■ ■ , xn)  = F1(x1)F2(x2)  ■ ■ ■ Fn(xn) 

for  all  (xi,  • ■ ■ , xn).  Here  Fj(xj)  is  the  distribution  function  of  the  marginal  distribution  of 
Xj  in  X,  that  is. 


Fj(xj ) = P(Xj  xj,  Xfc  arbitrary,  k = 1,  • ■ ■ , n,  k # j). 
Otherwise  these  random  variables  are  said  to  be  dependent. 


Functions  of  Random  Variables 

When  n = 2,  we  write  X\  = X,  X2  = Y,  x\  = x,  x2  = y.  Taking  a nonconstant  continuous 
function  g(x,  y)  defined  for  all  x,  y,  we  obtain  a random  variable  Z = g(X,  Y).  For  example, 
if  we  roll  two  dice  and  X and  Y are  the  numbers  the  dice  turn  up  in  a trial,  then  Z = X + Y 
is  the  sum  of  those  two  numbers  (see  Fig.  514  in  Sec.  24.5). 

In  the  case  of  a discrete  random  variable  (X,  Y)  we  may  obtain  the  probability  function 
f(z)  of  Z = g(X,  Y)  by  summing  all  fix,  y)  for  which  g(x,  y)  equals  the  value  of  z 
considered;  thus 

(20)  f(z)  = P(Z  = z)  = 22/(*>V>- 

g(x,y)=z 


Hence  the  distribution  function  of  Z is 

(21)  F(z)  = P(Z  gZ)  = 22  /(*•  y) 

g(x,y)Sz 


where  we  sum  all  values  of  fix,  y)  for  which  g{x,  y)  = z. 

In  the  case  of  a continuous  random  variable  ( X , Y)  we  similarly  have 


(22) 


F(z)  = P(Z  = zf  = fix,  y)  dx  dy 


j j 

g(x,yy, 


where  for  each  z we  integrate  the  density  fix,  y)  of  (X,  Y)  over  the  region  g(x,  y)  y in 
the  xy-plane,  the  boundary  curve  of  this  region  being  g(x,  y)  = z. 
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THEOREM  1 


THEROEM  2 


PROOF 


Addition  of  Means 

The  number 


(23) 


E(g(X,  Y))  = 


2 2 S(x,  y)f(x,  y ) 

x y 

8 (x,  y)f(x,  y)  dx  dy 


[(X,  Y)  discrete] 


[(X,  Y)  continuous] 


is  called  the  mathematical  expectation  or,  briefly,  the  expectation  of  g(X,  Y).  Here  it  is 
assumed  that  the  double  series  converges  absolutely  and  the  integral  of  I g(x,  y)  \f(x,  y) 
over  the  xy-plane  exists  (is  finite).  Since  summation  and  integration  are  linear  processes, 
we  have  from  (23) 

(24)  E(ag(X,  Y)  + bh(X,  Y))  = aE(g(X , Y))  + bE(h(X,  Y)). 

An  important  special  case  is 


E(X  + Y)  = E(X)  + E(Y), 
and  by  induction  we  have  the  following  result. 


Addition  of  Means 

The  mean  (expectation)  of  a sum  of  random  variables  equals  the  sum  of  the  means 
(expectations),  that  is, 

(25)  E(X1  + X2  + ■■■  + Xn)  = E(Xi)  + E(XZ)  + ■■■  + E(Xn). 


Furthermore,  we  readily  obtain 


Multiplication  of  Means 

The  mean  (expectation)  of  the  product  of  independent  random  variables  equals  the 
product  of  the  means  (expectations),  that  is, 

(26)  E(X1X2  ■ ■ ■ Xn)  = E(X1)E(X 2)  ■ ■ • E(Xn). 


If  X and  Y are  independent  random  variables  (both  discrete  or  both  continuous),  then 
E(XY)  = E(X)E(Y).  In  fact,  in  the  discrete  case  we  have 


E(XY)  = xyf(x,y)  = ^ */i«  2 ^(v)  = E(X)E(Y), 


x y 


x 
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and  in  the  continuous  case  the  proof  of  the  relation  is  similar.  Extension  to  n independent 
random  variables  gives  (26),  and  Theorem  2 is  proved. 

Addition  of  Variances 

This  is  another  matter  of  practical  importance  that  we  shall  need.  As  before,  let  Z = X + Y 
and  denote  the  mean  and  variance  of  Z by  /jl  and  cr2.  Then  we  first  have  (see  Team  Project 
20(a)  in  Problem  Set  24.6) 

o-2  = E([Z  - fi f)  = E(Z2)  - [£(Z)]2. 

From  (24)  we  see  that  the  first  term  on  the  right  equals 

E(Z2)  = E(X2  + 2 XY  + Y2)  = E(X2)  + 2 E(XY)  + E(  Y2). 

For  the  second  term  on  the  right  we  obtain  from  Theorem  1 

[E(Z)f  = [E(X)  + E(Y)f  = [E{X)f  + 2 E(X)E(Y)  + [E(Y)f. 

By  substituting  these  expressions  into  the  formula  for  cr2  we  have 

o-2  = E(X2)  - [EiX)}2  + E(Y2)  - [E(Y)f 
+ 2[E(XY)  - E{X)E{Y)}. 

From  Team  Project  20,  Sec.  24.6,  we  see  that  the  expression  in  the  first  line  on  the  right 
is  the  sum  of  the  variances  of  X and  Y,  which  we  denote  by  erf  and  erf,  respectively.  The 
quantity  in  the  second  line  (except  for  the  factor  2)  is 

(27)  <rXY  = E(XY)  - E(X)E(Y) 


and  is  called  the  covariance  of  X and  Y.  Consequently,  our  result  is 
(28)  cr2  — cr\  4*  erf  T 2 cr xy- 

If  X and  Y are  independent,  then 

E(XY)  = E(X)E(Y): 


hence  crXy  = 0,  and 

2 2 i 2 

(29)  cr  — (r\  + a 2- 


Extension  to  more  than  two  variables  gives  the  basic 


THEOREM  3 


Addition  of  Variances 

The  variance  of  the  sum  of  independent  random  variables  equals  the  sum  of  the 
variances  of  these  variables. 
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CAUTION!  In  the  numerous  applications  of  Theorems  1 and  3 we  must  always 
remember  that  Theorem  3 holds  only  for  independent  variables. 

This  is  the  end  of  Chap.  24  on  probability  theory.  Most  of  the  concepts,  methods,  and 
special  distributions  discussed  in  this  chapter  will  play  a fundamental  role  in  the  next 
chapter,  which  deals  with  methods  of  statistical  inference,  that  is,  conclusions  from 
samples  to  populations,  whose  unknown  properties  we  want  to  know  and  try  to  discover 
by  looking  at  suitable  properties  of  samples  that  we  have  obtained. 


PRQBgFM=SR--r4=9 


1.  Let  /( x,  y)  — k when  8 S x S 12  and  0 S y S 2 and 
zero  elsewhere.  Find  k.  Find  P(X  s 11,  1 S TS  1.5) 
and  P(9  £ X £ 13,  YS  1). 

2.  Find  P(X  > 4,  Y > 4)  and  P{X  £ 1,  Y £ 1)  if  ( X , Y) 
has  the  density /(x,  y)  = 35  ifx  S 0,  y S 0,  x + y £ 8. 

3.  Let/(x,  y)  = k if  x > 0,  y > 0,  x + y < 3 and  0 other- 
wise. Find  k.  Sketch/(x,  >’).  Find  P(X  + Y £ 1),  P(Y  > X). 

4.  Find  the  density  of  the  marginal  distribution  of  X in 
Prob.  2. 

5.  Find  the  density  of  the  marginal  distribution  of  Y in 
Fig.  524. 

6.  If  certain  sheets  of  wrapping  paper  have  a mean  weight 
of  10  g each,  with  a standard  deviation  of  0.05  g,  what 
are  the  mean  weight  and  standard  deviation  of  a pack 
of  10,000  sheets? 

7.  What  are  the  mean  thickness  and  the  standard  deviation 
of  transformer  cores  each  consisting  of  50  layers  of 
sheet  metal  and  49  insulating  paper  layers  if  the  metal 
sheets  have  mean  thickness  0.5  mm  each  with  a 
standard  deviation  of  0.05  mm  and  the  paper  layers 
have  mean  0.05  mm  each  with  a standard  deviation  of 
0.02  mm? 

8.  Let  X [cm]  and  Y [cm]  be  the  diameters  of  a pin  and 
hole,  respectively.  Suppose  that  (X,  Y)  has  the  density 

f(x,  y)  = 625  if  0.98  < x < 1.02,  1.00  < y < 1.04 

and  0 otherwise,  (a)  Find  the  marginal  distributions, 
(b)  What  is  the  probability  that  a pin  chosen  at  random 
will  fit  a hole  whose  diameter  is  1.00? 

9.  Using  Theorems  1 and  3,  obtain  the  formulas  for  the 
mean  and  the  variance  of  the  binomial  distribution. 

10.  Using  Theorem  1,  obtain  the  formula  for  the  mean  of 
the  hypergeometric  distribution.  Can  you  use  Theorem 
3 to  obtain  the  variance  of  that  distribution? 

11.  A 5-gear  assembly  is  put  together  with  spacers  between 
the  gears.  The  mean  thickness  of  the  gears  is  5.020  cm 
with  a standard  deviation  of  0.003  cm.  The  mean 
thickness  of  the  spacers  is  0.040  cm  with  a standard 
deviation  of  0.002  cm.  Find  the  mean  and  standard 
deviation  of  the  assembled  units  consisting  of  5 randomly 
selected  gears  and  4 randomly  selected  spacers. 


12.  If  the  mean  weight  of  certain  (empty)  containers  is  5 lb 
the  standard  deviation  is  0.2  lb,  and  if  the  filling  of  the 
containers  has  mean  weight  100  lb  and  standard 
deviation  0.5  lb,  what  are  the  mean  weight  and  the 
standard  deviation  of  filled  containers? 

13.  Find  P{X  > Y)  when  (X,  Y ) has  the  density 

f(x,  y)  = 0.25e~os(x+y)  if  x £ 0,  y a 0 
and  0 otherwise. 

14.  An  electronic  device  consists  of  two  components.  Let 
X and  Y [years]  be  the  times  to  failure  of  the  first  and 
second  components,  respectively.  Assume  that  (A,  Y) 
has  the  density /(x,  y)  = 4e~z(x  + y:>  if  x > 0 and  y > 0 
and  0 otherwise,  (a)  Are  X and  Y dependent  or 
independent?  (b)  Find  the  densities  of  the  marginal 
distributions,  (c)  What  is  the  probability  that  the  first 
component  will  have  a lifetime  of  2 years  or  longer? 

15.  Give  an  example  of  two  different  discrete  distributions 
that  have  the  same  marginal  distributions. 

16.  Prove  (2). 

17.  Let  (A,  Y)  have  the  probability  function 

m o)  =/(  1,  i)  = i 

m i)=/(i,o)  = i. 

Are  A and  Y independent? 

18.  Let  (A,  Y)  have  the  density 

fix,  y)  = k if  x2  + yz  < 1 

and  0 otherwise.  Determine  k.  Find  the  densities  of  the 
marginal  distributions.  Find  the  probability 

Pi A2  + Y2  < |). 

19.  Show  that  the  random  variables  with  the  densities 

fix,  y)  = x + y 

and 

g{x,y)  = (x  + g)(y  + g) 

if  0£x£l,0£yfil  and  fix,  y)  = 0 and 
g{x,  y)  — 0 elsewhere,  have  the  same  marginal 
distribution. 

20.  Prove  the  statement  involving  (18). 
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1.  What  are  stem-and-leaf  plots?  Boxplots?  Histograms? 
Compare  their  advantages. 

2.  What  properties  of  data  are  measured  by  the  mean?  The 
median?  The  standard  deviation?  The  variance? 

3.  What  do  we  mean  by  an  experiment?  An  outcome?  An 
event?  Give  examples. 

4.  What  is  a random  variable?  Its  distribution  function? 
Its  probability  function  or  density? 

5.  State  the  definition  of  probability  from  memory.  Give 
simple  examples. 

6.  What  is  sampling  with  and  without  replacement?  What 
distributions  are  involved? 

7.  When  is  the  Poisson  distribution  a good  approximation 
of  the  binomial  distribution?  The  normal  distribution? 

8.  Explain  the  use  of  the  tables  of  the  normal  distribution. 
If  you  have  a CAS,  how  would  you  proceed  without 
the  tables? 

9.  State  the  main  theorems  on  probability.  Illustrate  them 
by  simple  examples. 

10.  State  the  most  important  facts  about  distributions  of 
two  random  variables  and  their  marginal  distributions. 

11.  Make  a stem-and-leaf  plot,  histogram,  and  boxplot  of  the 
data  110,  113,  109,  118,  110,  115,  104,  111,  116,  113. 

12.  Same  task  as  in  Prob.  11.  for  the  data  13.5,  13.2,  12.1, 
13.6,  13.3. 

13.  Find  the  mean,  standard  deviation,  and  variance  in 
Prob.  11. 

14.  Find  the  mean,  standard  deviation,  and  variance  in 
Prob.  12. 


15.  Show  that  the  mean  always  lies  between  the  smallest 
and  the  largest  data  value. 

16.  What  are  the  outcomes  in  the  sample  space  of  the 
experiment  of  simultaneously  tossing  three  coins? 

17.  Plot  a histogram  of  the  data  8,  2,  4,  10  and  guess  x and  i 
by  inspecting  the  histogram.  Then  calculate  x,  s2,  and  s. 

18.  Using  a Venn  diagram,  show  that  A C B if  and  only  if 
A Cl  B = A. 

19.  Suppose  that  3%  of  bolts  made  by  a machine  are 
defective,  the  defectives  occurring  at  random  during 
production.  If  the  bolts  are  packaged  50  per  box,  what 
is  the  binomial  approximation  of  the  probability  that  a 
given  box  will  contain  x = 0,  1,  ■ ■ ■ , 5 defectives? 

20.  Of  a lot  of  12  items,  3 are  defective,  (a)  Find  the  number 
of  different  samples  of  3 items.  Find  the  number  of 
samples  of  3 items  containing  (b)  no  defectives,  (c)  1 
defective,  (d)  2 defectives,  (e)  3 defectives. 

21.  Find  the  probability  function  of  A = Number  of  times 
of  tossing  a fair  coin  until  the  first  head  appears. 

22.  If  the  life  of  ball  bearings  has  the  density /(jc)  = ke~x 
if  0 S x S 2 and  0 otherwise,  what  is  kl  What  is  the 
probability  P(X  £ 1)? 

23.  Find  the  mean  and  variance  of  a discrete  random  variable 
X having  the  probability  function /(0)  = f{  1)  = g, 

m = l 

24.  Let  Abe  normal  with  mean  14  and  variance  4.  Determine 
c such  that  P(X  S c)  = 95%,  P(X  S c)  = 5%, 
P(AS  c)  = 99.5%. 

25.  Let  X be  normal  with  mean  80  and  variance  9.  Find 
P(X  > 83),  P(X  < 81),  P(X  < 80),  and  P(78  < A < 82). 


SUMMARY  OF  CHAPTER  24 

Data  Analysis.  Probability  Theory 


A random  experiment,  briefly  called  experiment,  is  a process  in  which  the  result 
(“outcome”)  depends  on  “chance”  (effects  of  factors  unknown  to  us).  Examples  are 
games  of  chance  with  dice  or  cards,  measuring  the  hardness  of  steel,  observing  weather 
conditions,  or  recording  the  number  of  accidents  in  a city.  (Thus  the  word  “experiment” 
is  used  here  in  a much  wider  sense  than  in  common  language.)  The  outcomes  are 
regarded  as  points  (elements)  of  a set  S,  called  the  sample  space,  whose  subsets  are 
called  events.  For  events  E we  define  a probability  P(E)  by  the  axioms  (Sec.  24.3) 

0 S P(E)  g 1 

(1)  P(S)  = 1 

P(E1  UE2U  ■■■)  = P(Ei)  + P(E2)  + ■ ■ • 


(Ej  n Ek  = 0). 
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These  axioms  are  motivated  by  properties  of  frequency  distributions  of  data 
(Sec.  24.1). 

The  complement  Ec  of  E has  the  probability 
(2)  P(EC)  = 1 - P(E). 

The  conditional  probability  of  an  event  B under  the  condition  that  an  event  A 
happens  is  (Sec.  24.3) 


P(A  Cl  B) 

(3)  P(B\A)  = [P(A)  > 0], 

P(A) 

Two  events  A and  B are  called  independent  if  the  probability  of  their  simultaneous 
appearance  in  a trial  equals  the  product  of  their  probabilities,  that  is,  if 

(4)  P(A  n B)  = P(A)P(B). 

With  an  experiment  we  associate  a random  variable  X.  This  is  a function  defined 
on  S whose  values  are  real  numbers;  furthermore,  X is  such  that  the  probability 
P(X  = a ) with  which  X assumes  any  value  a,  and  the  probability  P(a  < X g b)  with 
which  X assumes  any  value  in  an  interval  a < X g b are  defined  (Sec.  24.5).  The 
probability  distribution  of  X is  determined  by  the  distribution  function 

(5)  Fix)  = P(X  g x). 

In  applications  there  are  two  important  kinds  of  random  variables:  those  of  the 
discrete  type,  which  appear  if  we  count  (defective  items,  customers  in  a bank,  etc.) 
and  those  of  the  continuous  type,  which  appear  if  we  measure  (length,  speed, 
temperature,  weight,  etc.). 

A discrete  random  variable  has  a probability  function 

(6)  f(x)  = P(X  = x). 

Its  mean  /jl  and  variance  a2  are  (Sec.  24.6) 


(7)  B = ^Xjfixj)  and  cr2  = ^(xj  - pff(Xj) 

j j 

where  the  Xj  are  the  values  for  which  X has  a positive  probability.  Important  discrete 
random  variables  and  distributions  are  the  binomial,  Poisson,  and  hypergeometric 
distributions  discussed  in  Sec.  24.7. 

A continuous  random  variable  has  a density 


(8)  fix ) = F'ix) 

Its  mean  and  variance  are  (Sec.  24.6) 


(9) 


P 


xfix)  dx 


and 


[see  (5)]. 


ix  — fju ffix)  dx. 
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Very  important  is  the  normal  distribution  (Sec.  24.8),  whose  density  is 

and  whose  distribution  function  is  (Sec.  24.8;  Tables  A7,  A8  in  App.  5) 

(x  - (Ji\ 

(11)  F(x)  = <f(^—  J. 

A two-dimensional  random  variable  ( X , Y)  occurs  if  we  simultaneously  observe 
two  quantities  (for  example,  height  X and  weight  Y of  adults).  Its  distribution  function 
is  (Sec.  24.9) 

(12)  F(x,y)  = P(X^x,Y^y). 

X and  Y have  the  distribution  functions  (Sec.  24.9) 

(13)  P\(x)  = P(X  3§  x,  Y arbitrary)  and  /-’2(y)  = P( x arbitrary,  Y y) 

respectively;  their  distributions  are  called  marginal  distributions.  If  both  X and  Y 
are  discrete,  then  ( X , Y)  has  a probability  function 

f(x,y)  = P(X  = x,Y  = y). 

If  both  X and  Y are  continuous,  then  ( X , Y)  has  a density /(x,  y). 


(10) 


fix)  = 


y\fTi T 


exp 


CHAPTER  2 5 
Mathematical  Statistics 


In  probability  theory  we  set  up  mathematical  models  of  processes  that  are  affected  by 
“chance.”  In  mathematical  statistics  or,  briefly,  statistics,  we  check  these  models  against 
the  observable  reality.  This  is  called  statistical  inference.  It  is  done  by  sampling,  that 
is,  by  drawing  random  samples,  briefly  called  samples.  These  are  sets  of  values  from  a 
much  larger  set  of  values  that  could  be  studied,  called  the  population.  An  example  is 
10  diameters  of  screws  drawn  from  a large  lot  of  screws.  Sampling  is  done  in  order  to 
see  whether  a model  of  the  population  is  accurate  enough  for  practical  purposes.  If  this 
is  the  case,  the  model  can  be  used  for  predictions,  decisions,  and  actions,  for  instance,  in 
planning  productions,  buying  equipment,  investing  in  business  projects,  and  so  on. 

Most  important  methods  of  statistical  inference  are  estimation  of  parameters  (Secs.  25.2), 
determination  of  confidence  intervals  (Sec.  25.3),  and  hypothesis  testing  (Sec.  25.4,  25.7, 
25.8),  with  application  to  quality  control  (Sec.  25.5)  and  acceptance  sampling  (Sec.  25.6). 

In  the  last  section  (25.9)  we  give  an  introduction  to  regression  and  correlation  analysis, 
which  concern  experiments  involving  two  variables. 

Prerequisite:  Chap.  24. 

Sections  that  may  be  omitted  in  a shorter  course:  25.5,  25.6,  25.8. 

References,  Answers  to  Problems,  and  Statistical  Tables:  App.  1 Part  G,  App.  2,  App.  5. 

25.1  Introduction.  Random  Sampling 

Mathematical  statistics  consists  of  methods  for  designing  and  evaluating  random 
experiments  to  obtain  information  about  practical  problems,  such  as  exploring  the  relation 
between  iron  content  and  density  of  iron  ore,  the  quality  of  raw  material  or  manufactured 
products,  the  efficiency  of  air-conditioning  systems,  the  performance  of  certain  cars,  the 
effect  of  advertising,  the  reactions  of  consumers  to  a new  product,  etc. 

Random  variables  occur  more  frequently  in  engineering  (and  elsewhere)  than  one 
would  think.  For  example,  properties  of  mass-produced  articles  (screws,  lightbulbs,  etc.) 
always  show  random  variation,  due  to  small  (uncontrollable!)  differences  in  raw  material 
or  manufacturing  processes.  Thus  the  diameter  of  screws  is  a random  variable  X and  we 
have  nondefective  screws,  with  diameter  between  given  tolerance  limits,  and  defective 
screws,  with  diameter  outside  those  limits.  We  can  ask  for  the  distribution  of  X,  for  the 
percentage  of  defective  screws  to  be  expected,  and  for  necessary  improvements  of  the 
production  process. 

Samples  are  selected  from  populations — 20  screws  from  a lot  of  1000,  100  of  5000 
voters,  8 beavers  in  a wildlife  conservation  project — because  inspecting  the  entire 
population  would  be  too  expensive,  time-consuming,  impossible  or  even  senseless  (think 
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of  destructive  testing  of  lightbulbs  or  dynamite).  To  obtain  meaningful  conclusions, 
samples  must  be  random  selections.  Each  of  the  1000  screws  must  have  the  same  chance 
of  being  sampled  (of  being  drawn  when  we  sample),  at  least  approximately.  Only  then 
will  the  sample  mean  x = (xi  + • • • + X2o)/20  (Sec.  24.1)  of  a sample  of  size  n = 20 
(or  any  other  n)  be  a good  approximation  of  the  population  mean  /jl  (Sec.  24.6);  and  the 
accuracy  of  the  approximation  will  generally  improve  with  increasing  n,  as  we  shall  see. 
Similarly  for  other  parameters  (standard  deviation,  variance,  etc.). 

Independent  sample  values  will  be  obtained  in  experiments  with  an  infinite  sample 
space  S (Sec.  24.2),  certainly  for  the  normal  distribution.  This  is  also  true  in  sampling  with 
replacement.  It  is  approximately  true  in  drawing  small  samples  from  a large  finite  population 
(for  instance,  5 or  10  of  1000  items).  However,  if  we  sample  without  replacement  from  a 
small  population,  the  effect  of  dependence  of  sample  values  may  be  considerable. 

Random  numbers  help  in  obtaining  samples  that  are  in  fact  random  selections.  This 
is  sometimes  not  easy  to  accomplish  because  there  are  many  subtle  factors  that  can  bias 
sampling  (by  personal  interviews,  by  poorly  working  machines,  by  the  choice  of 
nontypical  observation  conditions,  etc.).  Random  numbers  can  be  obtained  from  a 
random  number  generator  in  Maple,  Mathematica,  or  other  systems  listed  on  p.  789. 
(The  numbers  are  not  truly  random,  as  they  would  be  produced  in  flipping  coins  or 
rolling  dice,  but  are  calculated  by  a tricky  formula  that  produces  numbers  that  do  have 
practically  all  the  essential  features  of  true  randomness.  Because  these  numbers 
eventually  repeat,  they  must  not  be  used  in  cryptography,  for  example,  where  true 
randomness  is  required.) 

Random  Numbers  from  a Random  Number  Generator 

To  select  a sample  of  size  n = 10  from  80  given  ball  bearings,  we  number  the  bearings  from  1 to  80.  We  then 
let  the  generator  randomly  produce  10  of  the  integers  from  1 to  80  and  include  the  bearings  with  the  numbers 
obtained  in  our  sample,  for  example. 

44  55  53  03  52  61  67  78  39  54 


or  whatever. 

Random  numbers  are  also  contained  in  (older)  statistical  tables. 


Representing  and  processing  data  were  considered  in  Sec.  24.1  in  connection  with 
frequency  distributions.  These  are  the  empirical  counterparts  of  probability  distributions 
and  helped  motivating  axioms  and  properties  in  probability  theory.  The  new  aspect  in  this 
chapter  is  randomness:  the  data  are  samples  selected  randomly  from  a population. 
Accordingly,  we  can  immediately  make  the  connection  to  Sec.  24.1,  using  stem-and-leaf 
plots,  box  plots,  and  histograms  for  representing  samples  graphically. 

Also,  we  now  call  the  mean  x in  (5),  Sec.  24.1,  the  sample  mean 

1 n 1 

(1)  X = - ^ Xj  = - (Xi  + x2  + ■ ■ ■ + xn). 

j= i 


We  call  n the  sample  size,  the  variance  .v  in  (6),  Sec.  24.1,  the  sample  variance 


(2) 


v1 2  = 


1 n 1 

T 2 (xj  ~ x)2  = T - xf  + ■ ’ • + (xn  - xf], 

n — 1 n — 1 

3=1 


SEC.  25.2  Point  Estimation  of  Parameters 


1065 


and  its  positive  square  root  s the  sample  standard  deviation,  x,  s2,  and  s are  called 
parameters  of  a sample;  they  will  be  needed  throughout  this  chapter. 


25.2  Point  Estimation  of  Parameters 

Beginning  in  this  section,  we  shall  discuss  the  most  basic  practical  tasks  in  statistics  and 
corresponding  statistical  methods  to  accomplish  them.  The  first  of  them  is  point  estimation 
of  parameters,  that  is,  of  quantities  appearing  in  distributions,  such  as  p in  the  binomial 
distribution  and  p and  cr  in  the  normal  distribution. 

A point  estimate  of  a parameter  is  a number  (point  on  the  real  line),  which  is  computed 
from  a given  sample  and  serves  as  an  approximation  of  the  unknown  exact  value  of  the 
parameter  of  the  population.  An  interval  estimate  is  an  interval  (“ confidence  interval”) 
obtained  from  a sample;  such  estimates  will  be  considered  in  the  next  section.  Estimation 
of  parameters  is  of  great  practical  importance  in  many  applications. 

As  an  approximation  of  the  mean  p of  a population  we  may  take  the  mean  x of  a 
corresponding  sample.  This  gives  the  estimate  p = x for  p,  that  is, 

(1)  p =x  = ^(xi  + ■■■  + xn) 

where  n is  the  sample  size.  Similarly,  an  estimate  <r2  for  the  variance  of  a population  is 

o 

the  variance  s ofa  corresponding  sample,  that  is, 

(2)  «r2  = ^ = 2 (xj  ~ xf. 

n — 1 

j=i 


Clearly,  (1)  and  (2)  are  estimates  of  parameters  for  distributions  in  which  p or  cr2 
appear  explicity  as  parameters,  such  as  the  normal  and  Poisson  distributions.  For  the 
binomial  distribution,  p = p/n  [see  (3)  in  Sec.  24.7],  From  (1)  we  thus  obtain  for  p 
the  estimate 


(3) 


P = 


JC 

n 


We  mention  that  (1)  is  a special  case  of  the  so-called  method  of  moments.  In  this 
method  the  parameters  to  be  estimated  are  expressed  in  terms  of  the  moments  of  the 
distribution  (see  Sec.  24.6).  In  the  resulting  formulas,  those  moments  of  the  distribution 
are  replaced  by  the  corresponding  moments  of  the  sample.  This  gives  the  estimates.  Here 

the  kth  moment  of  a sample  x±,  ■ ■ ■ , xn  is 


mk  = 


3= 1 
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Maximum  Likelihood  Method 

Another  method  for  obtaining  estimates  is  the  so-called  maximum  likelihood  method  of 
R.  A.  Fisher  [Messenger  Math.  41  (1912),  155-160].  To  explain  it,  we  consider  a discrete 
(or  continuous)  random  variable  X whose  probability  function  (or  density)  fix)  depends 
on  a single  parameter  0.  We  take  a corresponding  sample  of  n independent  values 
jti,  • • • , xn.  Then  in  the  discrete  case  the  probability  that  a sample  of  size  n consists 
precisely  of  those  n values  is 

(4)  l =f(xl)f(x2)---f(xn). 


In  the  continuous  case  the  probability  that  the  sample  consists  of  values  in  the  small 
intervals  xj  x Si  Xj  + A x{j  = 1,  2,  • ■ • , n)  is 

(5)  fix i) A.v  fix 2) Ax  • • • fixn)Ax  = /(Ax)”. 

Since /(xj)  depends  on  6,  the  function  / in  (5)  given  by  (4)  depends  on  xi,  • ■ • , xn  and  0. 
We  imagine  Xi,  ■ • • , xn  to  be  given  and  fixed.  Then  / is  a function  of  6,  which  is  called 
the  likelihood  function.  The  basic  idea  of  the  maximum  likelihood  method  is  quite  simple, 
as  follows.  We  choose  that  approximation  for  the  unknown  value  of  0 for  which  l is  as 
large  as  possible.  If  / is  a differentiable  function  of  9,  a necessary  condition  for  I to  have 
a maximum  in  an  interval  (not  at  the  boundary)  is 


(We  write  a partial  derivative,  because  / depends  also  on  x1;  • • ■ , xn. ) A solution  of  (6) 
depending  on  Xi,  • • ■ , xn  is  called  a maximum  likelihood  estimate  for  6.  We  may  replace 
(6)  by 


(7) 


d In  / 
90 


= 0, 


because /(x,)  > 0,  a maximum  of  / is  in  general  positive,  and  In  / is  a monotone  increasing 
function  of  I.  This  often  simplifies  calculations. 

Several  Parameters.  If  the  distribution  of  X involves  r parameters  6>i,  ■ ■ ■ , 0r,  then  instead 
of  (6)  we  have  the  r conditions  91/96]  = 0,  • • • , dl/99r  = 0,  and  instead  of  (7)  we  have 


(8) 


9 In  / 
991 


= 0, 


9 In  / 
99  r 


= 0. 


Normal  Distribution 

Find  maximum  likelihood  estimates  for  8\  = /j.  and  d2  = & in  the  case  of  the  normal  distribution. 
Solution.  From  (1),  Sec.  24.8,  and  (4)  we  obtain  the  likelihood  function 


/ = 


where 


h = 


1 

2^ 


2 

3= 1 


(Xj  - /x)2. 
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Taking  logarithms,  we  have 

In  / = —n  In  V27T  — n In  a — h. 
The  first  equation  in  (8)  is  d(ln  /)/ d/i  = 0,  written  out 
d In  / dh  1 


2 (*j  ~ #0  = 0. 


dfj,  dp  a2  ,=i 

The  solution  is  the  desired  estimate  [l  for  /r:  we  find 

1 n 

/'  „ ^ u 

i=i 

The  second  equation  in  (8)  is  d(ln  /)/ da  = 0,  written  out 

d In  / n dh  1 1 


hence 


^Xj  - n/A  = 0. 

j=i 


da 


v da 


o 2 Z1) 

o-3 4 5 6  j=1 


Replacing  /r  by  (l  and  solving  for  az,  we  obtain  the  estimate 


<?2  = „ 2(v^)2 

i=i 

which  we  shall  use  in  Sec.  25.7.  Note  that  this  differs  from  (2).  We  cannot  discuss  criteria  for  the  goodness  of 
estimates  but  want  to  mention  that  for  small  n,  formula  (2)  is  preferable. 


PR  OBL  EM=S^T—2~5^2 


1.  Normal  distribution.  Apply  the  maximum  likelihood 
method  to  the  normal  distribution  with  /x  = 0. 

2.  Find  the  maximum  likelihood  estimate  for  the 
parameter  /i  of  a normal  distribution  with  known 
variance  cr2  = cro  = 16. 

3.  Poisson  distribution.  Derive  the  maximum  likelihood 
estimator  for  /x.  Apply  it  to  the  sample  (10,  25,  26,  17, 
10,  4),  giving  numbers  of  minutes  with  0-10,  1 1-20, 
21-30,  31-40,  41-50,  more  than  50  fliers  per  minute, 
respectively,  checking  in  at  some  airport  check-in. 

4.  Uniform  distribution.  Show  that,  in  the  case  of  the 
parameters  a and  b of  the  uniform  distribution  (see 
Sec.  24.6),  the  maximum  likelihood  estimate  cannot  be 
obtained  by  equating  the  first  derivative  to  zero.  How 
can  we  obtain  maximum  likelihood  estimates  in  this 
case,  more  or  less  by  using  common  sense? 

5.  Binomial  distribution.  Derive  a maximum  likelihood 
estimate  for  p. 

6.  Extend  Prob.  5 as  follows.  Suppose  that  m times  n trials 
were  made  and  in  the  first  n trials  A happened  k\  times, 
in  the  second  n trials  A happened  times,  ■ ■ ■ , in  the 
mth  n trials  A happened  km  times.  Find  a maximum 
likelihood  estimate  of  p based  on  this  information. 


7.  Suppose  that  in  Prob.  6 we  made  3 times  4 trials  and 
A happened  2,  3,  2 times,  respectively.  Estimate  p. 

8.  Geometric  distribution.  Let  X = Number  of  inde- 
pendent trials  until  an  event  A occurs.  Show  that  X has 
a geometric  distribution,  defined  by  the  probability 
function  fix ) = pqx -1,  x = 1,  2,  ■ ■ ■ , where  p is  the 
probability  of  A in  a single  trial  and  q = 1 — p.  Find 
the  maximum  likelihood  estimate  of  p corresponding  to 
a sample  x±,  x%,  ■ ■ ■ , xn  of  observed  values  of  X. 

9.  In  Prob.  8,  show  that  /( 1)  +/( 2)  + • • ■ = 1 (as  it 
should  be!).  Calculate  independently  of  Prob.  8 the 
maximum  likelihood  of  p in  Prob.  8 corresponding  to 
a single  observed  value  of  X. 

10.  In  rolling  a die,  suppose  that  we  get  the  first  “ Six ” in 
the  7th  trial  and  in  doing  it  again  we  get  it  in  the  6th 
trial.  Estimate  the  probability  p of  getting  a “Six”  in 
rolling  that  die  once. 

11.  Find  the  maximum  likelihood  estimate  of  0 in  the 
density  fix)  = de~ex  if  x = 0 and/(x)  = 0 if  x < 0. 

12.  In  Prob.  11,  find  the  mean  /x,  substitute  it  in/(jc),  find 
the  maximum  likelihood  estimate  of  /x,  and  show  that 
it  is  identical  with  the  estimate  for  /x  which  can  be 
obtained  from  that  for  6 in  Prob.  1 1 . 
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13.  Compute  6 in  Prob.  1 1 from  the  sample  1.9,  0.4,  0.7,  0.6, 
1.4.  Graph  the  sample  distribution  function  F(x)  and  the 
distribution  function  F(x)  of  the  random  variable,  with 
8 = 8,  on  the  same  axes.  Do  they  agree  reasonably  well? 
(We  consider  goodness  of  fit  systematically  in  Sec.  25.7.) 

14.  Do  the  same  task  as  in  Prob.  13  if  the  given  sample  is 
0.4,  0.7,  0.2,  1.1,  0.1. 


15.  CAS  EXPERIMENT.  Maximum  Likelihood 
Estimates.  (MLEs).  Find  experimentally  how  much 
MLEs  can  differ  depending  on  the  sample  size.  Hint. 
Generate  many  samples  of  the  same  size  n,  e.g.,  of  the 
standardized  normal  distribution,  and  record  x and  s2. 
Then  increase  n. 


25 3 Confidence  Intervals 

Confidence  intervals1  for  an  unknown  parameter  9 of  some  distribution  (e.g.,  0 = yu.)  are 
intervals  9\  Si  9 =§  9%  that  contain  9,  not  with  certainty  but  with  a high  probability  y, 
which  we  can  choose  (95%  and  99%  are  popular).  Such  an  interval  is  calculated  from  a 
sample,  y = 95%  means  probability  1— y = 5%=^jof  being  wrong — one  of  about 
20  such  intervals  will  not  contain  9.  Instead  of  writing  0i  9 Si  9 2,  we  denote  this  more 
distinctly  by  writing 

(1)  CONFy  {0!  g 9 g 92}. 

Such  a special  symbol,  CONF,  seems  worthwhile  in  order  to  avoid  the  misunderstanding 
that  9 must  lie  between  6\  and  02- 

y is  called  the  confidence  level,  and  0\  and  02  are  called  the  lower  and  upper 
confidence  limits.  They  depend  on  y.  The  larger  we  choose  y,  the  smaller  is  the  error 
probability  1 — y,  but  the  longer  is  the  confidence  interval.  If  y — » 1,  then  its  length  goes 
to  infinity.  The  choice  of  y depends  on  the  kind  of  application.  In  taking  no  umbrella,  a 
5%  chance  of  getting  wet  is  not  tragic.  In  a medical  decision  of  life  or  death,  a 5%  chance 
of  being  wrong  may  be  too  large  and  a 1%  chance  of  being  wrong  (y  = 99%)  may  be 
more  desirable. 

Confidence  intervals  are  more  valuable  than  point  estimates  (Sec.  25.2).  Indeed,  we  can 
take  the  midpoint  of  (1)  as  an  approximation  of  9 and  half  the  length  of  (1)  as  an  “error  bound” 
(not  in  the  strict  sense  of  numerics,  but  except  for  an  error  whose  probability  we  know). 

9 1 and  02  in  (1)  are  calculated  from  a sample  xj_,  • • ■ , xn.  These  are  n observations  of  a 
random  variable  X.  Now  comes  a standard  trick.  We  regard  x\,  • ■ • , xn  as  single 
observations  of  n random  variables  X±,  ■ ■ ■ , Xn  ( with  the  same  distribution,  namely,  that 
ofX).  Then  9\  = 9 i(x\,  • • • , xn ) and  02  = 0201, ' ' ' , xn ) in  (1)  are  observed  values  of  two 
random  variables  0i  = ■ ■ ■ , Xn)  and  02  = 02(Ar|.  ■ ■ • , Xn).  The  condition  (1) 

involving  y can  now  be  written 

(2)  P(e1  S 9 g 02)  = y. 

Let  us  see  what  all  this  means  in  concrete  practical  cases. 

In  each  case  in  this  section  we  shall  first  state  the  steps  of  obtaining  a confidence  interval 
in  the  form  of  a table,  then  consider  a typical  example,  and  finally  justify  those  steps 
theoretically. 


1JERZY  NEYMAN  (1894-1981),  American  statistician,  developed  the  theory  of  confidence  intervals  (Annals 

of  Mathematical  Statistics  6 (1935),  111-116). 
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Confidence  Interval  for  /jl  of  the  Normal  Distribution 

o 

with  Known  cr 

Table  25.1  Determination  of  a Confidence  Interval  for  the  Mean  p 
of  a Normal  Distribution  with  Known  Variance  cr2 

Step  1.  Choose  a confidence  level  y (95%,  99%,  or  the  like). 

Step  2.  Determine  the  corresponding  c: 


y 

0.90 

0.95 

0.99 

0.999 

C 

1.645 

1.960 

2.576 

3.291 

Step  3.  Compute  the  mean  x 

of  the  sample  jti,  ■ 

’ > Xn- 

Step  4.  Compute  k = cct/Vh.  The  confidence  interval  for  p is 
(3)  CONF7  {x  — k^p^x  + k}. 


Confidence  Interval  for  p of  the  Normal  Distribution  with  Known  or 2 

Determine  a 95  % confidence  interval  for  the  mean  of  a normal  distribution  with  variance  <x2  = 9,  using  a sample 
of  n = 100  values  with  mean  5 = 5. 

Solution.  Step  1 . y 0.95  is  required.  Step  2.  The  corresponding  c equals  1.960;  see  Table  25.1. 
Step  3.  x = 5 is  given.  Step  4.  We  need  k = 1.960  • 3/V100  = 0.588.  Hence  x — k = 4.412,  x + k = 5.588 
and  the  confidence  interval  is  CONF0.95  14.412  5.588). 

This  is  sometimes  written  fl  5 * 0.588,  but  we  shall  not  use  this  notation,  which  can  be  misleading. 
With  your  CAS  you  can  determine  this  interval  more  directly.  Similarly  for  the  other  examples  in  this  section. 


Theory  for  Table  25.1.  The  method  in  Table  25.1  follows  from  the  basic 


Sum  of  Independent  Normal  Random  Variables 

Let  X\ . ■■■  ,Xn  be  independent  normal  random  variables  each  of  which  has  mean 
p and  variance  cr2.  Then  the  following  holds. 

(a)  The  sum  + ■ ■ ■ + Xn  is  normal  with  mean  tip  and  variance  ncr2. 

(b)  The  following  random  variable  X is  normal  with  mean  p and  variance  cr2 In. 

(4)  X = \ (Xt  + • • • + Xn) 

(c)  The  following  random  variable  Z is  normal  with  mean  0 and  variance  1. 

X - p 

(5)  Z = 

cr 
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The  statements  about  the  mean  and  variance  in  (a)  follow  from  Theorems  1 and  3 in 
Sec.  24.9.  From  this,  and  Theorem  2 in  Sec.  24.6,  we  see  that  X has  the  mean  ( I = /i 
and  the  variance  ( \/n)2ncr 2 = cr2/«.  This  implies  that  Z has  the  mean  0 and  variance  1, 
by  Theorem  2(b)  in  Sec.  24.6.  The  normality  of  Xi  + • • • + Xn  is  proved  in  Ref.  [G3] 
listed  in  App.  1.  This  implies  the  normality  of  (4)  and  (5). 


Derivation  of  (3)  in  Table  25.1.  Sampling  from  a normal  distribution  gives  independent 
sample  values  (see  Sec.  25.1),  so  that  Theorem  1 applies.  Hence  we  can  choose  y and 
then  determine  c such  that 


(6) 


/ X- ii 

P(-c  g Z g c)  = P -eg g c 

V a/Vn 


<b(c)  — <&(— c)  = y. 


For  the  value  y = 0.95  we  obtain  z(D ) = 1.960  from  Table  A8  in  App.  5,  as  used  in 
Example  1.  For  y = 0.9,  0.99,  0.999  we  get  the  other  values  of  c listed  in  Table  25.1. 
Finally,  all  we  have  to  do  is  to  convert  the  inequality  in  (6)  into  one  for  /i  and  insert 
observed  values  obtained  from  the  sample.  We  multiply  — c g Z g c by  — 1 and  then  by 
cr/Vn,  writing  cct/Vh  = k (as  in  Table  25.1), 

( li -x 

P(-c  g Z g c)  = P(c  g -Z  g -c)  = P c g g -c 

v a/va 

= P(k  & n - X ^ -k)  = y. 


Adding  X gives  P{X  + = y or 

(7)  P(X  - lg/igX  + l)  = y. 

Inserting  the  observed  value  x of  X gives  (3).  Here  we  have  regarded  X\,  • • • , xn  as  single 
observations  of  X±,  ■ ■ ■ , Xn  (the  standard  trick!),  so  that  X\  + ■ ■ ■ + xn  is  an  observed  value 
of  Xi  + ■ ■ ■ + Xn  and  x is  an  observed  value  of  X.  Note  further  that  (7)  is  of  the  form  (2) 
with  0i  = X — k and  02  = X + k. 


Sample  Size  Needed  for  a Confidence  Interval  of  Prescribed  Length 

How  large  must  n be  in  Example  1 if  we  want  to  obtain  a 95%  confidence  interval  of  length  L = 0.4? 
Solution.  The  interval  (3)  has  the  length  L = 2k  = 2 ca/  Vn.  Solving  for  n,  we  obtain 

n = (2  ca/L)2. 

In  the  present  case  the  answer  is  n = (2  • 1.960  • 3/0.4)2  870. 

Figure  526  shows  how  L decreases  as  n increases  and  that  for  y = 99%  the  confidence  interval  is  substantially 
longer  than  for  y = 95%  (and  the  same  sample  size  n ). 
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Fig.  526.  Length  of  the  confidence  interval  (3)  (measured  in  multiples  of  cr) 
as  a function  of  the  sample  size  n for  y = 95%  and  y = 99% 


Confidence  Interval  for  /jl  of  the  Normal  Distribution 

cy 

with  Unknown  <r 

In  practice  cr2  is  frequently  unknown.  Then  the  method  in  Table  25.1  does  not  help  and 
the  whole  theory  changes,  although  the  steps  of  determining  a confidence  interval  for  p 
remain  quite  similar.  They  are  shown  in  Table  25.2.  We  see  that  k differs  from  that  in 
Table  25.1,  namely,  the  sample  standard  deviation  s has  taken  the  place  of  the  unknown 
standard  deviation  cr  of  the  population.  And  c now  depends  on  the  sample  size  n and  must 
be  determined  from  Table  A9  in  App.  5 or  from  your  CAS.  That  table  lists  values  z for 
given  values  of  the  distribution  function  (Fig.  527) 


(8) 


F(z)  = Kn 


2\  -Cto  + D/2 

1 + — ) du 

m J 


of  the  f-distribution.  Here,  m (=  1,  2,  • ■ •)  is  a parameter,  called  the  number  of  degrees 
of  freedom  of  the  distribution  ( abbreviated  d.f.).  In  the  present  case,  m = n — 1;  see 
Table  25.2.  The  constant  Km  is  such  that  F(°°)  = 1.  By  integration  it  turns  out  that 
Km  = r (|m  + g)/[  s/rmr  r(gm)],  where  T is  the  gamma  function  (see  (24)  in  App.  A3.1). 

Table  25.2  Determination  of  a Confidence  Interval  for  the  Mean  jit 
of  a Normal  Distribution  with  Unknown  Variance  or 2 

Step  1.  Choose  a confidence  level  y (95%,  99%,  or  the  like). 

Step  2.  Determine  the  solution  c of  the  equation 

(9)  F(c)  = |(1  + y) 

from  the  table  of  the  f-distribution  with  n — 1 degrees  of  freedom 
(Table  A9  in  App.  5;  or  use  a CAS;  n = sample  size). 

Step  3.  Compute  the  meanx  and  the  variances2  of  the  sample  jq,  ■ ■ • , xn. 

Step  4.  Compute  k = cs/X'n.  The  confidence  interval  is 


(10) 


CONF7  {x  — k^p^sx  + k}. 
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Figure  528  compares  the  curve  of  the  density  of  the  f-distribution  with  that  of  the  normal 
distribution.  The  latter  is  steeper.  This  illustrates  that  Table  25.1  (which  uses  more 
information,  namely,  the  known  value  of  cr2)  yields  shorter  confidence  intervals  than 
Table  25.2.  This  is  confirmed  in  Fig.  529,  which  also  gives  an  idea  of  the  gain  by  increasing 
the  sample  size. 


Fig.  527.  Distribution  functions  of  the 
t-distribution  with  1 and  3 d.f.  and  of  the 
standardized  normal  distribution  (steepest  curve) 


Fig.  528.  Densities  of  the  t-distribution 
with  1 and  3 d.f.  and  of  the  standardized 
normal  distribution 


Fig.  529.  Ratio  of  the  lengths  Lf  and  L of  the  confidence 
intervals  (10)  and  (3)  with  y = 95%  and  y = 99%  as  a function 
of  the  sample  size  n for  equal  s and  a 


Confidence  Interval  for  [x  of  the  Normal  Distribution  with  Unknown  cr2 

Five  independent  measurements  of  the  point  of  inflammation  (flash  point)  of  Diesel  oil  (D-2)  gave  the  values 
(in  °F)  144  147  146  142  144.  Assuming  normality,  determine  a 99%  confidence  interval  for  the  mean. 

Solution.  Step  1.  y = 0.99  is  required. 

Step  2.  F(c ) = |(1  + y)  = 0.995,  and  Table  A9  in  App.  5 with  n — 1=4  d.f.  gives  c = 4.60. 

Step  3.  x = 144.6,  s2  = 3.8. 

Step  4.  k = VT8  • 4.60/  V5  = 4.01.  The  confidence  interval  is  CONF0.99  {140.5  ^ /jl  ^ 148.7}. 

If  the  variance  a2  were  known  and  equal  to  the  sample  variance  s2,  thus  cr2  = 3.8,  then  Table  25.1  would 
give  k = ca/\/rn  = 2.576 VT8/V5  = 2.25  and  CONF0.99  {142.35  ^ pu  ^ 146.85}.  We  see  that  the  present 
interval  is  almost  twice  as  long  as  that  obtained  from  Table  25.1  (with  a2  = 3.8).  Hence  for  small  samples  the 
difference  is  considerable!  See  also  Fig.  529. 
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Theory  for  Table  25.2.  For  deriving  (10)  in  Table  25.2  we  need  from  Ref.  [G3] 


THEOREM  2 


Student’s  t-Distribution 

Let  X i,  ■ ■ ■ , Xn  be  independent  normal  random  variables  with  the  same  mean  p and 
the  same  variance  cr2.  Then  the  random  variable 

X - p 

(11)  T = 

has  a t-distribution  [see  (8)]  with  n — 1 degrees  of  freedom  (d.f.);  here  X is  given 
by  (4)  and 

I 

(12)  S2  = — — - 2 (Xj  ~ X)2. 

11  j= i 


Derivation  of  (10).  This  is  similar  to  the  derivation  of  (3).  We  choose  a number  y 
between  0 and  1 and  determine  a number  c from  Table  A9  in  App.  5 with  n — 1 d.f.  (or 
from  a CAS)  such  that 

(13)  P(-c  ^T^c)  = F(c)  - F(-c ) = y. 

Since  the  /-distribution  is  symmetric,  we  have 

F(-c)  = 1 - F(c), 

and  (13)  assumes  the  form  (9).  Substituting  (11)  into  (13)  and  transforming  the  result  as 
before,  we  obtain 

(14)  P(X-  p^X  + K)  = y 
where 


K = cS/x'n. 

By  inserting  the  observed  values  x of  X and  sz  of  Sz  into  (14)  we  finally  obtain  (10). 


a 

Confidence  Interval  for  the  Variance  a 
of  the  Normal  Distribution 

Table  25.3  shows  the  steps,  which  are  similar  to  those  in  Tables  25.1  and  25.2. 
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Table  25.3  Determination  of  a Confidence  Interval  for  the  Variance 
a1  of  a Normal  Distribution,  Whose  Mean  Need  Not  Be  Known 

Step  1.  Choose  a confidence  level  y (95%,  99%,  or  the  like). 

Step  2.  Determine  solutions  C\  and  C2  of  the  equations 

(15)  F(C\)  = 1(1  - r),  F(c2)  = 1(1  + 7) 

from  the  table  of  the  chi-square  distribution  with  n — 1 degrees  of 
freedom  (Table  A10  in  App.  5;  or  use  a CAS;  n = sample  size). 

Step  3.  Compute  ( n — 1 ).v2.  where  .v2  is  the  variance  of  the  sample 

-*-l»  ’ * * » 

Step  4.  Compute  k\  = (n  — l)s2/ci  and  k2  = (n  — l)s2/c2 ■ The 
confidence  interval  is 

(16)  C0NF7  {k2  g o-2  g kj). 


Confidence  Interval  for  the  Variance  of  the  Normal  Distribution 

Determine  a 95%  confidence  interval  (16)  for  the  variance,  using  Table  25.3  and  a sample  (tensile  strength  of 
sheet  steel  in  kg/ mm2,  rounded  to  integer  values) 

89  84  87  81  89  86  91  90  78  89  87  99  83  89. 

Solution.  Step  1.  y = 0.95  is  required. 

Step  2.  For  n — 1 = 13  we  find 

Ci  = 5.01  and  c 2 — 24.74. 

Step  3.  13s2  = 326.9. 

Step  4.  1 3s2/ ci  = 65.25,  13s2/c2  = 13.21. 

The  confidence  interval  is 

CONF0.95  113.21  s cr2  g 65.25). 

This  is  rather  large,  and  for  obtaining  a more  precise  result,  one  would  need  a much  larger  sample. 


Theory  for  Table  25.3.  In  Table  25.1  we  used  the  normal  distribution,  in  Table  25.2 

o . . 

the  /-distribution,  and  now  we  shall  use  the  X -distribution  (chi-square  distribution ), 
whose  distribution  function  is  F(z)  = 0 if  z < 0 and 


F(z)  = C„ 


e~ul  y™-2)/2 


du 


if  z = 0 


(Fig.  530). 


The  parameter  m(=  1,  2,  • • •)  is  called  the  number  of  degrees  of  freedom  (d.f.),  and 

Cm  = l/[2m/2r(|m)]. 

Note  that  the  distribution  is  not  symmetric  (see  also  Fig.  531). 
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For  deriving  (16)  in  Table  25.3  we  need  the  following  theorem. 


Fig.  530.  Distribution  function  of  the  chi-square  distribution  with  2,  3,  5 d.f. 


THEOREM  3 


Chi-Square  Distribution 

Under  the  assumptions  in  Theorem  2 the  random  variable 

S2 

(17)  Y = (n  — 1)  — r 

(7 

with  S2  given  by  (12)  has  a chi-square  distribution  with  n — 1 degrees  of  freedom. 


Proof  in  Ref.  [G3],  listed  in  App.  1. 


Derivation  of  (16).  This  is  similar  to  the  derivation  of  (3)  and  ( 10).  We  choose  a number 
y between  0 and  1 and  determine  C\  and  C2  from  Table  A10,  App.  5,  such  that  [see  (15)] 

P(Y  S ci)  = F(ci)  = |(1  - y),  P(Y  S c2)  = F(c2)  = g(l  + y). 
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P(Cl  gri  c2)  = P(Y  =§  c2)  - P(Y  g d)  = F(c2)  - F(Cl)  = y. 

o 

Transforming  Ci  g y g c2  with  7 given  by  (17)  into  an  inequality  for  cr  , we  obtain 


n — 1 
C2 


S2  g o-2 


n — 1 
Cl 


S2. 


By  inserting  the  observed  value  s2  of  S2  we  obtain  (16). 

Confidence  Intervals  for  Parameters 
of  Other  Distributions 

The  methods  in  Tables  25.1-25.3  for  confidence  intervals  for  p and  cr2  are  designed  for 
the  normal  distribution.  We  now  show  that  they  can  also  be  applied  to  other  distributions 
if  we  use  large  samples. 

We  know  that  if  Z1?  ■ ■ • , Xn  are  independent  random  variables  with  the  same  mean  p, 
and  the  same  variance  cr2,  then  their  sum  Yn  = X1  + ■ ■ ■ + Xn  has  the  following  properties. 

(A)  Yn  has  the  mean  np  and  the  variance  ncr2  (by  Theorems  1 and  3 in  Sec.  24.9). 

(B)  If  those  variables  are  normal,  then  Yn  is  normal  (by  Theorem  1). 

If  those  random  variables  are  not  normal,  then  (B)  is  not  applicable.  However,  for  large 
n the  random  variable  Yn  is  still  approximately  normal.  This  follows  from  the  central  limit 
theorem,  which  is  one  of  the  most  fundamental  results  in  probability  theory. 


THEOREM  4 


Central  Limit  Theorem 

Let  X i,  • • • , Xn,  ■■■  be  independent  random  variables  that  have  the  same  distribution 
function  and  therefore  the  same  mean  pi  and  the  same  variance  cr2.  Let 
Yn  = X\  + • ■ • + Xn.  Then  the  random  variable 


(18) 


7 


Yn  - np 
cr\fn 


is  asymptotically  normal  with  mean  0 and  variance  1;  that  is,  the  distribution 
function  Fn(x)  of  Zn  satisfies 


lim  Fn(x)  = <E>  (x) 


1 

V27T 


A proof  can  be  found  in  Ref.  [G3]  listed  in  App.  1 . 

Hence,  when  applying  Tables  25.1-25.3  to  a nonnormal  distribution,  we  must  use 
sufficiently  large  samples.  As  a rule  of  thumb,  if  the  sample  indicates  that  the  skewness 
of  the  distribution  (the  asymmetry;  see  Team  Project  20(d),  Problem  Set  24.6)  is  small, 
use  at  least  n = 20  for  the  mean  and  at  least  n = 50  for  the  variance. 
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1.  Why  are  interval  estimates  generally  more  useful  than 
point  estimates? 


2.  Find  a 95%  confidence  interval  for  the  mean  of  a 
normal  population  with  standard  deviation  4.00  from 
the  sample  39,  51,  49,  43,  57,  59.  Does  that  interval 
get  longer  or  shorter  if  we  take  y = 0.99  instead  of 
0.95?  By  what  factor? 

3.  By  what  factor  does  the  length  of  the  interval  in  Prob.  2 
change  if  we  double  the  sample  size? 

4.  Determine  a 95%  confidence  interval  for  the  mean  pt 
of  a normal  population  with  variance  cr2  = 16,  using 
a sample  of  size  200  with  mean  74.81. 

5.  What  sample  size  would  be  needed  for  obtaining  a 95  % 
confidence  interval  (3)  of  length  2 cr?  Of  length  cr? 

6.  What  sample  size  is  needed  to  obtain  a 99%  confidence 
interval  of  length  2.0  for  the  mean  of  a normal  population 
with  variance  25?  Use  Fig.  526.  Check  by  calculation. 


2-6 


MEAN  (VARIANCE  KNOWN) 


12.  CAS  EXPERIMENT.  Confidence  Intervals.  Obtain 
100  samples  of  size  10  of  the  standardized  normal 
distribution.  Calculate  from  them  and  graph  the 
corresponding  95%  confidence  intervals  for  the  mean 
and  count  how  many  of  them  do  not  contain  0.  Does 
the  result  support  the  theory?  Repeat  the  whole 
experiment,  compare  and  comment. 


13-17 


VARIANCE 


Find  a 95  % confidence  interval  for  the  variance  of  a normal 
population  from  the  sample: 


13.  Length  of  20  bolts  with  sample  mean  20.2  cm  and 
sample  variance  0.04  cm2 

14.  Carbon  monoxide  emission  (grams  per  mile)  of  a 
certain  type  of  passenger  car  (cruising  at  55  mph):  17.3, 
17.8,  18.0,  17.7,  18.2,  17.4,  17.6,  18.1 


15.  Mean  energy  (keV)  of  delayed  neutron  group  (Group  3, 
half-life  6.2  s)  for  uranium  U235  fission:  a sample  of 
100  values  with  mean  442.5  and  variance  9.3 


MEAN  (VARIANCE  UNKNOWN) 

7.  Find  a 95%  confidence  interval  for  the  percentage  of 
cars  on  a certain  highway  that  have  poorly  adjusted 
brakes,  using  a random  sample  of  800  cars  stopped  at 
a roadblock  on  that  highway,  126  of  which  had  poorly 
adjusted  brakes. 

8.  K.  Pearson  result.  Find  a 99%  confidence  interval  for 
p in  the  binomial  distribution  from  a classical  result  by 
K.  Pearson,  who  in  24,000  trials  of  tossing  a coin  obtained 
12,012  Heads.  Do  you  think  that  the  coin  was  fair? 

Find  a 99%  confidence  interval  for  the  mean  of 
a normal  population  from  the  sample: 

9.  Copper  content  (%)  of  brass  66,  66,  65,  64,  66,  67,  64, 
65,  63,  64 

10.  Melting  point  (°C)  of  aluminum  660,  667,  654,  663,  662 

11.  Knoop  hardness  of  diamond  9500,  9800,  9750,  9200, 
9400,  9550 


16.  Ultimate  tensile  strength  (k  psi)  of  alloy  steel 
(Maraging  H)  at  room  temperature:  251,  255,  258,  253, 
253,  252,  250,  252,  255,  256 

17.  The  sample  in  Prob.  9 

18.  If  Xi  and  X2  are  independent  normal  random  variables 
with  mean  14  and  8 and  variance  2 and  5,  respectively, 
what  distribution  does  3 X1  — X2  have?  Hint.  Use  Team 
Project  14(g)  in  Sec.  24.8. 

19.  A machine  fills  boxes  weighing  Y lb  with  X lb  of  salt, 
where  X and  Y are  normal  with  mean  100  lb  and  5 lb 
and  standard  deviation  1 lb  and  0.5  lb,  respectively. 
What  percent  of  filled  boxes  weighing  between  104  lb 
and  106  lb  are  to  be  expected? 

20.  If  the  weight  X of  bags  of  cement  is  normally 
distributed  with  a mean  of  40  kg  and  a standard 
deviation  of  2 kg,  how  many  bags  can  a delivery  truck 
carry  so  that  the  probability  of  the  total  load  exceeding 
2000  kg  will  be  5%? 


Testing  of  Hypotheses.  Decisions 

The  ideas  of  confidence  intervals  and  of  tests2  are  the  two  most  important  ideas  in  modern 
statistics.  In  a statistical  test  we  make  inference  from  sample  to  population  through  testing  a 
hypothesis,  resulting  from  experience  or  observations,  from  a theory  or  a quality  requirement, 
and  so  on.  In  many  cases  the  result  of  a test  is  used  as  a basis  for  a decision,  for  instance,  to 


2Beginning  around  1930,  a systematic  theory  of  tests  was  developed  by  NEYMAN  (see  Sec.  25.3)  and  EGON 
SHARPE  PEARSON  (1895-1980),  English  statistician,  the  son  of  Karl  Pearson  (see  the  footnote  on  p.  1086). 
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buy  (or  not  to  buy)  a certain  model  of  car,  depending  on  a test  of  the  fuel  efficiency  (miles/gal) 
(and  other  tests,  of  course),  to  apply  some  medication,  depending  on  a test  of  its  effect;  to 
proceed  with  a marketing  strategy,  depending  on  a test  of  consumer  reactions,  etc. 

Let  us  explain  such  a test  in  terms  of  a typical  example  and  introduce  the  corresponding 
standard  notions  of  statistical  testing. 

Test  of  a Hypothesis.  Alternative.  Significance  Level  a 

We  want  to  buy  100  coils  of  a certain  kind  of  wire,  provided  we  can  verify  the  manufacturer’s  claim  that  the 
wire  has  a breaking  limit  p = po  = 200  lb  (or  more).  This  is  a test  of  the  hypothesis  (also  called  null  hypothesis) 
p = po  = 200.  We  shall  not  buy  the  wire  if  the  (statistical)  test  shows  that  actually  p = pi  < po,  the  wire  is 
weaker,  the  claim  does  not  hold,  pi  is  called  the  alternative  (or  alternative  hypothesis)  of  the  test.  We  shall 
accept  the  hypothesis  if  the  test  suggests  that  it  is  true,  except  for  a small  error  probability  a,  called  the 
significance  level  of  the  test.  Otherwise  we  reject  the  hypothesis.  Hence  a is  the  probability  of  rejecting  a 
hypothesis  although  it  is  true.  The  choice  of  a is  up  to  us.  5%  and  1%  are  popular  values. 

For  the  test  we  need  a sample.  We  randomly  select  25  coils  of  the  wire,  cut  a piece  from  each  coil,  and 
determine  the  breaking  limit  experimentally.  Suppose  that  this  sample  of  n = 25  values  of  the  breaking  limit 
has  the  mean  x = 197  lb  (somewhat  less  than  the  claim!)  and  the  standard  deviation  s = 6 lb. 

At  this  point  we  could  only  speculate  whether  this  difference  197  — 200  = — 3 is  due  to  randomness,  is  a 
chance  effect,  or  whether  it  is  significant,  due  to  the  actually  inferior  quality  of  the  wire.  To  continue  beyond 
speculation  requires  probability  theory,  as  follows. 

We  assume  that  the  breaking  limit  is  normally  distributed.  (This  assumption  could  be  tested  by  the  method 
in  Sec.  25.7.  Or  we  could  remember  the  central  limit  theorem  (Sec.  25.3)  and  take  a still  larger  sample.)  Then 

X — po 

T = 

S/X^n 

in  (1 1),  Sec.  25.3,  with  p = po  has  a ^-distribution  with  n — 1 degrees  of  freedom  (n  — 1 = 24  for  our  sample). 
Also  x = 197  and  s = 6 are  observed  values  of  X and  S to  be  used  later.  We  can  now  choose  a significance 
level,  say,  a = 5%.  From  Table  A9  in  App.  5 or  from  a CAS  we  then  obtain  a critical  value  c such  that 
P(T  c)  = a = 5%.  For  P(T  ^ c)  = 1 — a:  = 95%  the  table  gives  c = 1.71,  so  that  c = —c  = —1.71 
because  of  the  symmetry  of  the  distribution  (Fig.  532). 

We  now  reason  as  follows — this  is  the  crucial  idea  of  the  test.  If  the  hypothesis  is  true,  we  have  a chance 
of  only  a (=  5%)  that  we  observe  a value  t of  T (calculated  from  a sample)  that  will  fall  between  — and 
— 1.71.  Hence,  if  we  nevertheless  do  observe  such  a t,  we  assert  that  the  hypothesis  cannot  be  true  and  we  reject 
it.  Then  we  accept  the  alternative.  If,  however,  t c,  we  accept  the  hypothesis. 

A simple  calculation  finally  gives  t = (197  — 200)/(6/V25)  = —2.5  as  an  observed  value  of  T.  Since 
—2.5  < —1.71,  we  reject  the  hypothesis  (the  manufacturer’s  claim)  and  accept  the  alternative  p = pi  < 200, 
the  wire  seems  to  be  weaker  than  claimed. 


Fig.  532.  t-distribution  in  Example  1 

This  example  illustrates  the  steps  of  a test: 

1.  Formulate  the  hypothesis  9 = 9q  to  be  tested.  (0O  = l^o  in  the  example.) 

2.  Formulate  an  alternative  9 = 9\.  (9\  = pi  in  the  example.) 

3.  Choose  a significance  level  a (5%,  1%,  0.1%). 

4.  Use  a random  variable  0 = g(X i,  • • • , Xn ) whose  distribution  depends  on  the 
hypothesis  and  on  the  alternative,  and  this  distribution  is  known  in  both  cases.  Determine 
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a critical  value  c from  the  distribution  of  0,  assuming  the  hypothesis  to  be  true.  (In  the 
example,  0 = T,  and  c is,  obtained  from  P(T  = c)  = a.) 

5.  Use  a sample  X\,  - ■■  ,xn  to  determine  an  observed  value  9 = g(.ri,  • • • , xn)  of  0. 
( t in  the  example.) 

6.  Accept  or  reject  the  hypothesis,  depending  on  the  size  of  d relative  to  c.  (t  < c in 
the  example,  rejection  of  the  hypothesis.) 

Two  important  facts  require  further  discussion  and  careful  attention.  The  first  is  the 
choice  of  an  alternative.  In  the  example,  i±\  < /j.0,  but  other  applications  may  require 
Mi  > Mo  or  Mi  ^ Mo-  The  second  fact  has  to  do  with  errors.  We  know  that  a (the 
significance  level  of  the  test)  is  the  probability  of  rejecting  a true  hypothesis.  And  we 
shall  discuss  the  probability  /3  of  accepting  & false  hypothesis. 


One-Sided  and  Two-Sided  Alternatives  (Fig.  533) 

Let  9 be  an  unknown  parameter  in  a distribution,  and  suppose  that  we  want  to  test  the 
hypothesis  6 = do . Then  there  are  three  main  kinds  of  alternatives,  namely, 

(1)  0 > 0O 

(2)  9 <0O 

(3)  9 * do- 

ll) and  (2)  are  one-sided  alternatives,  and  (3)  is  a two-sided  alternative. 

We  call  rejection  region  (or  critical  region)  the  region  such  that  we  reject  the 
hypothesis  if  the  observed  value  in  the  test  falls  in  this  region.  In  (D  the  critical  c lies  to 
the  right  of  do  because  so  does  the  alternative.  Hence  the  rejection  region  extends  to 
the  right.  This  is  called  a right-sided  test.  In  (2)  the  critical  c lies  to  the  left  of  do  (as 
in  Example  1),  the  rejection  region  extends  to  the  left,  and  we  have  a left-sided  test 
(Fig.  533,  middle  part).  These  are  one-sided  tests.  In  (3)  we  have  two  rejection  regions. 
This  is  called  a two-sided  test  (Fig.  533,  lower  part). 
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Fig.  533.  Test  in  the  case  of  alternative  (1)  (upper  part  of  the  figure),  alternative 
(2)  (middle  part),  and  alternative  (3) 
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All  three  kinds  of  alternatives  occur  in  practical  problems.  For  example,  (1)  may  arise 
if  0O  is  the  maximum  tolerable  inaccuracy  of  a voltmeter  or  some  other  instrument. 
Alternative  (2)  may  occur  in  testing  strength  of  material,  as  in  Example  1 . Finally,  6q  in 
(3)  may  be  the  diameter  of  axle-shafts,  and  shafts  that  are  too  thin  or  too  thick  are  equally 
undesirable,  so  that  we  have  to  watch  for  deviations  in  both  directions. 

Errors  in  Tests 

Tests  always  involve  risks  of  making  false  decisions: 

(I)  Rejecting  a true  hypothesis  (Type  I error). 

a = Probability  of  making  a Type  I error. 

(II)  Accepting  a false  hypothesis  (Type  II  error). 

[3  = Probability  of  making  a Type  II  error. 

Clearly,  we  cannot  avoid  these  errors  because  no  absolutely  certain  conclusions  about 
populations  can  be  drawn  from  samples.  But  we  show  that  there  are  ways  and  means  of 
choosing  suitable  levels  of  risks,  that  is,  of  values  a and  (3 . The  choice  of  a depends  on  the 
nature  of  the  problem  (e.g.,  a small  risk  a = 1 % is  used  if  it  is  a matter  of  life  or  death). 

Let  us  discuss  this  systematically  for  a test  of  a hypothesis  0 = 60  against  an  alternative 
that  is  a single  number  for  simplicity.  We  let  6i  > 6q,  so  that  we  have  a right-sided 
test.  For  a left-sided  or  a two-sided  test  the  discussion  is  quite  similar. 

We  choose  a critical  c > 0O  (as  in  the  upper  part  of  Fig.  533,  by  methods  discussed 
below).  From  a given  sample  xi,  • ■ ■ , xn  we  then  compute  a value 


o = g(*i,  ■■•,*«.) 


with  a suitable  g (whose  choice  will  be  a main  point  of  our  further  discussion;  for  instance, 
take  g = (x  i + ■■■  + xn)/n  in  the  case  in  which  6 is  the  mean).  If  0 > c,  we  reject  the 
hypothesis.  If  0 c,  we  accept  it.  Here,  the  value  6 can  be  regarded  as  an  observed  value 
of  the  random  variable 

(4)  © = g(x i,  • • • , xj 

because  xj  may  be  regarded  as  an  observed  value  of  Xj,j  = 1,  • • ■ , n.  In  this  test  there  are 
two  possibilities  of  making  an  error,  as  follows. 

Type  I Error  (see  Table  25.4).  The  hypothesis  is  true  but  is  rejected  (hence  the 
alternative  is  accepted)  because  0 assumes  a value  6 > c.  Obviously,  the  probability  of 
making  such  an  error  equals 

(5)  P(0  > c)e=g0  = a. 

a is  called  the  significance  level  of  the  test,  as  mentioned  before. 

Type  II  Error  (see  Table  25.4).  The  hypothesis  is  false  but  is  accepted  because  0 
assumes  a value  0 c.  The  probability  of  making  such  an  error  is  denoted  by  /3;  thus 


(6) 


P{Q^c)e=ei  = (3. 


SEC.  25.4  Testing  of  Hypotheses.  Decisions 


1081 


EXAMPLE  2 


17  = 1 — [3  is  called  the  power  of  the  test.  Obviously,  the  power  17  is  the  probability  of 
avoiding  a Type  II  error. 


Table  25.-  Type  I and  Type  II  Errors  in  Testing  a Hypothesis 
0 — 0O  Against  an  Alternative  0=0, 


Unknown  Truth 

II 

O 

0 = 0! 

I 0 = 00 

True  decision 
P = 1 - a 

Type  II  error 
P = (3 

0 

0 

< 

0 = 0! 

Type  1 error 
P = a 

True  decision 
P = 1 - 13 

Formulas  (5)  and  (6)  show  that  both  a and  [3  depend  on  c,  and  we  would  like  to  choose 
c so  that  these  probabilities  of  making  errors  are  as  small  as  possible.  But  the  important 
Figure  534  shows  that  these  are  conflicting  requirements  because  to  let  a decrease  we  must 
shift  c to  the  right,  but  then  [3  increases.  In  practice  we  first  choose  a (5%,  sometimes  1 %), 
then  determine  c,  and  finally  compute  / 3 . If  [3  is  large  so  that  the  power  77  = 1 — f3  is  small, 
we  should  repeat  the  test,  choosing  a larger  sample,  for  reasons  that  will  appear  shortly. 


Fig.  534.  Illustration  of  Type  I and  II  errors  in  testing  a hypothesis 
0 = 0O  against  an  alternative  6 = 0,  (>  d0,  right-sided  test) 


If  the  alternative  is  not  a single  number  but  is  of  the  form  ( 1 )— (3),  then  [3  becomes  a 
function  of  0.  This  function  [3(6 ) is  called  the  operating  characteristic  (OC)  of  the  test 
and  its  curve  the  OC  curve.  Clearly,  in  this  case  17  = 1 — [3  also  depends  on  6.  This 
function  r](6)  is  called  the  power  function  of  the  test.  (Examples  will  follow.) 

Of  course,  from  a test  that  leads  to  the  acceptance  of  a certain  hypothesis  9q,  it  does 
not  follow  that  this  is  the  only  possible  hypothesis  or  the  best  possible  hypothesis.  Hence 
the  terms  “not  reject”  or  “fail  to  reject”  are  perhaps  better  than  the  term  “accept.” 

Test  for  fx  of  the  Normal  Distribution  with  Known  cr 

The  following  example  explains  the  three  kinds  of  hypotheses. 

Test  for  the  Mean  of  the  Normal  Distribution  with  Known  Variance 

Let  X be  a normal  random  variable  with  variance  cr2  = 9.  Using  a sample  of  size  n = 10  with  mean  x , test  the 
hypothesis  fj,  = fx0  = 24  against  the  three  kinds  of  alternatives,  namely. 


(a)  fj.  > no 


(b)  /x<  /x0 


(c)  fX  # fX0. 
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Solution.  We  choose  the  significance  level  a = 0.05.  An  estimate  of  the  mean  will  be  obtained  from 

X = l (*i  + ' ' ' + Xn). 

If  the  hypothesis  is  true,  X is  normal  with  mean  fj.  = 24  and  variance  c2/ n = 0.9,  see  Theorem  1,  Sec.  25.3. 
Hence  we  may  obtain  the  critical  value  c from  Table  A8  in  App.  5. 

Case  (a).  Right-Sided  Test.  We  determine  c from  P(X  > c)f, = 24  = a = 0.05,  that  is, 

fc  - 24  \ 

P(X  § c)„_24  = = 1 - “ = 0.95. 

Table  A8  in  App.  5 gives  (c  — 24)/  V0. 9 = 1.645,  and  c = 25.56,  which  is  greater  than  fj.0,  as  in  the  upper 
part  of  Fig.  533.  If  v £ 25.56,  the  hypothesis  is  accepted.  If  x > 25.56,  it  is  rejected.  The  power  function  of  the 
test  is  (Fig.  535) 


1.0 

0.8 

0.6 

0.4 

0.2 


20  22  iio  26  28  /1 

Fig.  535.  Power  function  in  Example  2,  case  (a)  (dashed)  and  case  (c) 


r lip)  = P(X  > 25.56),,  = 1 -P(X?k  25.56)„ 


(7) 


= !-<!> 


25.56  - n 

V0.9 


|=1-  0>(26.94  - 1.05/i) 
Case  (b).  Left-Sided  Test.  The  critical  value  c is  obtained  from  the  equation 
P(XScC.2i  = 


c - 24\ 

— = ct  = 

V(I9  / 


0.05. 


Table  A8  in  App.  5 yields  c = 24  — 1.56  = 22.44.  If  x £ 22.44,  we  accept  the  hypothesis.  1 1 v < 22.44,  we 
reject  it.  The  power  function  of  the  test  is 


(8) 


VW  = P(X  S 22.44)„  = 


<J> 


/ 22.44  - n \ 

V V(h9  / 


= $(23.65  - 1.05/r). 


Case  (c).  Two-Sided  Test.  Since  the  normal  distribution  is  symmetric,  we  choose  c4  and  c2  equidistant  from 
p.  = 24,  say,  <?i  = 24  — k and  c2  = 24  + k , and  determine  k from 


P(24  - ISXS24  + % = 24 


= 1 - a = 0.95. 
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Table  A8  in  App.  5 gives  k/  V0.9  = 1.960,  hence  k = 1.86.  This  gives  the  values  Ci  = 24  — 1.86  = 22.14  and 
c2  = 24  + 1.86  = 25.86.  If  x is  not  smaller  than  c,  and  not  greater  than  c2,  we  accept  the  hypothesis.  Otherwise 
we  reject  it.  The  power  function  of  the  test  is  (Fig.  535) 

tj(a0  = P(X  < 22.14^  + p(X  > 25.86V  = p&  < 22-14V  + 1 - P(X  S 25.86V 


(9) 


= 1 + $ 


/ 22.14  - /i\ 

V V09  / 


- 4> 


25.86  - n 

V09 


= 1 + $(23.34  - 1.05/i)  - $(27.26-1.05/1). 


Consequently,  the  operating  characteristic  /3(/i)  = 1 — t/(/i)  (see  before)  is  (Fig.  536) 

/3(/i)  = $(27.26  - 1.05/i)  - $(23.34  - 1.05/i). 

If  we  take  a larger  sample,  say,  of  size  n = 100  (instead  of  10),  then  it2/ n = 0.09  (instead  of  0.9)  and  the 
critical  values  are  Ci  = 23.41  and  c2  = 24.59,  as  can  be  readily  verified.  Then  the  operating  characteristic  of 
the  test  is 


/3(/i)  = * 


/ 24.59  - /i\ 
V V(l09  / 


- $ 


/ 23.41  - /i\ 
V VO09  / 


= $(81.97  - 3.33/i)  - $(78.03  - 3.33 /i). 


Figure  536  shows  that  the  corresponding  OC  curve  is  steeper  than  that  for  n = 10.  This  means  that  the  increase 
of  n has  led  to  an  improvement  of  the  test.  In  any  practical  case,  n is  chosen  as  small  as  possible  but  so 
large  that  the  test  brings  out  deviations  between  /i  and  /i0  that  are  of  practical  interest.  For  instance,  if 
deviations  of  ±2  units  are  of  interest,  we  see  from  Fig.  536  that  n = 10  is  much  too  small  because  when 
/i  = 24  — 2 = 22  or  /i  = 24  + 2 = 26  /3  is  almost  50%.  On  the  other  hand,  we  see  that  n = 100  is  sufficient 
for  that  purpose. 
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Fig.  536.  Curves  of  the  operating  characteristic  (OC  curves)  in 
Example  2,  case  (c),  for  two  different  sample  sizes  n 
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Test  for  ju  When  cr  Is  Unknown,  and  for  cr 

EXAMPLE  3 Test  for  the  Mean  of  the  Normal  Distribution  with  Unknown  Variance 

The  tensile  strength  of  a sample  of  n = 16  manila  ropes  (diameter  3 in.)  was  measured.  The  sample  mean  was 
x = 4482  kg,  and  the  sample  standard  deviation  was  s = 115  kg  (N.  C.  Wiley,  41st  Annual  Meeting  of  the 
American  Society  for  Testing  Materials).  Assuming  that  the  tensile  strength  is  a normal  random  variable,  test 
the  hypothesis  /jlq  = 4500  kg  against  the  alternative  /jli  = 4400  kg.  Here  /jlq  may  be  a value  given  by  the 
manufacturer,  while  /jli  may  result  from  previous  experience. 


\n  = 10 


n = 100 


0 22  no  26  28 
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EXAMPLE  4 


EXAMPLE  5 


Solution.  We  choose  the  significance  level  a = 5%.  If  the  hypothesis  is  true,  it  follows  from  Theorem  2 
in  Sec.  25.3,  that  the  random  variable 


_ X - /jl0  _ X - 4500 
S/y/n  S/4 

has  a /-distribution  with  n — 1 = 15  d.f.  The  test  is  left-sided.  The  critical  value  c is  obtained  from 
P(T  < c)^  = a = 0.05.  Table  A9  in  App.  5 gives  c — —1.75.  As  an  observed  value  of  T we  obtain  from  the 
sample  t = (4482  — 4500)/ (1 15/4)  = —0.626.  We  see  that  t > c and  accept  the  hypothesis.  For  obtaining 
numeric  values  of  the  power  of  the  test,  we  would  need  tables  called  noncentral  Student  /-tables;  we  shall  not 
discuss  this  question  here. 

Test  for  the  Variance  of  the  Normal  Distribution 

Using  a sample  of  size  n = 15  and  sample  variance  s2  =13  from  a normal  population,  test  the  hypothesis 
<x2  = <x § = 10  against  the  alternative  a2  = cr2  = 20. 

Solution.  We  choose  the  significance  level  a = 5%.  If  the  hypothesis  is  true,  then 

r>2  o2 

Y = (n  - 1)—  = 14—  = 1.4 S2 
a20  10 

has  a chi-square  distribution  with  n — 1 = 14  d.f.  by  Theorem  3,  Sec.  25.3.  From 

P(Y>  c)  = a = 0.05,  that  is,  P(Y  ^ c)  = 0.95, 

and  Table  A10  in  App.  5 with  14  degrees  of  freedom  we  obtain  c = 23.68.  This  is  the  critical  value  of  Y.  Hence 
to  S'2  = ooY/(n  — 1)  = 0.7 14F  there  corresponds  the  critical  value  c*  = 0.714  • 23.68  = 16.91.  Since  s2  < c*, 
we  accept  the  hypothesis. 

If  the  alternative  is  true,  the  random  variable  Yi  = 14 S2/cr2  = 0.7 S2  has  a chi-square  distribution  with  14 
d.f.  Hence  our  test  has  the  power 

r)  = P(S2  > c*)^= 20  = P(Y1  > 0.7c*V  = 2o  = 1 - P(Y1  S 11.84)^=20. 

From  a more  extensive  table  of  the  chi-square  distribution  (e.g.  in  Ref.  [G3]  or  [G8])  or  from  your  CAS,  you 
see  that  17  ~ 62%.  Hence  the  Type  II  risk  is  very  large,  namely,  38%.  To  make  this  risk  smaller,  we  would 
have  to  increase  the  sample  size. 

Comparison  of  Means  and  Variances 

Comparison  of  the  Means  of  Two  Normal  Distributions 

Using  a sample  *1,  • • • , xUl  from  a normal  distribution  with  unknown  mean  ilx  and  a sample  yi,  • • • , from 
another  normal  distribution  with  unknown  mean  fiy,  we  want  to  test  the  hypothesis  that  the  means  are  equal, 
fjix  = fxy,  against  an  alternative,  say,  fxx  > fLy.  The  variances  need  not  be  known  but  are  assumed  to  be  equal.3 
Two  cases  of  comparing  means  are  of  practical  importance: 

Case  A.  The  samples  have  the  same  size.  Furthermore,  each  value  of  the  first  sample  corresponds  to  precisely 
one  value  of  the  other,  because  corresponding  values  result  from  the  same  person  or  thing  (paired  comparison) — 
for  example,  two  measurements  of  the  same  thing  by  two  different  methods  or  two  measurements  from  the  two 
eyes  of  the  same  person.  More  generally,  they  may  result  from  pairs  of  similar  individuals  or  things,  for  example, 
identical  twins,  pairs  of  used  front  tires  from  the  same  car,  etc.  Then  we  should  form  the  differences  of 
corresponding  values  and  test  the  hypothesis  that  the  population  corresponding  to  the  differences  has  mean  0, 
using  the  method  in  Example  3.  If  we  have  a choice,  this  method  is  better  than  the  following. 


3This  assumption  of  equality  of  variances  can  be  tested,  as  shown  in  the  next  example.  If  the  test  shows  that 
they  differ  significantly,  choose  two  samples  of  the  same  size  «i  = n2  — n (not  too  small,  > 30,  say),  use  the 
test  in  Example  2 together  with  the  fact  that  (12)  is  an  observed  value  of  an  approximately  standardized  normal 
random  variable. 
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Case  B.  The  two  samples  are  independent  and  not  necessarily  of  the  same  size.  Then  we  may  proceed 
as  follows.  Suppose  that  the  alternative  is  p.x  > fiy.  We  choose  a significance  level  a.  Then  we  compute  the 
sample  means  x and  y as  well  as  (n  ( — 1 )sx  and  (n2  — 1 )sy,  where  sx  and  sy  are  the  sample  variances.  Using 
Table  A9  in  App.  5 with  »i  + n2  ~ 2 degrees  of  freedom,  we  now  determine  c from 

(10)  P(T  Sc)=  | - a. 

We  finally  compute 


(11) 


to  ■ 


= /»i''2('ti  + rt 2 - 2) 


x-y 


Ui  + n2 


V(n1  - l)s|  + (n2  ~ 1 )Sy 


It  can  be  shown  that  this  is  an  observed  value  of  a random  variable  that  has  a /-distribution  with  «i  + ft 2 — 2 
degrees  of  freedom,  provided  the  hypothesis  is  true.  If  /q  = c,  the  hypothesis  is  accepted.  If  /q  > c,  it  is  rejected. 
If  the  alternative  is  px  A py,  then  (10)  must  be  replaced  by 

(10*)  P{T  ^ ci)  = 0.5a,  P(T  ^ c2)  = 1 0.5a. 

Note  that  for  samples  of  equal  size  «i  = «2  = n->  formula  (11)  reduces  to 


(12) 


t0=  Vn 


V4 


To  illustrate  the  computations,  let  us  consider  the  two  samples  (xi,  • • • , xUl ) and  (yi,  • • • , ynz)  given  by 


105 

108 

86 

103 

103 

107 

124 

105 

89 

92 

84 

97 

103 

107 

111 

97 

showing  the  relative  output  of  tin  plate  workers  under  two  different  working  conditions  [J.  J.  B.  Worth,  Journal 
of  Industrial  Engineering  9,  249-253).  Assuming  that  the  corresponding  populations  are  normal  and  have  the 
same  variance,  let  us  test  the  hypothesis  px  = py  against  the  alternative  /jlx  A py . (Equality  of  variances  will 
be  tested  in  the  next  example.) 

Solution.  We  find 

x = 105.125,  v = 97.500,  si  = 106.125.  s%  = B4.000. 

We  choose  the  significance  level  a = 5%.  From  (10*)  with  0.5a  = 2.5%,  1 — 0.5a  = 97.5%  and  Table  A9 
in  App.  5 with  14  degrees  of  freedom  we  obtain  = —2.14  and  c2  = 2.14.  Formula  (12)  with  n = 8 gives  the 
value 


to  = V8  • 7.625/V190.125  = 1.56. 

Since  ^ /q  = c2,  we  accept  the  hypothesis  p,x  = py  that  under  both  conditions  the  mean  output  is  the  same. 

Case  A applies  to  the  example  because  the  two  first  sample  values  correspond  to  a certain  type  of  work,  the 
next  two  were  obtained  in  another  kind  of  work,  etc.  So  we  may  use  the  differences 

16  16  2 6 0 0 13  8 

of  corresponding  sample  values  and  the  method  in  Example  3 to  test  the  hypothesis  p = 0,  where  p is  the  mean 
of  the  population  corresponding  to  the  differences.  As  a logical  alternative  we  take  p =£  0.  The  sample  mean  is 
d = 7.625,  and  the  sample  variance  is  s2  = 45.696.  Hence 

/ = V8  (7.625  - 0)/V4^696  = 3.19. 

From  P(T  ^ Ci)  = 2.5%,  P{T  ^ c2)  = 97.5%  and  Table  A9  in  App.  5 with  n — 1 = 7 degrees  of  freedom  we 
obtain  c\  = —2.36,  c2  = 2.36  and  reject  the  hypothesis  because  t = 3.19  does  not  lie  between  C\  and  c2.  Hence 
our  present  test,  in  which  we  used  more  information  (but  the  same  samples),  shows  that  the  difference  in  output 
is  significant. 
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EXAMPLE  6 Comparison  of  the  Variance  of  Two  Normal  Distributions 

Using  the  two  samples  in  the  last  example,  test  the  hypothesis  a%  = dy\  assume  that  the  corresponding 
populations  are  normal  and  the  nature  of  the  experiment  suggests  the  alternative  a*  > ay. 

Solution.  We  find  s \ = 106.125,  Sy  = 84.000.  We  choose  the  significance  level  a = 5%.  Using 
P(V  = c)  = 1 — a = 95%  and  Table  All  in  App.  5,  with  {n\  — 1,  n 2 — 1)  = (7,  7)  degrees  of  freedom,  we 
determine  c = 3.79.  We  finally  compute  Vq  = s%/sy  = 1.26.  Since  Uq  = c->  we  accept  the  hypothesis.  If  Vq  < c, 
we  would  reject  it. 

This  test  is  justified  by  the  fact  that  vq  is  an  observed  value  of  a random  variable  that  has  a so-called 
F-distribution  with  (n  1 — l,  n 2 ~ 1)  degrees  of  freedom,  provided  the  hypothesis  is  true.  (Proof  in  Ref.  [G3] 
listed  in  App.  1.)  The  F-distribution  with  (m,  n)  degrees  of  freedom  was  introduced  by  R.  A.  Fisher4  and  has 
the  distribution  function  F(z)  = 0 if  z < 0 and 


(13)  F(z)  = Kmn  [ r<**-*VV  + n)-(m+n)/2  dt  (z  S 0), 

where  Kmn  = m ?n,i  ‘Tfj  m + gn)/r(|m)r(|n).  (For  T see  App.  A3.1.) 

This  long  section  contained  the  basic  ideas  and  concepts  of  testing,  along  with  typical 
applications  and  you  may  perhaps  want  to  review  it  quickly  before  going  on,  because  the 
next  sections  concern  an  adaptation  of  these  ideas  to  tasks  of  great  practical  importance 
and  resulting  tests  in  connection  with  quality  control,  acceptance  (or  rejection)  of  goods 
produced,  and  so  on. 


P HFO  B L E-M -S  E T -2  5^4 


1.  From  memory:  Make  a list  of  the  three  types  of 
alternatives,  each  with  a typical  example  of  your  own. 

2.  Make  a list  of  methods  in  this  section,  each  with  the 
distribution  needed  in  testing. 

3.  Test  /x  = 0 against  /jl  > 0,  assuming  normality  and 
using  the  sample  0,  1,  — 1,3,  —8,  6,  1 (deviations  of  the 
azimuth  [multiples  of  0.01  radian]  in  some  revolution 
of  a satellite).  Choose  a = 5%. 

4.  In  one  of  his  classical  experiments  Buffon  obtained  2048 
heads  in  tossing  a coin  4040  times.  Was  the  coin  fair? 

5.  Do  the  same  test  as  in  Prob.  4,  using  a result  by  K. 
Pearson,  who  obtained  6019  heads  in  12,000  trials. 

6.  Assuming  normality  and  known  variance  a1 2 3 4 5 6 7  = 9, 
test  the  hypothesis  /x  = 60.0  against  the  alternative 
/x  = 57.0  using  a sample  of  size  20  with  mean  x — 58.50 
and  choosing  a = 5%. 

7.  How  does  the  result  in  Prob.  6 change  if  we  use  a small- 
er sample,  say,  of  size  5,  the  other  data  ( x = 58.05, 
a = 5%,  etc.)  remaining  as  before? 


8.  Determine  the  power  of  the  test  in  Prob.  6. 

9.  What  is  the  rejection  region  in  Prob.  6 in  the  case  of  a 
two-sided  test  with  a — 5%? 

10.  CAS  EXPERIMENT.  Tests  of  Means  and  Variances. 

(a)  Obtain  100  samples  of  size  10  each  from  the  normal 
distribution  with  mean  100  and  variance  25.  For  each 
sample,  test  the  hypothesis  /x0  = 100  against  the 
alternative  /xi  > 100  at  the  level  of  a = 10%.  Record 
the  number  of  rejections  of  the  hypothesis.  Do  the  whole 
experiment  once  more  and  compare. 

(b)  Set  up  a similar  experiment  for  the  variance  of  a 
normal  distribution  and  perform  it  100  times. 

11.  A firm  sells  oil  in  cans  containing  5000  g oil  per  can 
and  is  interested  to  know  whether  the  mean  weight 
differs  significantly  from  5000  g at  the  5%  level,  in 
which  case  the  filling  machine  has  to  be  adjusted.  Set 
up  a hypothesis  and  an  alternative  and  perform  the  test, 
assuming  normality  and  using  a sample  of  50  fillings 
with  mean  4990  g and  standard  deviation  20  g. 


4After  the  pioneering  work  of  the  English  statistician  and  biologist,  KARL  PEARSON  (1857-1936),  the 
founder  of  the  English  school  of  statistics,  and  WILLIAM  SEALY  GOSSET  (1876-1937),  who  discovered  the 
/-distribution  (and  published  under  the  name  “Student”),  the  English  statistician  Sir  RONALD  AYLMER 
FISHER  (1890-1962),  professor  of  eugenics  in  London  (1933-1943)  and  professor  of  genetics  in  Cambridge, 
England  (1943-1957)  and  Adelaide,  Australia  (1957-1962),  had  great  influence  on  the  further  development  of 
modern  statistics. 
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12.  If  a sample  of  25  tires  of  a certain  kind  has  a mean  life 
of  37,000  miles  and  a standard  deviation  of  5000  miles, 
can  the  manufacturer  claim  that  the  true  mean  life  of 
such  tires  is  greater  than  35,000  miles?  Set  up  and  test 
a corresponding  hypothesis  at  the  5 % level,  assuming 
normality. 

13.  If  simultaneous  measurements  of  electric  voltage  by 
two  different  types  of  voltmeter  yield  the  differences 
(in  volts)  0.4,  —0.6,  0.2,  0.0,  1.0,  1.4,  0.4,  1.6,  can  we 
assert  at  the  5%  level  that  there  is  no  significant 
difference  in  the  calibration  of  the  two  types  of 
instruments?  Assume  normality. 

14.  If  a standard  medication  cures  about  75%  of  patients 
with  a certain  disease  and  a new  medication  cured  310 
of  the  first  400  patients  on  whom  it  was  tried,  can  we 
conclude  that  the  new  medication  is  better?  Choose 
a = 5%.  First  guess.  Then  calculate. 

15.  Suppose  that  in  the  past  the  standard  deviation  of 
weights  of  certain  100.0-oz  packages  filled  by  a 
machine  was  0.8  oz.  Test  the  hypothesis  H0:  a = 0.8 
against  the  alternative  H±:  it  > 0.8  (an  undesirable 
increase),  using  a sample  of  20  packages  with  standard 
deviation  1.0  oz  and  assuming  normality.  Choose 
a = 5%. 

16.  Suppose  that  in  operating  battery-powered  electrical 
equipment,  it  is  less  expensive  to  replace  all  batter- 
ies at  fixed  intervals  than  to  replace  each  battery 
individually  when  it  breaks  down,  provided  the 
standard  deviation  of  the  lifetime  is  less  than  a certain 


25.5  Quality  Control 

The  ideas  on  testing  can  be  adapted  and  extended  in  various  ways  to  serve  basic  practical 
needs  in  engineering  and  other  fields.  We  show  this  in  the  remaining  sections  for  some 
of  the  most  important  tasks  solvable  by  statistical  methods.  As  a first  such  area  of  problems, 
we  discuss  industrial  quality  control,  a highly  successful  method  used  in  various  industries. 

No  production  process  is  so  perfect  that  all  the  products  are  completely  alike.  There  is 
always  a small  variation  that  is  caused  by  a great  number  of  small,  uncontrollable  factors 
and  must  therefore  be  regarded  as  a chance  variation.  It  is  important  to  make  sure  that  the 
products  have  required  values  (for  example,  length,  strength,  or  whatever  property  may 
be  essential  in  a particular  case).  For  this  purpose  one  makes  a test  of  the  hypothesis  that 
the  products  have  the  required  property,  say,  /jl  = /x0,  where  /z0  is  a required  value.  If 
this  is  done  after  an  entire  lot  has  been  produced  (for  example,  a lot  of  100,000  screws), 
the  test  will  tell  us  how  good  or  how  bad  the  products  are,  but  it  it  obviously  too  late  to 
alter  undesirable  results.  It  is  much  better  to  test  during  the  production  run.  This  is  done 
at  regular  intervals  of  time  (for  example,  every  hour  or  half-hour)  and  is  called  quality 
control.  Each  time  a sample  of  the  same  size  is  taken,  in  practice  3 to  10  times.  If  the 
hypothesis  is  rejected,  we  stop  the  production  and  look  for  the  cause  of  the  trouble. 


limit,  say,  less  than  5 hours.  Set  up  and  apply  a suitable 
test,  using  a sample  of  28  values  of  lifetimes  with 
standard  deviation  s = 3.5  hours  and  assuming 
normality:  choose  a = 5%. 

17.  Brand  A gasoline  was  used  in  16  similar  automobiles 
under  identical  conditions.  The  corresponding  sample 
of  16  values  (miles  per  gallon)  had  mean  19.6  and 
standard  deviation  0.4.  Under  the  same  conditions, 
high-power  brand  B gasoline  gave  a sample  of  16 
values  with  mean  20.2  and  standard  deviation  0.6.  Is 
the  mileage  of  B significantly  better  than  that  of  A? 
Test  at  the  5%  level;  assume  normality.  First  guess. 
Then  calculate. 

18.  The  two  samples  70,  80,  30,  70,  60,  80  and  140,  120, 
130,  120,  120,  130,  120  are  values  of  the  differences  of 
temperatures  (°C)  of  iron  at  two  stages  of  casting,  taken 
from  two  different  crucibles.  Is  the  variance  of  the  first 
population  larger  than  that  of  the  second?  Assume 
normality.  Choose  a = 5%. 

19.  Show  that  for  a normal  distribution  the  two  types  of 
errors  in  a test  of  a hypothesis  F/0:  /x  = /x0  against  an 
alternative  7/ 1 : /x  = /jl1  can  be  made  as  small  as  one 
pleases  (not  zero!)  by  taking  the  sample  sufficiently 
large. 

20.  Test  for  equality  of  population  means  against  the 
alternative  that  the  means  are  different  assuming 
normality,  choosing  a = 5%  and  using  two  samples  of 
sizes  12  and  18,  with  mean  10  and  14,  respectively, 
and  equal  standard  deviation  3. 
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If  we  stop  the  production  process  even  though  it  is  progressing  properly,  we  make  a 
Type  I error.  If  we  do  not  stop  the  process  even  though  something  is  not  in  order,  we 
make  a Type  II  error  (see  Sec.  25.4).  The  result  of  each  test  is  marked  in  graphical  form 
on  what  is  called  a control  chart.  This  was  proposed  by  W.  A.  Shewhart  in  1924  and 
makes  quality  control  particularly  effective. 

Control  Chart  for  the  Mean 

An  illustration  and  example  of  a control  chart  is  given  in  the  upper  part  of  Fig.  537.  This 
control  chart  for  the  mean  shows  the  lower  control  limit  LCL,  the  center  control  line 
CL,  and  the  upper  control  limit  UCL.  The  two  control  limits  correspond  to  the  critical 
values  ci  and  C2  in  case  (c)  of  Example  2 in  Sec.  25.4.  As  soon  as  a sample  mean  falls 
outside  the  range  between  the  control  limits,  we  reject  the  hypothesis  and  assert  that  the 


Sample  no. 


J 

0.5% 

7 

\ 

i 

% 

7 

\ 

0.5% 

Sample  no.  5 10 


Fig.  537.  Control  charts  for  the  mean  (upper  part  of  figure)  and 
the  standard  deviation  in  the  case  of  the  samples  on  p.  1089 
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production  process  is  “out  of  control”;  that  is,  we  assert  that  there  has  been  a shift  in 
process  level.  Action  is  called  for  whenever  a point  exceeds  the  limits. 

If  we  choose  control  limits  that  are  too  loose,  we  shall  not  detect  process  shifts.  On  the 
other  hand,  if  we  choose  control  limits  that  are  too  tight,  we  shall  be  unable  to  run  the 
process  because  of  frequent  searches  for  nonexistent  trouble.  The  usual  significance  level 
is  a = 1%.  From  Theorem  1 in  Sec.  25.3  and  Table  A8  in  App.  5 we  see  that  in  the  case 
of  the  normal  distribution  the  corresponding  control  limits  for  the  mean  are 

(1)  LCL  = /i o - 2.58  UCL  = /jl0  + 2.58  . 

Vn  Vn 

Here  cr  is  assumed  to  be  known.  If  cr  is  unknown,  we  may  compute  the  standard  deviations 
of  the  first  20  or  30  samples  and  take  their  arithmetic  mean  as  an  approximation  of  cr. 
The  broken  line  connecting  the  means  in  Fig.  537  is  merely  to  display  the  results. 

Additional,  more  subtle  controls  are  often  used  in  industry.  For  instance,  one  observes 
the  motions  of  the  sample  means  above  and  below  the  centerline,  which  should  happen 
frequently.  Accordingly,  long  runs  (conventionally  of  length  7 or  more)  of  means  all  above 
(or  all  below)  the  centerline  could  indicate  trouble. 


Table  25.5  Twelve  Samples  of  Five  Values  Each 
(Diameter  of  Small  Cylinders,  Measured  in  Millimeters) 


Sample 

Number 

Sample  Values 

X 

5 

R 

1 

4.06 

4.08 

4.08 

4.08 

4.10 

4.080 

0.014 

0.04 

2 

4.10 

4.10 

4.12 

4.12 

4.12 

4.112 

0.011 

0.02 

3 

4.06 

4.06 

4.08 

4.10 

4.12 

4.084 

0.026 

0.06 

4 

4.06 

4.08 

4.08 

4.10 

4.12 

4.088 

0.023 

0.06 

5 

4.08 

4.10 

4.12 

4.12 

4.12 

4.108 

0.018 

0.04 

6 

4.08 

4.10 

4.10 

4.10 

4.12 

4.100 

0.014 

0.04 

7 

4.06 

4.08 

4.08 

4.10 

4.12 

4.088 

0.023 

0.06 

8 

4.08 

4.08 

4.10 

4.10 

4.12 

4.096 

0.017 

0.04 

9 

4.06 

4.08 

4.10 

4.12 

4.14 

4.100 

0.032 

0.08 

10 

4.06 

4.08 

4.10 

4.12 

4.16 

4.104 

0.038 

0.10 

11 

4.12 

4.14 

4.14 

4.14 

4.16 

4.140 

0.014 

0.04 

12 

4.14 

4.14 

4.16 

4.16 

4.16 

4.152 

0.011 

0.02 

Control  Chart  for  the  Variance 

In  addition  to  the  mean,  one  often  controls  the  variance,  the  standard  deviation,  or  the  range. 
To  set  up  a control  chart  for  the  variance  in  the  case  of  a normal  distribution,  we  may  employ 
the  method  in  Example  4 of  Sec.  25.4  for  determining  control  limits.  It  is  customary  to  use  only 
one  control  limit,  namely,  an  upper  control  limit.  Now  from  Example  4 of  Sec.  25.4  we  have 
.S'2  = cr^Y/in  — 1),  where,  because  of  our  normality  assumption,  the  random  variable  Y has  a 
chi-square  distribution  with  n — 1 degrees  of  freedom.  Hence  the  desired  control  limit  is 


n — 1 


(2) 


UCL  = 
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where  c is  obtained  from  the  equation 

P(Y  > c)  = a,  that  is,  P(Y  Si  c)  = 1 — a 

and  the  table  of  the  chi-square  distribution  (Table  A10  in  App.  5)  with  n — 1 degrees  of 
freedom  (or  from  your  CAS);  here  a (5%  or  1%,  say)  is  the  probability  that  in  a properly 
running  process  an  observed  value  s2  of  .S' 2 is  greater  than  the  upper  control  limit. 

If  we  wanted  a control  chart  for  the  variance  with  both  an  upper  control  limit  UCL  and 
a lower  control  limit  LCL,  these  limits  would  be 

<x2ci  cr2c2 

(3)  LCL  = and  UCL  = , 

n — 1 7i—l 

where  <q  and  c2  are  obtained  from  Table  A10  with  n — 1 d.f.  and  the  equations 

(4)  P(Y^Cl)  = | and  P(Y  g c2)  = 1 - | . 


Control  Chart  for  the  Standard  Deviation 

To  set  up  a control  chart  for  the  standard  deviation,  we  need  an  upper  control  limit 


(5) 


UCL 


crVc 

— 1 


obtained  from  (2).  For  example,  in  Table  25.5  we  have  n = 5.  Assuming  that  the 
corresponding  population  is  normal  with  standard  deviation  cr  = 0.02  and  choosing 
a = 1 % , we  obtain  from  the  equation 

P(Y  g c)  = 1 - a = 99% 

and  Table  A10  in  App.  5 with  4 degrees  of  freedom  the  critical  value  c = 13.28  and  from 
(5)  the  corresponding  value 


UCL 


0.02VLL28 

V4 


0.0365, 


which  is  shown  in  the  lower  part  of  Fig.  537. 

A control  chart  for  the  standard  deviation  with  both  an  upper  and  a lower  control  limit 
is  obtained  from  (3). 


Control  Chart  for  the  Range 

Instead  of  the  variance  or  standard  deviation,  one  often  controls  the  range  R (=  largest 
sample  value  minus  smallest  sample  value).  It  can  be  shown  that  in  the  case  of  the  normal 
distribution,  the  standard  deviation  <x  is  proportional  to  the  expectation  of  the  random 
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variable  R*  for  which  R is  an  observed  value,  say,  a = \nE(R*)  where  the  factor  of 
proportionality  An  depends  on  the  sample  size  n and  has  the  values 


n 23456789  10 

\n  = cr/E{R*)  0.89  0.59  0.49  0.43  0.40  0.37  0.35  0.34  0.32 

n 12  14  16  18  20  30  40  50 

\n  = (t/E(R*)  0.31  0.29  0.28  0.28  0.27  0.25  0.23  0.22 


Since  R depends  on  two  sample  values  only,  it  gives  less  information  about  a sample 
than  .v  does.  Clearly,  the  larger  the  sample  size  n is,  the  more  information  we  lose  in  using 
R instead  of  s.  A practical  rule  is  to  use  s when  n is  larger  than  10. 


PRrQB^LEM^=SET— 2 5^5 


1.  Suppose  a machine  for  filling  cans  with  lubricating 
oil  is  set  so  that  it  will  generate  fillings  which  form 
a normal  population  with  mean  1 gal  and  standard 
deviation  0.02  gal.  Set  up  a control  chart  of  the 
type  shown  in  Fig.  537  for  controlling  the  mean,  that 
is,  find  LCL  and  UCL,  assuming  that  the  sample  size 
is  4. 

2.  Three-sigma  control  chart.  Show  that  in  Prob.  1,  the 
requirement  of  the  significance  level  a = 0.3%  leads 
to  LCL  = fjL  — 'icr/  Vrc  and  UCL  = /jl  + 3cr/  Vn,  and 
find  the  corresponding  numeric  values. 

3.  What  sample  size  should  we  choose  in  Prob.  1 if  we 
want  LCL  and  UCL  somewhat  closer  together,  say, 
UCL  — LCL  = 0.02,  without  changing  the  signifi- 
cance level? 

4.  What  effect  on  UCL  — LCL  does  it  have  if  we  double 
the  sample  size?  If  we  switch  from  a = 1%  to 
a = 5%? 

5.  How  should  we  change  the  sample  size  in  controlling 
the  mean  of  a normal  population  if  we  want 
UCL  — LCL  to  decrease  to  half  its  original  value? 

6.  Graph  the  means  of  the  following  10  samples 
(thickness  of  gaskets,  coded  values)  on  a control  chart 
for  means,  assuming  that  the  population  is  normal  with 
mean  5 and  standard  deviation  1.16. 

Time  _10:00  11:00  12:00  13:00 

5 7 7 4 

Sample  2 5 3 4 

values  5 4 6 3 

6 4 5 6 


7.  Graph  the  ranges  of  the  samples  in  Prob.  6 on  a control 
chart  for  ranges. 

8.  Graph  \n  = cr/E{R*)  as  a function  of  n.  Why  is  \n  a 
monotone  decreasing  function  of  rf! 

9.  Eight  samples  of  size  2 were  taken  from  a lot  of  screws. 
The  values  (length  in  inches)  are 

Sample  No.  1 2 3 4 5 6 7 8 

, 3.50  3.51  3.49  3.52  3.53  3.49  3.48  3.52 

Length 

3.51  3.48  3.50  3.50  3.49  3.50  3.47  3.49 

Assuming  that  the  population  is  normal  with  mean 
3.500  and  variance  0.0004  and  using  (1),  set  up  a 
control  chart  for  the  mean  and  graph  the  sample  means 
on  the  chart. 

10.  Attribute  control  charts.  Fifteen  samples  of  size  100 
were  taken  from  a production  of  containers.  The 
numbers  of  defectives  (leaking  containers)  in  those 
samples  (in  the  order  observed)  were 

145497056  13  021  12  8 

From  previous  experience  it  was  known  that  the 
average  fraction  defective  is  p = 4%  provided  that 
the  process  of  production  is  running  properly.  Using 
the  binomial  distribution,  set  up  a fraction  defective 
chart  (also  called  a /7-chart),  that  is,  choose  the 

14:00  15:00  16:00  _L7:00  J8:00  19:00 

5 6 5 5 3 3 

6 4 5 2 4 6 

4 6 6 5 8 6 

6 4 4 3 
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LCL  = 0 and  determine  the  UCL  for  the  fraction 
defective  (in  percent)  by  the  use  of  3-sigma  limits, 
where  cr2  is  the  variance  of  the  random  variable 
X = Fraction  defective  in  a sample  of  size  100. 

Is  the  process  under  control? 

11.  Number  of  defectives.  Find  formulas  for  the  UCL,  CL, 
and  LCL  (corresponding  to  3cr-limits)  in  the  case  of  a 
control  chart  for  the  number  of  defectives,  assuming 
that,  in  a state  of  statistical  control,  the  fraction  of 
defectives  is  p. 

12.  CAS  PROJECT.  Control  Charts,  (a)  Obtain  100 
samples  of  4 values  each  from  the  normal  distribution 
with  mean  8.0  and  variance  0.16  and  their  means, 
variances,  and  ranges. 

(b)  Use  these  samples  for  making  up  a control  chart 
for  the  mean. 

(c)  Use  them  on  a control  chart  for  the  standard 
deviation. 

(d)  Make  up  a control  chart  for  the  range. 

(e)  Describe  quantitative  properties  of  the  samples 
that  you  can  see  from  those  charts  (e.g.,  whether  the 


corresponding  process  is  under  control,  whether  the 
quantities  observed  vary  randomly,  etc.). 

13.  Since  the  presence  of  a point  outside  control  limits  for 
the  mean  indicates  trouble,  how  often  would  we  be 
making  the  mistake  of  looking  for  nonexistent  trouble 
if  we  used  (a)  1 -sigma  limits,  (b)  2-sigma  limits? 
Assume  normality. 

14.  What  LCL  and  UCL  should  we  use  instead  of  (1)  if, 
instead  of  x,  we  use  the  sum  X\  + • • • + xn  of  the 
sample  values?  Determine  these  limits  in  the  case  of 
Fig.  537. 

15.  Number  of  defects  per  unit.  A so-called  c-chart  or 
defects-per-unit  chart  is  used  for  the  control  of  the 
number  X of  defects  per  unit  (for  instance,  the  number 
of  defects  per  100  meters  of  paper,  the  number  of 
missing  rivets  in  an  airplane  wing,  etc.),  (a)  Set  up 
formulas  for  CL  and  LCL,  UCL  corresponding  to 
p ± 3cr,  assuming  that  X has  a Poisson  distribution, 
(b)  Compute  CL,  LCL,  and  UCL  in  a control  process 
of  the  number  of  imperfections  in  sheet  glass;  assume 
that  this  number  is  3.6  per  sheet  on  the  average  when 
the  process  is  in  control. 


25.6  Acceptance  Sampling 

Acceptance  sampling  is  usually  done  when  products  leave  the  factory  (or  in  some  cases 
even  within  the  factory).  The  standard  situation  in  acceptance  sampling  is  that  a producer 
supplies  to  a consumer  (a  buyer  or  wholesaler)  a lot  of  N items  (a  carton  of  screws,  for 
instance).  The  decision  to  accept  or  reject  the  lot  is  made  by  determining  the  number  x 
of  defectives  (=  defective  items)  in  a sample  of  size  n from  the  lot.  The  lot  is  accepted 
if  x c,  where  c is  called  the  acceptance  number,  giving  the  allowable  number  of 
defectives.  If  x > c,  the  consumer  rejects  the  lot.  Clearly,  producer  and  consumer  must 
agree  on  a certain  sampling  plan  giving  n and  c. 

From  the  hypergeometric  distribution  we  see  that  the  event  A:  “Accept  the  lot”  has 
probability  (see  Sec.  24.7) 


(1) 


P(A)  = P(X  S c)  = 2 


x=0 


M 


N - M\  UN 
n — x J/  \n 


where  M is  the  number  of  defectives  in  a lot  of  N items.  In  terms  of  the  fraction  defective 
6 = M/N  we  can  write  (1)  as 


P{A\  6)  can  assume  n + 1 values  corresponding  to  9 = 0,  I / N,  2/N,  • • ■ , N/N;  here,  n and 
c are  fixed.  A monotone  smooth  curve  through  these  points  is  called  the  operating 
characteristic  curve  (OC  curve)  of  the  sampling  plan  considered. 
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EXAMPLE  1 


EXAMPLE  2 


Sampling  Plan 

Suppose  that  certain  tool  bits  are  packaged  20  to  a box,  and  the  following  sampling  plan  is  used.  A sample  of 
two  tool  bits  is  drawn,  and  the  corresponding  box  is  accepted  if  and  only  if  both  bits  in  the  sample  are  good. 
In  this  case,  N = 20,  n = 2,  c = 0,  and  (2)  takes  the  form  (a  factor  2 drops  out) 


P(A-  0) 


(20  - 20  0)(19  - 20  0) 
380 


The  values  of  P(A,  8)  for  0 = 0,  1/20,  2/20,  • • • , 20/20  and  the  resulting  OC  curve  are  shown  in  Fig.  538. 
(Verify!) 


P(A-8)  0.5- 


0 I I I I I I I I I I 

0 0.5  1 

0 


P(A-  0) 


\ 


0 1 1 -1- 

0 0.2 


Fig.  538.  OC  curve  of  the  sampling  plan  with  n = 2 Fig.  539.  OC  curve  in  Example  2 

and  c = 0 for  lots  of  size  N = 20 


In  most  practical  cases  0 will  be  small  (less  than  10%).  Then  if  we  take  small  samples 
compared  to  N,  we  can  approximate  (2)  by  the  Poisson  distribution  (Sec.  24.7);  thus 


(3) 


P(A;0)~e-^  fJ- 
„ x\ 


(/z  = nO). 


Sampling  Plan.  Poisson  Distribution 

Suppose  that  for  large  lots  the  following  sampling  plan  is  used.  A sample  of  size  n = 20  is  taken.  If  it  contains 
not  more  than  one  defective,  the  lot  is  accepted.  If  the  sample  contains  two  or  more  defectives,  the  lot  is  rejected. 
In  this  plan,  we  obtain  from  (3) 

P(A;d)~  e~20e(  1 + 20  0), 

The  corresponding  OC  curve  is  shown  in  Fig.  539. 

Errors  in  Acceptance  Sampling 

We  show  how  acceptance  sampling  fits  into  general  test  theory  (Sec.  25.4)  and  what  this 
means  from  a practical  point  of  view.  The  producer  wants  the  probability  a of  rejecting 
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P(A-  6) 


Good  ] Indifference  ] Poor 
material  \ zone  \ material 


Fig.  540.  OC  curve,  producer’s  and  consumer’s  risks 


an  acceptable  lot  (a  lot  for  which  0 does  not  exceed  a certain  number  0 o on  which  the 
two  parties  agree)  to  be  small.  Oq  is  called  the  acceptable  quality  level  (AQL).  Similarly, 
the  consumer  (the  buyer)  wants  the  probability  (3  of  accepting  an  unacceptable  lot  (a  lot 
for  which  0 is  greater  than  or  equal  to  some  0 1 ) to  be  small.  0\  is  called  the  lot  tolerance 
percent  defective  (LTPD)  or  the  rejectable  quality  level  (RQL).  a is  called  producer’s 
risk.  It  corresponds  to  a Type  I error  in  Sec.  25.4.  fi  is  called  consumer’s  risk  and 
corresponds  to  a Type  II  error.  Figure  540  shows  an  example.  We  see  that  the  points 
(Do.  1 — a)  and  (0| . fi)  lie  on  the  OC  curve.  It  can  be  shown  that  for  large  lots  we  can 
choose  9o,  0i  (>  Oq),  a,  fi  and  then  determine  n and  c such  that  the  OC  curve  runs  very 
close  to  those  prescribed  points.  Table  25.6  shows  the  analogy  between  acceptance 
sampling  and  hypothesis  testing  in  Sec.  25.4. 


Table  25.6  Acceptance  Sampling  and  Hypothesis  Testing 


Acceptance  Sampling 

Hypothesis  Testing 

Acceptable  quality  level  (AQL)  9 = 0O 
Lot  tolerance  percent  defectives  (LTPD) 
9 = 0i 

Allowable  number  of  defectives  c 
Producer’s  risk  a of  rejecting  a lot 
with  0 § 0O 

Consumer’s  risk  f3  of  accepting  a lot 
with  0 g 0! 

Hypothesis  0 = 0O 
Alternative  0 = 0! 

Critical  value  c 

Probability  a of  making  a Type  I error 
(significance  level) 

Probability  (3  of  making  a Type  II  error 

Rectification 

Rectification  of  a rejected  lot  means  that  the  lot  is  inspected  item  by  item  and  all  defectives 
are  removed  and  replaced  by  nondefective  items.  (This  may  be  too  expensive  if  the  lot  is 
cheap;  in  this  case  the  lot  may  be  sold  at  a cut-rate  price  or  scrapped.)  If  a production 
turns  out  1000%  defectives,  then  in  K lots  of  size  N each,  KN6  of  the  KN  items  are 
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defectives.  Now  KP{A\  9)  of  these  lots  are  accepted.  These  contain  KPN9  defectives, 
whereas  the  rejected  and  rectified  lots  contain  no  defectives,  because  of  the  rectification. 
Hence  after  the  rectification  the  fraction  defective  in  all  K lots  equals  KPN9/KN.  This  is 
called  the  average  outgoing  quality  (AOQ);  thus 


(4) 


AOQ(0)  = 9P(A-  9). 


Figure  541  shows  an  example.  Since  AOQ(O)  = 0 and  P(A\  1)  = 0,  the  AOQ  curve  has 
a maximum  at  some  9 = 9*,  giving  the  average  outgoing  quality  limit  (AOQL).  This  is 
the  worst  average  quality  that  may  be  expected  to  be  accepted  under  rectification. 


0.5  - 


AOQL 


OC  curve 


\ 


AOQ  curve 


6*  0.5 


Fig.  541.  OC  curve  and  AOQ  curve  for  the  sampling  plan  in  Fig.  538 


FRQBL~EM=S^T—yS~6 


1.  Lots  of  kitchen  knives  are  inspected  by  a sampling  plan 
that  uses  a sample  of  size  20  and  the  acceptance  number 
c = 1 . What  is  the  probability  of  accepting  a lot  with 
1%,  2%,  10%  defectives  (knives  with  dull  blades)? 
Use  Table  A6  of  the  Poisson  distribution  in  App.  5. 
Graph  the  OC  curve. 

2.  What  happens  in  Prob.  1 if  the  sample  size  is  increased 
to  50?  First  guess.  Then  calculate.  Graph  the  OC  curve 
and  compare. 

3.  How  will  the  probabilities  in  Prob.  1 with  n = 20 
change  (up  or  down)  if  we  decrease  c to  zero?  First 
guess. 

4.  What  are  the  producer’s  and  consumer’s  risks  in 
Prob.  1 if  the  AQL  is  2%  and  the  RQL  is  15%? 

5.  Lots  of  copper  pipes  are  inspected  according  to  a 
sample  plan  that  uses  sample  size  25  and  acceptance 
number  1.  Graph  the  OC  curve  of  the  plan,  using  the 


Poisson  approximation.  Find  the  producer’s  risk  if  the 
AQL  is  1.5%. 

6.  Graph  the  AOQ  curve  in  Prob.  5.  Determine  the  AOQL, 
assuming  that  rectification  is  applied. 

7.  In  Example  1 in  the  text,  what  are  the  producer’s  and 
consumer’s  risks  if  the  AQL  is  0.1  and  the  RQL  is  0.6? 

8.  What  happens  in  Example  1 in  the  text  if  we  increase 
the  sample  size  to  n = 3,  leaving  the  other  data  as 
before?  Compute  P(A\  0.1)  and  P(A\  0.2)  and  compare 
with  Example  1. 

9.  Graph  and  compare  sampling  plans  with  c = 1 and 
increasing  values  of  n,  say,  n = 2,  3,  4.  (Use  the 
binomial  distribution.) 

10.  Find  the  binomial  approximation  of  the  hypergeometric 
distribution  in  Example  1 in  the  text  and  compare  the 
approximate  and  the  accurate  values. 
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11.  Samples  of  3 fuses  are  drawn  from  lots  and  a lot  is 
accepted  if  in  the  corresponding  sample  we  find  no 
more  than  1 defective  fuse.  Criticize  this  sampling  plan. 
In  particular,  find  the  probability  of  accepting  a lot 
that  is  50%  defective.  (Use  the  binomial  distribution 
(7),  Sec.  24.7.) 

12.  If  in  a sampling  plan  for  large  lots  of  spark  plugs,  the 
sample  size  is  100  and  we  want  the  AQL  to  be  5%  and 
the  producer's  risk  2%,  what  acceptance  number  c 
should  we  choose?  (Use  the  normal  approximation  of 
the  binomial  distribution  in  Sec.  24.8.) 


13.  What  is  the  consumer’s  risk  in  Prob.  12  if  we  want  the 
RQL  to  be  12%?  Use  c = 9 from  the  answer  of 
Prob.  12. 

14.  A lot  of  batteries  for  wrist  watches  is  accepted  if  and 
only  if  a sample  of  20  contains  at  most  1 defective. 
Graph  the  OC  and  AOQ  curves.  Find  AOQL.  [Use  (3).] 

15.  Graph  the  OC  curve  and  the  AOQ  curve  for  the  single 
sampling  plan  for  large  lots  with  n = 5 and  c = 0,  and 
find  the  AOQL. 


25  J Goodness  of  Fit.  x2  "Test 

To  test  for  goodness  of  fit  means  that  we  wish  to  test  that  a certain  function  Fix)  is  the 
distribution  function  of  a distribution  from  which  we  have  a sample  X\,  ■ ■ ■ , xn.  Then  we 
test  whether  the  sample  distribution  function  F(x)  defined  by 

F(x)  = Sum  of  the  relative  frequencies  of  all  sample  values  Xj  not  exceeding  x 

fits  F(x ) “sufficiently  well.”  If  this  is  so,  we  shall  accept  the  hypothesis  that  F(x)  is  the 
distribution  function  of  the  population;  if  not,  we  shall  reject  the  hypothesis. 

This  test  is  of  considerable  practical  importance,  and  it  differs  in  character  from  the 
tests  for  parameters  (jj.,  cr2,  etc.)  considered  so  far. 

To  test  in  that  fashion,  we  have  to  know  how  much  F(x)  can  differ  from  F(x)  if  the 
hypothesis  is  true.  Hence  we  must  first  introduce  a quantity  that  measures  the  deviation 
of  F( x)  from  F(x),  and  we  must  know  the  probability  distribution  of  this  quantity  under 
the  assumption  that  the  hypothesis  is  true.  Then  we  proceed  as  follows.  We  determine 
a number  c such  that,  if  the  hypothesis  is  true,  a deviation  greater  than  c has  a small 
preassigned  probability.  If,  nevertheless,  a deviation  greater  than  c occurs,  we  have  reason 
to  doubt  that  the  hypothesis  is  true  and  we  reject  it.  On  the  other  hand,  if  the  deviation 
does  not  exceed  c,  so  that  F(x)  approximates  F(x)  sufficiently  well,  we  accept  the 
hypothesis.  Of  course,  if  we  accept  the  hypothesis,  this  means  that  we  have  insufficient 
evidence  to  reject  it,  and  this  does  not  exclude  the  possibility  that  there  are  other  functions 
that  would  not  be  rejected  in  the  test.  In  this  respect  the  situation  is  quite  similar  to  that 
in  Sec.  25.4. 

Table  25.7  shows  a test  of  that  type,  which  was  introduced  by  R.  A.  Fisher.  This 
test  is  justified  by  the  fact  that  if  the  hypothesis  is  true,  then  xo  is  an  observed  value 
of  a random  variable  whose  distribution  function  approaches  that  of  the  chi-square 
distribution  with  K — 1 degrees  of  freedom  (or  K — r — 1 degrees  of  freedom  if  r 
parameters  are  estimated)  as  n approaches  infinity.  The  requirement  that  at  least  five 
sample  values  lie  in  each  interval  in  Table  25.7  results  from  the  fact  that  for  finite 
n that  random  variable  has  only  approximately  a chi-square  distribution.  A proof  can 
be  found  in  Ref.  [G3]  listed  in  App.  1.  If  the  sample  is  so  small  that  the  requirement 
cannot  be  satisfied,  one  may  continue  with  the  test,  but  then  use  the  result  with 
caution. 
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Table  25.7  Chi-square  Test  for  the  Hypothesis  That  F(x)  is  the  Distribution  Function 
of  a Population  from  Which  a Sample  x„  • • ■ , x„  is  Taken 


Step  1.  Subdivide  the  x-axis  into  K intervals  71;  I2,-  -,IK  such  that  each  interval  contains 
at  least  5 values  of  the  given  sample  x 1;  • ■ • , xn.  Determine  the  number  bj  of  sample 
values  in  the  interval  Ij,  where  j = 1,  • • • , K.  If  a sample  value  lies  at  a common 
boundary  point  of  two  intervals,  add  0.5  to  each  of  the  two  corresponding  bj. 

Step  2.  Using  F(x),  compute  the  probability  pj  that  the  random  variable  X under 
consideration  assumes  any  value  in  the  interval  Ij,  where  j = 1,  ■ ■ • , K.  Compute 


ej  = nPj. 


(This  is  the  number  of  sample  values  theoretically  expected  in  Ij  if  the  hypothesis 
is  true.) 


Step  3.  Compute  the  deviation 

(1) 


2 V ® ~ Sj) 

3=  1 


Step  4.  Choose  a significance  level  (5%,  1%,  or  the  like). 

Step  5.  Determine  the  solution  c of  the  equation 

P(X2  Sc)  = l-  a 

from  the  table  of  the  chi-sqare  distribution  with  K — 1 degrees  of  freedom  (Table 
A10  in  App.  5).  If  r parameters  of  F(x)  are  unknown  and  their  maximum  likelihood 
estimates  (Sec.  25.2)  are  used,  then  use  K — r — 1 degrees  of  freedom  (instead 
of  K — 1).  If  xo  = c,  accept  the  hypothesis.  If  xo  > c,  reject  the  hypothesis. 


Table  25.8  Sample  of  100  Values  of  the  Splitting  Tensile  Strength  (lb/in.2) 
of  Concrete  Cylinders 


320 

380 

340 

410 

380 

340 

360 

350 

320 

370 

350 

340 

350 

360 

370 

350 

380 

370 

300 

420 

370 

390 

390 

440 

330 

390 

330 

360 

400 

370 

320 

350 

360 

340 

340 

350 

350 

390 

380 

340 

400 

360 

350 

390 

400 

350 

360 

340 

370 

420 

420 

400 

350 

370 

330 

320 

390 

380 

400 

370 

390 

330 

360 

380 

350 

330 

360 

300 

360 

360 

360 

390 

350 

370 

370 

350 

390 

370 

370 

340 

370 

400 

360 

350 

380 

380 

360 

340 

330 

370 

340 

360 

390 

400 

370 

410 

360 

400 

340 

360 

D.  L.  IVEY,  Splitting  tensile  tests  on  structural  lightweight  aggregate  concrete.  Texas  Transportation 
Institute,  College  Station,  Texas. 


Test  of  Normality 

Test  whether  the  population  from  which  the  sample  in  Table  25.8  was  taken  is  normal. 

Solution.  Table  25.8  shows  the  values  (column  by  column)  in  the  order  obtained  in  the  experiment.  Table 
25.9  gives  the  frequency  distribution  and  Fig.  542  the  histogram.  It  is  hard  to  guess  the  outcome  of  the  test — 
does  the  histogram  resemble  a normal  density  curve  sufficiently  well  or  not? 
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The  maximum  likelihood  estimates  for  /jl  and  cr2  are  fi=x  = 364.7  and  a2  = 712.9.  The  computation  in 
Table  25.10  yields  xo  = 2.688.  It  is  very  interesting  that  the  interval  375  • • • 385  contributes  over  50%  of  xo- 
From  the  histogram  we  see  that  the  corresponding  frequency  looks  much  too  small.  The  second  largest 
contribution  comes  from  395  • • • 405,  and  the  histogram  shows  that  the  frequency  seems  somewhat  too  large, 
which  is  perhaps  not  obvious  from  inspection. 


Table  25.9  Frequency  Table  of  the  Sample  in  Table  25.8 


1 

Tensile 

Strength 

X 

[lb/in.2] 

2 

Absolute 

Frequency 

3 

Relative 

Frequency 

fix) 

4 

Cumulative 

Absolute 

Frequency 

5 

Cumulative 

Relative 

Frequency 

m 

300 

2 

0.02 

2 

0.02 

310 

0 

0.00 

2 

0.02 

320 

4 

0.04 

6 

0.06 

330 

6 

0.06 

12 

0.12 

340 

11 

0.11 

23 

0.23 

350 

14 

0.14 

37 

0.37 

360 

16 

0.16 

53 

0.53 

370 

15 

0.15 

68 

0.68 

380 

8 

0.08 

76 

0.76 

390 

10 

0.10 

86 

0.86 

400 

8 

0.08 

94 

0.94 

410 

2 

0.02 

96 

0.96 

420 

3 

0.03 

99 

0.99 

430 

0 

0.00 

99 

0.99 

440 

1 

0.01 

100 

1.00 

We  choose  a = 5%.  Since  K — 10  and  we  estimated  r = 2 parameters  we  have  to  use  Table  A 10  in  App.  5 
with  K — r — 1 = 7 degrees  of  freedom.  We  find  c = 14.07  as  the  solution  of  P(\  = c)  = 95%.  Since  xo  < c, 
we  accept  the  hypothesis  that  the  population  is  normal. 


[Ib./in.2] 


Fig.  542.  Frequency  histogram  of  the  sample  in  Table  25.8 


SEC.  25.7 


Goodness  of  Fit.  ;y2-Test 


1099 


Table  25.10  Computations  in  Example  1 


Xj 

xj  - 364.7 

JX> 

- 364.7  \ 

e3 

bj 

Term  in  (1) 

26.7 

26.7  J 

— 00 

• • 325 

— 00  • • 

■ -1.49 

0.0000 

• ■ • 0.0681 

6.81 

6 

0.096 

325 

• • 335 

-1.49  • • 

• -1.11 

0.0681 

• ■ • 0.1335 

6.54 

6 

0.045 

335 

• • 345 

-1.11  • • 

• -0.74 

0.1335 

• • • 0.2296 

9.61 

11 

0.201 

345 

• • 355 

-0.74  • • 

• -0.36 

0.2296 

• • • 0.3594 

12.98 

14 

0.080 

355 

• • 365 

-0.36  • • 

• 0.01 

0.3594 

• • • 0.5040 

14.46 

16 

0.164 

365 

• • 375 

0.01  • • 

• 0.39 

0.5040 

• • • 0.6517 

14.77 

15 

0.0004 

375 

• • 385 

0.39  • • 

• 0.76 

0.6517 

• • • 0.7764 

12.47 

8 

1.602 

385 

• • 395 

0.76  • • 

• 1.13 

0.7764 

• • • 0.8708 

9.44 

10 

0.033 

395 

• • 405 

1.13  • • 

• 1.51 

0.8708 

• ■ • 0.9345 

6.37 

8 

0.417 

405 

• • 00 

1.51  • ■ 

• 00 

0.9345 

■ ■ ■ 1.0000 

6.55 

6 

0.046 

Xo  = 2.688 
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1.  Verify  the  calculations  in  Example  1 of  the  text. 

2.  If  it  is  known  that  25%  of  certain  steel  rods  produced 
by  a standard  process  will  break  when  subjected  to  a 
load  of  5000  lb,  can  we  claim  that  a new,  less  expensive 
process  yields  the  same  breakage  rate  if  we  find  that  in 
a sample  of  80  rods  produced  by  the  new  process,  27 
rods  broke  when  subjected  to  that  load?  (Use  a = 5%.) 

3.  If  100  flips  of  a coin  result  in  40  heads  and  60  tails, 
can  we  assert  on  the  5%  level  that  the  coin  is  fair? 

4.  If  in  10  flips  of  a coin  we  get  the  same  ratio  as  in  Prob.  3 
(4  heads  and  6 tails),  is  the  conclusion  the  same  as  in 
Prob.  3?  First  conjecture,  then  compute. 

5.  Can  you  claim,  on  a 5%  level,  that  a die  is  fair  if  60 
trials  give  1,  • ■ ■ , 6 with  absolute  frequencies  10,  13,  9, 
11,  9,  8? 

6.  Solve  Prob.  5 if  rolling  a die  180  times  gives  33,  27, 
29,  35,  25,  31. 

7.  If  a service  station  had  served  60,  49,  56,  46,  68,  39 
cars  from  Monday  through  Friday  between  1 p.m.  and 
2 P.M.,  can  one  claim  on  a 5%  level  that  the  differences 
are  due  to  randomness?  First  guess.  Then  calculate. 

8.  A manufacturer  claims  that  in  a process  of  producing 
drill  bits,  only  2.5%  of  the  bits  are  dull.  Test  the  claim 
against  the  alternative  that  more  than  2.5%  of  the  bits 
are  dull,  using  a sample  of  400  bits  containing  17  dull 
ones.  Use  a = 5%. 

9.  In  a table  of  properly  rounded  function  values,  even 
and  odd  last  decimals  should  appear  about  equally 
often.  Test  this  for  the  90  values  of  Ji(x)  in  Table  A1 
in  App.  5. 


10.  TEAM  PROJECT.  Difficulty  with  Random 
Selection.  77  students  were  asked  to  choose  3 of  the 
integers  1 1,  12,  13,  ■ ■ ■ , 30  completely  arbitrarily.  The 
amazing  result  was  as  follows. 

Number  11  12  13  14  15  16  17  18  19  20 

Frequ.  11  10  20  8 13  9 21  9 16  8 

Number  21  22  23  24  25  26  27  28  29  30 

Frequ.  12  8 15  10  10  9 12  8 13  9 

If  the  selection  were  completely  random,  the  following 
hypotheses  should  be  true. 

(a)  The  20  numbers  are  equally  likely. 

(b)  The  10  even  numbers  together  are  as  likely  as  the 
10  odd  numbers  together. 

(c)  The  6 prime  numbers  together  have  probability  0.3 
and  the  14  other  numbers  together  have  probability  0.7. 
Test  these  hypotheses,  using  a = 5%.  Design  further 
experiments  that  illustrate  the  difficulties  of  random 
selection. 

11.  CAS  EXPERIMENT.  Random  Number  Generator. 

Check  your  generator  experimentally  by  imitating 
results  of  n trials  of  rolling  a fair  die,  with  a convenient 
n (e.g.,  60  or  300  or  the  like).  Do  this  many  times  and 
see  whether  you  can  notice  any  “nonrandomness” 
features,  for  example,  too  few  Sixes,  too  many  even 
numbers,  etc.,  or  whether  your  generator  seems  to  work 
properly.  Design  and  perform  other  kinds  of  checks. 

12.  Test  for  normality  at  the  1 % level  using  a sample  of 
n = 79  (rounded)  values  x (tensile  strength  [kg/mm2] 
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of  steel  sheets  of  0.3  mm  thickness),  a = a(x)  = 
absolute  frequency.  (Take  the  first  two  values  together, 
also  the  last  three,  to  get  K = 5.) 


X 

57 

58 

59 

60 

61 

62 

63 

64 

a 

4 

10 

17 

27 

8 

9 

3 

1 

13.  Mendel’s  pathbreaking  experiments.  In  a famous 
plant-crossing  experiment,  the  Austrian  Augustinian 
father  Gregor  Mendel  (1822-1884)  obtained  355 
yellow  and  123  green  peas.  Test  whether  this  agrees 
with  Mendel's  theory  according  to  which  the  ratio 
should  be  3:1. 

14.  Accidents  in  a foundry.  Does  the  random  variable 

X = Number  of  accidents  per  week  have  a Poisson 
distribution  if,  within  50  weeks,  33  were  accident-free, 
1 accident  occurred  in  11  of  the  50  weeks,  2 in  6 of 


the  weeks,  and  more  than  2 accidents  in  no  week? 
Choose  a — 5%. 

15.  Radioactivity.  Rutherford-Geiger  experiments. 

Using  the  given  sample,  test  that  the  corresponding 
population  has  a Poisson  distribution,  x is  the  number 
of  alpha  particles  per  7.5-s  intervals  observed  by 
E.  Rutherford  and  H.  Geiger  in  one  of  their  classical 
experiments  in  1910,  and  a(x)  is  the  absolute  frequency 
(=  number  of  time  periods  during  which  exactly  x 
particles  were  observed).  Use  a = 5%. 


X 

0 

1 

2 

3 

4 

5 

6 

a 

57 

203 

383 

525 

532 

408 

273 

X 

7 

8 

9 

10 

11 

12 

£13 

a 

139 

45 

27 

10 

4 

2 

0 

Nonparametric  Tests 

Nonparametric  tests,  also  called  distribution-free  tests,  are  valid  for  any  distribution. 
Hence  they  are  used  in  cases  when  the  kind  of  distribution  is  unknown,  or  is  known  but 
such  that  no  tests  specifically  designed  for  it  are  available.  In  this  section  we  shall  explain 
the  basic  idea  of  these  tests,  which  are  based  on  “order  statistics”  and  are  rather  simple. 
If  there  is  a choice,  then  tests  designed  for  a specific  distribution  generally  give  better 
results  than  do  nonparametric  tests.  For  instance,  this  applies  to  the  tests  in  Sec.  25.4  for 
the  normal  distribution. 

We  shall  discuss  two  tests  in  terms  of  typical  examples.  In  deriving  the  distributions 
used  in  the  test,  it  is  essential  that  the  distributions,  from  which  we  sample,  are  continuous. 
(Nonparametric  tests  can  also  be  derived  for  discrete  distributions,  but  this  is  slightly  more 
complicated.) 

EXAMPLE  Sign  Test  for  the  Median 

A median  of  the  population  is  a solution  x = jl  of  the  equation  F(x ) = 0.5,  where  F is  the  distribution  function 
of  the  population. 

Suppose  that  eight  radio  operators  were  tested,  first  in  rooms  without  air-conditioning  and  then  in  air-conditioned 
rooms  over  the  same  period  of  time,  and  the  difference  of  errors  (unconditioned  minus  conditioned)  were 

9 4 0 6 4 0 7 11. 

Test  the  hypothesis  /X  = 0 (that  is,  air-conditioning  has  no  effect)  against  the  alternative  > 0 (that  is,  inferior 
performance  in  unconditioned  rooms). 

Solution.  We  choose  the  significance  level  a = 5%.  If  the  hypothesis  is  true,  the  probability  p of  a positive 
difference  is  the  same  as  that  of  a negative  difference.  Hence  in  this  case,  p = 0.5,  and  the  random  variable 

X = Number  of  positive  values  among  n values 

has  a binomial  distribution  with  p = 0.5.  Our  sample  has  eight  values.  We  omit  the  values  0,  which  do  not 
contribute  to  the  decision.  Then  six  values  are  left,  all  of  which  are  positive.  Since 

P(X  = 6)  =r)(0.5)6(0.5)° 


25.8 


= 0.0156 
= 1.56% 
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we  have  observed  an  event  whose  probability  is  very  small  if  the  hypothesis  is  true;  in  fact  1.56%  < a = 5%. 
Hence  we  assert  that  the  alternative  > 0 is  true.  That  is,  the  number  of  errors  made  in  unconditioned  rooms 
is  significantly  higher,  so  that  installation  of  air  conditioning  should  be  considered. 

Test  for  Arbitrary  Trend 

A certain  machine  is  used  for  cutting  lengths  of  wire.  Five  successive  pieces  had  the  lengths 

29  31  28  30  32. 

Using  this  sample,  test  the  hypothesis  that  there  is  no  trend,  that  is,  the  machine  does  not  have  the  tendency  to 
produce  longer  and  longer  pieces  or  shorter  and  shorter  pieces.  Assume  that  the  type  of  machine  suggests  the 
alternative  that  there  is  positive  trend,  that  is,  there  is  the  tendency  of  successive  pieces  to  get  longer. 

Solution.  We  count  the  number  of  transpositions  in  the  sample,  that  is,  the  number  of  times  a larger  value 
precedes  a smaller  value: 


29  precedes  28  (1  transposition), 

31  precedes  28  and  30  (2  transpositions). 

The  remaining  three  sample  values  follow  in  ascending  order.  Hence  in  the  sample  there  are  1+2  = 3 
transpositions.  We  now  consider  the  random  variable 

T = Number  of  transpositions. 

If  the  hypothesis  is  true  (no  trend),  then  each  of  the  5!  = 120  permutations  of  five  elements  1 2 3 4 5 has  the 
same  probability  (1/120).  We  arrange  these  permutations  according  to  their  number  of  transpositions: 


T = 0 


T=  1 


T = 2 


T = 3 


1 2 3 4 5 


1 2 3 5 4 

1 2 4 3 5 

1 3 2 4 5 

2 13  4 5 


1 2 4 5 3 

1 2 5 3 4 

1 3 2 5 4 

1 3 4 2 5 

1 4 2 3 5 

2 13  5 4 
2 14  3 5 

2 3 14  5 

3 12  4 5 


1 2 5 4 3 

1 3 4 5 2 

1 3 5 2 4 

1 4 2 5 3 

1 4 3 2 5 

1 5 2 3 4 

2 14  5 3 

2 1 5 3 4 etc. 

2 3 15  4 

2 3 4 1 5 

2 4 13  5 

3 12  5 4 

3 14  2 5 

3 2 14  5 

4 12  3 5 


From  this  we  obtain 


P(T  £ 3)  = jig  + ito 


9 

120 


15 

120 


29 

120 


= 24%. 


We  accept  the  hypothesis  because  we  have  observed  an  event  that  has  a relatively  large  probability  (certainly 
much  more  than  5%)  if  the  hypothesis  is  true. 

Values  of  the  distribution  function  of  T in  the  case  of  no  trend  are  shown  in  Table  A12,  App.  5.  For  instance, 
if  n = 3,  then  F(0)  = 0.167,  F(l)  = 0.500,  F(2)  = 1 - 0.167.  If  n = 4,  then  F(0)  = 0.042,  F(l)  = 0.167, 
F( 2)  = 0.375,  F(3)  = 1 - 0.375,  F(4)  = 1 - 0.167,  and  so  on. 
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Our  method  and  those  values  refer  to  continuous  distributions.  Theoretically,  we  may  then  expect  that  all  the 
values  of  a sample  are  different.  Practically,  some  sample  values  may  still  be  equal,  because  of  rounding:  If  m 
values  are  equal,  add  m(m  — l)/4  (=  mean  value  of  the  transpositions  in  the  case  of  the  permutations  of  m 
elements),  that  is,  | for  each  pair  of  equal  values,  § for  each  triple,  etc. 


F-R-Q-B-fc-E^M— S-ET~2STS 


1.  What  would  change  in  Example  1 had  we  observed 
only  5 positive  values?  Only  4? 

2.  Test  jl  = 0 against  /x  > 0,  using  1,  —1,  1,  3,  —8,  6,  0 
(deviations  of  the  azimuth  [multiples  of  0.01  radian]  in 
some  revolution  of  a satellite). 

3.  Are  oil  filters  of  type  A better  than  type  B filters  if  in 
1 1 trials,  A gave  cleaner  oil  than  B in  7 cases,  B gave 
cleaner  oil  than  A in  1 case,  whereas  in  3 of  the  trials 
the  results  for  A and  B were  practically  the  same? 

4.  Does  a process  of  producing  stainless  steel  pipes  of 
length  20  ft  for  nuclear  reactors  need  adjustment  if,  in  a 
sample,  4 pipes  have  the  exact  length  and  15  are  shorter 
and  3 longer  than  20  ft?  Use  the  normal  approximation 
of  the  binomial  distribution. 

5.  Do  the  computations  in  Prob.  4 without  the  use  of  the 
DeMoivre-Laplace  limit  theorem  in  Sec.  24.8. 

6.  Thirty  new  employees  were  grouped  into  15  pairs  of 
similar  intelligence  and  experience  and  were  then 
instructed  in  data  processing  by  an  old  method  (A) 
applied  to  one  (randomly  selected)  person  of  each  pair, 
and  by  a new  presumably  better  method  (B)  applied  to 
the  other  person  of  each  pair.  Test  for  equality  of 
methods  against  the  alternative  that  (B)  is  better  than 
(A),  using  the  following  scores  obtained  after  the  end 
of  the  training  period. 


A 

60 

70 

80 

85 

75 

40 

70 

45 

95 

80 

90 

60 

80 

75 

65 

B 

65 

85 

85 

80 

95 

65 

100 

60 

90 

85 

100 

75 

90 

60 

80 

7.  Assuming  normality,  solve  Prob.  6 by  a suitable  test 
from  Sec.  25.4. 

8.  In  a clinical  experiment,  each  of  10  patients  were  given 
two  different  sedatives  A and  B.  The  following  table 
shows  the  effect  (increase  of  sleeping  time,  measured 
in  hours).  Using  the  sign  test,  find  out  whether  the 
difference  is  significant. 

A 1.9  0.8  1.1  0.1  -0.1  4.4  5.5  1.6  4.6  3.4 

B 0.7  -1.6  -0.2  -1.2  -0.1  3.4  3.7  0.8  0.0  2.0 


9.  Assuming  that  the  populations  corresponding  to  the 
samples  in  Prob.  8 are  normal,  apply  a suitable  test  for 
the  normal  distribution. 

10.  Test  whether  a thermostatic  switch  is  properly  set  to 
50°C  against  the  alternative  that  its  setting  is  too  low. 
Use  a sample  of  9 values,  8 of  which  are  less  than  50°C 
and  1 is  greater. 

11.  How  would  you  proceed  in  the  sign  test  if  the 
hypothesis  is  jl  = jl0  (any  number)  instead  of  jl  = 0? 

12.  Test  the  hypothesis  that,  for  a certain  type  of  voltmeter, 
readings  are  independent  of  temperature  T [°C]  against 
the  alternative  that  they  tend  to  increase  with  T.  Use 
a sample  of  values  obtained  by  applying  a constant 
voltage: 


Temperature  T [°C] 

10 

20 

30 

40 

50 

Reading  V [volts] 

99.5 

101.1 

100.4 

100.8 

101.6 

13.  Does  the  amount  of  fertilizer  increase  the  yield  of 
wheat  X [kg/plot]?  Use  a sample  of  values  ordered 
according  to  increasing  amounts  of  fertilizer: 

33.4  35.3  31.6  35.0  36.1  37.6  36.5  38.7. 

14.  Apply  the  test  explained  in  Example  2 to  the  following 
data  (x  = diastolic  blood  pressure  [mm  Hg],  y = 
weight  of  heart  [in  grams]  of  10  patients  who  died  of 
cerebral  hemorrhage). 


X 

121 

120 

95 

123 

140 

112 

92 

100 

102 

91 

y 

521 

465 

352 

455 

490 

388 

301 

395 

375 

418 

15.  Does  an  increase  in  temperature  cause  an  increase  of 
the  yield  of  a chemical  reaction  from  which  the 
following  sample  was  taken? 


Temperature  [°C] 

10 

20 

30 

40 

60  80 

Yield  [kg/min] 

0.6 

1.1 

0.9 

1.6 

1.2  2.0 

Difference  1.2  2.4  1.3  1.3  0.0  1.0  1.8  0.8  4.6  1.4 
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Regression.  Fitting  Straight  Lines. 
Correlation 


So  far  we  were  concerned  with  random  experiments  in  which  we  observed  a single  quantity 
(random  variable)  and  got  samples  whose  values  were  single  numbers.  In  this  section  we 
discuss  experiments  in  which  we  observe  or  measure  two  quantities  simultaneously,  so 
that  we  get  samples  of  pairs  of  values  (x±,  yi),  (x2,  y2), ' ' ' , (xn, »,,).  Most  applications 
involve  one  of  two  kinds  of  experiments,  as  follows. 

1.  In  regression  analysis  one  of  the  two  variables,  call  it  x,  can  be  regarded  as  an 
ordinary  variable  because  we  can  measure  it  without  substantial  error  or  we  can 
even  give  it  values  we  want,  x is  called  the  independent  variable,  or  sometimes 
the  controlled  variable  because  we  can  control  it  (set  it  at  values  we  choose).  The 
other  variable,  Y,  is  a random  variable,  and  we  are  interested  in  the  dependence  of 
Y on  x.  Typical  examples  are  the  dependence  of  the  blood  pressure  Y on  the  age  x 
of  a person  or,  as  we  shall  now  say,  the  regression  of  Y on  x,  the  regression  of  the 
gain  of  weight  Y of  certain  animals  on  the  daily  ration  of  food  x,  the  regression  of 
the  heat  conductivity  Y of  cork  on  the  specific  weight  x of  the  cork,  etc. 

2.  In  correlation  analysis  both  quantities  are  random  variables  and  we  are  interested 
in  relations  between  them.  Examples  are  the  relation  (one  says  “correlation”) 
between  wear  X and  wear  Y of  the  front  tires  of  cars,  between  grades  X and  Y of 
students  in  mathematics  and  in  physics,  respectively,  between  the  hardness  X of 
steel  plates  in  the  center  and  the  hardness  Y near  the  edges  of  the  plates,  etc. 

Regression  Analysis 

In  regression  analysis  the  dependence  of  Y on  x is  a dependence  of  the  mean  p of  V on 
x,  so  that  p = p(x)  is  a function  in  the  ordinary  sense.  The  curve  of  p(x)  is  called  the 

regression  curve  of  Y on  x. 

In  this  section  we  discuss  the  simplest  case,  namely,  that  of  a straight  regression  line 
(1)  p(x)  = k0  + k-lx. 

Then  we  may  want  to  graph  the  sample  values  as  n points  in  the  xT-plane,  fit  a straight 
line  through  them,  and  use  it  for  estimating  p(x)  at  values  of  x that  interest  us,  so  that  we 
know  what  values  of  Y we  can  expect  for  those  x.  Fitting  that  line  by  eye  would  not  be 
good  because  it  would  be  subjective;  that  is,  different  persons’  results  would  come  out 
differently,  particularly  if  the  points  are  scattered.  So  we  need  a mathematical  method  that 
gives  a unique  result  depending  only  on  the  n points.  A widely  used  procedure  is  the  method 
of  least  squares  by  Gauss  and  Legendre.  For  our  task  we  may  formulate  it  as  follows. 


Least  Squares  Principle 

The  straight  line  should  be  fitted  through  the  given  points  so  that  the  sum  of  the 
squares  of  the  distances  of  those  points  from  the  straight  line  is  minimum,  where 
the  distance  is  measured  in  the  vertical  direction  ( the  y-direction ).  (Formulas  below.) 
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To  get  uniqueness  of  the  straight  line,  we  need  some  extra  condition.  To  see  this,  take 
the  sample  (0,  1),  (0,  —1).  Then  all  the  lines  y = kyx  with  any  k \ satisfy  the  principle. 
(Can  you  see  it?)  The  following  assumption  will  imply  uniqueness,  as  we  shall  find  out. 


General  Assumption  (Al) 

The  x-values  X\,  ■ • • , xn  in  our  sample  (xj,  yq),  • • • , (xn,  yn ) are  not  all  equal. 


From  a given  sample  (xi,  yq),  • • • , (xn,  yyj  we  shall  now  determine  a straight  line  by 
least  squares.  We  write  the  line  as 

(2)  y = k0  + Aqx 

and  call  it  the  sample  regression  line  because  it  will  be  the  counterpart  of  the  population 
regression  line  (1). 

Now  a sample  point  ( Xj , yy)  has  the  vertical  distance  (distance  measured  in  the 
y-dircction)  from  (2)  given  by 


I yj  ~ (k o + k\Xj)\ 


(see  Fig.  543). 


Fig.  543.  Vertical  distance  of  a point  (Xj,  y^)  from  a straight  line  y = k0  + k^x 


Hence  the  sum  of  the  squares  of  these  distances  is 

(3) 


cl  = 2Cvj  - ko  - kiXjf. 
j=  i 


In  the  method  of  least  squares  we  now  have  to  determine  ko  and  ki  such  that  q is  minimum. 
From  calculus  we  know  that  a necessary  condition  for  this  is 


(4) 


dq 


= 0 and 


dq 


= 0. 


dk  o dk  1 

We  shall  see  that  from  this  condition  we  obtain  for  the  sample  regression  line  the  formula 


(5) 


y - y = k-iix  - x). 
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Here  x and  y are  the  means  of  the  x-  and  the  y-values  in  our  sample,  that  is, 


(a) 

X = — (li  + • • 

n 

Xyi) 

(b) 

y = n(y  i + - 

■ • + yn)’ 

The  slope  k\  in  (5)  is  called  the  regression  coefficient  of  the  sample  and  is  given  by 


(7) 


°xy 

k i = — . 

s* 


Here  the  “sample  covariance”  sxy  is 


1 

(8)  = _ - 2 (xj  ~ x)(yj  - y) 

11  1 3=  1 

and  Sx  is  given  by 

1 n 

(9a)  si  = 2 (*j  ~ xf 

n — 1 

3= i 


n 

2 xjyj 

.3  = 1 


i 

n 


From  (5)  we  see  that  the  sample  regression  line  passes  through  the  point  ( x , y),  by  which 
it  is  determined,  together  with  the  regression  coefficient  (7).  We  may  call  si  the  variance 
of  the  x-values,  but  we  should  keep  in  mind  that  x is  an  ordinary  variable,  not  a random 
variable. 

We  shall  soon  also  need 


(9b) 


2 _ 


3 = 1 


- i 2% 


U=1 


-3  = 1 


Derivation  of  (5)  and  (7).  Differentiating  (3)  and  using  (4),  we  first  obtain 


~ ko  - kiXj)  = 0, 

oK-o 

— = -2  2jxj(yj  ~ ko  ~ ki xj)  = 0 


where  we  sum  over  j from  1 to  n.  We  now  divide  by  2,  write  each  of  the  two  sums  as 
three  sums,  and  take  the  sums  containing  y?  and  x3y,j  over  to  the  right.  Then  we  get  the 

“normal  equations” 


ko"  + k\2jXj  = 2% 

ko^xj  + k^xf  = 2 xjyj ■ 


(10) 
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This  is  a linear  system  of  two  equations  in  the  two  unknowns  k0  and  k\.  Its  coefficient 
determinant  is  [see  (9)] 


n 


n(n  - 1 )sf 


n^j(Xj  ~ xf 


and  is  not  zero  because  of  Assumption  (Al).  Hence  the  system  has  a unique  solution. 
Dividing  the  first  equation  of  (10)  by  n and  using  (6),  we  get  k0  = y — k ^x.  Together 
with  y = k o + k iX  in  (2)  this  gives  (5).  To  get  (7),  we  solve  the  system  (10)  by  Cramer’s 
rule  (Sec.  7.6)  or  elimination,  finding 

(11)  *T  = 2 ■ 

n(n  - I)** 


This  gives  (7) — (9)  and  completes  the  derivation.  [The  equality  of  the  two  expressions  in 
(8)  and  in  (9)  may  be  shown  by  the  student]. 

Regression  Line 

The  decrease  of  volume  y [%]  of  leather  for  certain  fixed  values  of  high  pressure  x [atmospheres]  was  measured. 
The  results  are  shown  in  the  first  two  columns  of  Table  25.1 1.  Find  the  regression  line  of  y on  x. 

Solution.  We  see  that  n — 4 and  obtain  the  values  x = 28000/4  = 7000,  y = 19.0/4  = 4.75,  and  from  (9) 
and  (8) 


Table  25.1  Regression  of  the  Decrease  of  Volume  y [%] 
of  Leather  on  the  Pressure  x [Atmospheres] 


Given  Values 

Auxiliary  Values 

Xj 

37 

xf 

xjy-j 

4000 

2.3 

16,000,000 

9200 

6000 

4.1 

36,000,000 

24,600 

8000 

5.7 

64,000,000 

45,600 

10,000 

6.9 

100,000,000 

69,000 

28,000 

19.0 

216,000,000 

148,400 

20,000,000 
3 

15,400 
3 

Hence  k i = 15,400/20,000,000  = 0.00077  from  (7),  and  the  regression  line  is 

y - 4.75  = 0.00077(;t  - 7000)  or  y = 0.00077*  - 0.64. 


si  = - [ 216,000,000  - 


28,000" 


Sxy  3 


148,400  - 


28,000  • 19 


Note  that  y(0)  = —0.64,  which  is  physically  meaningless,  but  typically  indicates  that  a linear  relation  is  merely 
an  approximation  valid  on  some  restricted  interval. 
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Confidence  Intervals  in  Regression  Analysis 


If  we  want  to  get  confidence  intervals,  we  have  to  make  assumptions  about  the  distribution 
of  Y (which  we  have  not  made  so  far;  least  squares  is  a “geometric  principle,”  nowhere 
involving  probabilities!).  We  assume  normality  and  independence  in  sampling: 

Assumption  (A2) 

For  each  fixed  x the  random  variable  Y is  normal  with  mean  (1),  that  is, 

(12)  p(x)  = «o  + k\x 

and  variance  cr2  independent  of  x. 

Assumption  (A3) 

The  n performances  of  the  experiment  by  which  we  obtain  a sample 


are  independent. 

K\  in  (12)  is  called  the  regression  coefficient  of  the  population  because  it  can  be  shown 
that,  under  Assumptions  (A1)-(A3),  the  maximum  likelihood  estimate  of  k\  is  the  sample 
regression  coefficient  k j given  by  (11). 

Under  Assumptions  (A1)-(A3),  we  may  now  obtain  a confidence  interval  for  ki,  as 
shown  in  Table  25.12. 

Table  25.12  Determination  of  a Confidence  Interval  for  k in  (1)  under  Assumptions  (A1)-(A3) 

Step  1.  Choose  a confidence  level  y(95%,  99%,  or  the  like). 

Step  2.  Determine  the  solution  c of  the  equation 


(*i,  n),  (X2,y2), 


(■ Xni  Yn) 


(13) 


F(c)  = 1(1  + y) 


from  the  table  of  the  /-distribution  with  n — 2 degrees  of  freedom  (Table  A9  in 
App.  5;  n — sample  size). 


Step  3.  Using  a sample  (x-i,  yU,  ■ ■ • , ( xn , yn ),  compute  ( n — Y)s2  from  (9a),  (n  — 1 )sxy 
from  (8),  ki  from  (7), 


(14) 


[as  in  (9b)],  and 


(15) 

Step  4.  Compute 


qo  = (n  - 1 Ksl  - k2s2). 


The  confidence  interval  is 


(16) 


CONF7  [k1  - K § Ki  S k1  + K}. 
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EXAMPLE  2 


THEOREM  1 


Confidence  Interval  for  the  Regression  Coefficient 

Using  the  sample  in  Table  25.11,  determine  a confidence  interval  for  k\  by  the  method  in  Table  25.12. 
Solution.  Step  1.  We  choose  y = 0.95. 

Step  2.  Equation  (13)  takes  the  form  F(c ) = 0.975,  and  Table  A9  in  App.  5 with  n — 2 = 2 degrees  of  freedom 
gives  c = 4.30. 

Step  3.  From  Example  1 we  have  3^  = 20,000,000  and  ki  = 0.00077.  From  Table  25.11  we  compute 


Step  4.  We  thus  obtain 


and 


= 102.0 

4 

= 11.95. 

q0  = 11.95  - 20,000,000  ■ 0.000772 
= 0.092. 

K = 4.30V0.092/(2  • 20,000,000) 

= 0.000206 


CONF  0.95 


{0.00056  £ Ki  S 0.00098). 


Correlation  Analysis 

We  shall  now  give  an  introduction  to  the  basic  facts  in  correlation  analysis;  for  proofs  see 
Ref.  [G2]  or  [G8]  in  App.  1. 

Correlation  analysis  is  concerned  with  the  relation  between  X and  Y in  a two- 
dimensional  random  variable  ( X , Y ) (Sec.  24.9).  A sample  consists  of  n ordered  pairs  of 
values  (xi,  v i ) , • • • , (xn,  yn),  as  before.  The  interrelation  between  the  x and  y values  in  the 
sample  is  measured  by  the  sample  covariance  sxy  in  (8)  or  by  the  sample  correlation 
coefficient 


(17) 


sxy 

SxSy 


with  sx  and  sy  given  in  (9).  Here  r has  the  advantage  that  it  does  not  change  under  a 
multiplication  of  the  x and  y values  by  a factor  (in  going  from  feet  to  inches,  etc.). 


Sample  Correlation  Coefficient 

The  sample  correlation  coefficient  r satisfies  — 1 = r = l.  In  particular,  r = ± 1 
if  and  only  if  the  sample  values  lie  on  a straight  line.  (See  Fig.  544.) 


The  theoretical  counterpart  of  r is  the  correlation  coefficient  p of  X and  Y, 

ctxy 
(tX(Ty 


(18) 
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r = 1 


10  - 


r = 0 


10 


o 


J I l_ 


0 10  20 

r = 0.98 


10 


i i i i 

0 10  20 


10 


r = 0.6 


i i i i 

0 10  20 


0 


J I I I 

0 10  20 

r = -0.3 


10  - 


i i i i 

0 10  20 


10  - 


r = -0.9 


i i i i 

0 10  20 


Fig.  544.  Samples  with  various  values  of  the  correlation  coefficient  r 


where  px  = E(X),  p,y  = E(Y),  cr\  = E([X  — /ax]2),  oy  = E([Y  — /Ay]2)  (the  means 
and  variances  of  the  marginal  distributions  of  X and  Y;  see  Sec.  24.9),  and  (tXy  is  the 
covariance  of  X and  Y given  by  (see  Sec.  24.9) 

(19)  cr xy  = E([X  - /Axil  Y - /ay])  = E(XY)  - E(X)E(Y). 

The  analog  of  Theorem  1 is 


THEOREM  2 


Correlation  Coefficient 

The  correlation  coefficient  p satisfies  —1  = f>  = l.  In  particular,  p = ±1  if  and 
only  ifX  and  Y are  linearly  related,  that  is,  Y = yX  + S,  X = y*Y  + 8*. 


X and  Y are  called  uncorrelated  if  p = 0. 


THEOREM  3 


Independence.  Normal  Distribution 

(a)  Independent  X and  Y (see  Sec.  24.9)  are  uncorrelated. 

(b)  If  (J C Y)  is  normal  (see  below),  then  uncorrelated  X and  Y are 
independent. 


Here  the  two-dimensional  normal  distribution  can  be  introduced  by  taking  two  independent 
standardized  normal  random  variables  X*,  Y*,  whose  joint  distribution  thus  has  the  density 
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(representing  a surface  of  revolution  over  the  jt*y*-plane  with  a bell-shaped  curve  as  cross 
section)  and  setting 

X = l-i-x  + <rXX* 

Y = pbY  + pcryX*  + VT~—  p2uYY*. 

This  gives  the  general  two-dimensional  normal  distribution  with  the  density 


(21a) 


where 


fix,  y ) = 


27 TcrxCTyX/ 1 — p2 


-hix,y)/  2 


(21b)  h(x,y)  = 


1 


* - Px\ 
o-x  / 


- 2 p 


x - vx\(y  - my\  (y^jxy^ 


(Tx 


(Ty 


1 ^ 

1 - p 

In  Theorem  3(b),  normality  is  important,  as  we  can  see  from  the  following  example. 


Uncorrelated  But  Dependent  Random  Variables 

If  X assumes  —1,  0,  1 with  probability  § and  Y = X2,  then  E(X)  = 0 and  in  (3) 

o-xy  = E(XY)  = E(X3)  = (-1)3  ■ 1 + 03  • J + l3  ■ J = 0, 

so  that  p = 0 and  X and  Y are  uncorrelated.  But  they  are  certainly  not  independent  since  they  are  even  functionally 
related. 


Test  for  the  Correlation  Coefficient  p 

Table  25.13  shows  a test  for  p in  the  case  of  the  two-dimensional  normal  distribution,  t is 
an  observed  value  of  a random  variable  that  has  a /-distribution  with  n — 2 degrees  of 
freedom.  This  was  shown  by  R.  A.  Fisher  ( Biometrika  10  (1915),  507-521). 


Test  of  the  Hypothesis  p — 0 Against  the  Alternative  p > 0 in  the  Case 
of  the  Two-Dimensional  Normal  Distribution 


Step  1.  Choose  a significance  level  a (5%,  1%,  or  the  like). 
Step  2.  Determine  the  solution  c of  the  equation 

P(T  g c)  = 1 — a 


from  the  /-distribution  (Table  A9  in  App.  5)  with  n — 2 degrees  of  freedom. 
Step  3.  Compute  r from  (17),  using  a sample  (x\,  yi),  ■ ■ ■ , ( xn , yn). 

Step  4.  Compute 


If  t S c,  accept  the  hypothesis.  If  / > c,  reject  the  hypothesis. 
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EXAMPLE  4 Test  for  the  Correlation  Coefficient  p 

Test  the  hypothesis  p = 0 (independence  of  X and  Y,  because  of  Theorem  3)  against  the  alternative  p > 0,  using 
the  data  in  the  lower  left  comer  of  Fig.  544,  where  r = 0.6  (manual  soldering  errors  on  10  two-sided  circuit 
boards  done  by  10  workers;  x = front,  y = back  of  the  boards). 

Solution.  We  choose  a = 5%;  thus  1 — a = 95%.  Since  n = 10,  n — 2 = 8,  the  table  gives  c = 1.86.  Also, 
t = 0.6  V8/0.64  = 2.12  > c.  We  reject  the  hypothesis  and  assert  that  there  is  a positive  correlation.  A worker  making 
few  (many)  errors  on  the  front  side  also  tends  to  make  few  (many)  errors  on  the  reverse  side  of  the  board. 
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SAMPLE  REGRESSION  LINE 


Find  and  graph  the  sample  regression  line  of  y on  x and  the 
given  data  as  points  on  the  same  axes.  Show  the  details  of 
your  work. 


1.  (0,  1.0),  (2,  2.1),  (4,  2.9),  (6,  3.6),  (8,  5.2) 

2.  (-2,  3.5),  (1,  2.6),  (3,  1.3),  (5,  0.4) 

3.  x = Revolutions  per  minute,  y = Power  of  a Diesel 
engine  [hp] 


x 400  500  600  700  750 

y 5800  10,300  14,200  18,800  21,000 


4.  x = Deformation  of  a certain  steel  [mm],  y = Brinell 
hardness  [kg/mm1 2] 

x 6 9 11  13  22  26  28  33  35 

y 68  67  65  53  44  40  37  34  32 

5.  x = Brinell  hardness,  y = Tensile  strength  [in  1000  psi 
(pounds  per  square  inch)]  of  steel  with  0.45%  C 
tempered  for  1 hour 

x 200  300  400  500 

y 110  150  190  280 

6.  Abrasion  of  quenched  and  tempered  steel  S620. 

x = Sliding  distance  [km],  y = Wear  volume  [mm3] 

x 1.1  3.2  3.4  4.5  5.6 

y 40  65  120  150  190 

7.  Ohm’s  law  (Sec.  2.9).  x = Voltage  [V],  y = Current 
[A],  Also  find  the  resistance  R |fl|. 


8.  Hooke’s  law  (Sec.  2.4).  x = Force  [lb],  y = Extension 
[in]  of  a spring.  Also  find  the  spring  modulus. 

x 2 4 6 8 

y 4.1  7.8  12.3  15.8 

9.  Thermal  conductivity  of  water,  x = Temperature 
[°F],  y = Conductivity  [Btu/(hr  ■ ft  • °F)].  Also  find  y 
at  room  temperature  66°F. 

x 32  50  100  150  212 

y 0.337  0.345  0.365  0.380  0.395 

10.  Stopping  distance  of  a car.  x = Speed  [mph],  y = 
Stopping  distance  [ft].  Also  find  y at  35  mph. 

x 30  40  50  60 

y 160  240  330  435 

11.  CAS  EXPERIMENT.  Moving  Data.  Take  a sample, 
for  instance,  that  in  Prob.  4,  and  investigate  and  graph 
the  effect  of  changing  y-values  (a)  for  small  x,  (b)  for 
large  x,  (c)  in  the  middle  of  the  sample. 


12-15 


CONFIDENCE  INTERVALS 


Find  a 95%  confidence  interval  for  the  regression 
coefficient  Ki,  assuming  (A2)  and  (A3)  hold  and  using  the 
sample. 

12.  In  Prob.  2 

13.  In  Prob.  3 

14.  In  Prob.  4 

15.  x = Humidity  of  air  [%],  y = Expansion  of  gelatin  [%], 


x 40  40  80  80  110  110 


x 10  20  30  40 


y 5.1  4.8  0.0  10.3  13.0  12.7 


y 0.8  1.6  2.3  2.8 


S T I O N S AND  PROBLEMS 


1.  What  is  a sample?  A population?  Why  do  we  sample 
in  statistics? 

2.  If  we  have  several  samples  from  the  same  population, 

do  they  have  the  same  sample  distribution  function? 
The  same  mean  and  variance? 


3.  Can  we  develop  statistical  methods  without  using 
probability  theory?  Apply  the  methods  without  using  a 
sample? 

4.  What  is  the  idea  of  the  maximum  likelihood  method? 
Why  do  we  say  “likelihood”  rather  than  “probability”? 
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5.  Couldn't  we  make  the  error  of  interval  estimation  zero 
simply  by  choosing  the  confidence  level  1? 

6.  What  is  testing?  Why  do  we  test?  What  are  the  errors 
involved? 

7.  When  did  we  use  the  /-distribution?  The  F-distribution? 

8.  What  is  the  chi-square  (x2)  test?  Give  a sample 
example  from  memory. 

9.  What  are  one-sided  and  two-sided  tests?  Give  typical 
examples. 

10.  How  do  we  test  in  quality  control?  In  acceptance 
sampling? 

11.  What  is  the  power  of  a test?  What  could  you  perhaps 
do  when  it  is  low? 

12.  What  is  Gauss’s  least  squares  principle  (which  he  found 
at  age  18)? 

13.  What  is  the  difference  between  regression  and 
correlation? 

14.  Find  the  mean,  variance,  and  standard  derivation  of  the 
sample  21.0  21.6  19.9  19.6  15.6  20.6  22.1  22.2. 

15.  Assuming  normality,  find  the  maximum  likelihood 
estimates  of  mean  and  variance  from  the  sample  in 
Prob.  14. 

16.  Determine  a 95%  confidence  interval  for  the  mean  p 
of  a normal  population  with  variance  cr2  = 25,  using 
a sample  of  size  500  with  mean  22. 

17.  Determine  a 99%  confidence  interval  for  the  mean  of 
a normal  population,  using  the  sample  32,  33,  32,  34, 
35,  29,  29,  27. 


18.  Assuming  normality,  find  a 95%  confidence  interval  for 
the  variance  from  the  sample  145.3,  145.1,  145.4,  146.2. 

19.  Using  a sample  of  10  values  with  mean  14.5  from 
a normal  population  with  variance  cr2  = 0.25,  test 
the  hypothesis  p0  = 15.0  against  the  alternative 
p1  = 14.5  on  the  5%  level.  Find  the  power. 

20.  Three  specimens  of  high-quality  concrete  had 
compressive  strength  357,  359,  413  [kg/cm2],  and  for 
three  specimens  of  ordinary  concrete  the  values  were 
346,  358,  302.  Test  for  equality  of  the  population  means, 

= /u,2,  against  the  alternative  pi  > Assume 
normality  and  equality  of  variance.  Choose  a = 5%. 

21.  Assume  the  thickness  X of  washers  to  be  normal  with 
mean  2.75  mm  and  variance  0.00024  mm2.  Set  up 
a control  chart  for  p and  graph  the  means  of  the  five 
samples  (2.74,2.76),  (2.74,2.74),  (2.79,2.81),  (2.78, 
2.76),  (2.71,  2.75)  on  the  chart. 

22.  The  OC  curve  in  acceptance  sampling  cannot  have  a 
strictly  vertical  portion.  Why? 

23.  Find  the  risks  in  the  sampling  plan  with  n = 6 and 
c = 0,  assuming  that  the  AQL  is  0O  = 1%  and  the 
RQL  is  6 1 = 15%.  How  do  the  risks  change  if  we 
increase  n? 

24.  Does  a process  of  producing  plastic  rods  of  length 
p = 2 meters  need  adjustment  if  in  a sample,  2 rods 
have  the  exact  length  and  15  are  shorter  and  3 longer 
than  2 meters?  (Use  the  sign  test.) 

25.  Find  the  regression  line  of  y on  x for  the  data 
(x,  y)  = (0,  4),  (2,  0),  (4,  -5),  (6,  -9),  (8,  - 10). 


SUMMARY  OF  CHAPTER  25 

Mathematical  Statistics 


We  recall  from  Chap.  24  that,  with  an  experiment  in  which  we  observe  some  quantity 
(number  of  defectives,  height  of  persons,  etc.),  there  is  associated  a random  variable 
X whose  probability  distribution  is  given  by  a distribution  function 

(1)  F(x)  = P(X  g x)  (Sec.  24.5) 

which  for  each  x gives  the  probability  that  X assumes  any  value  not  exceeding  x. 

In  statistics  we  take  random  samples  x\,---,xn  of  size  n by  performing  that 
experiment  n times  (Sec.  25.1)  and  draw  conclusions  from  properties  of  samples 
about  properties  of  the  distribution  of  the  corresponding  X.  We  do  this  by  calculating 
point  estimates  or  confidence  intervals  or  by  performing  a test  for  parameters  (p 

o 

and  cr  in  the  normal  distribution,  p in  the  binomial  distribution,  etc.)  or  by  a test 
for  distribution  functions. 


Summary  of  Chapter  25 
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A point  estimate  (Sec.  25.2)  is  an  approximate  value  for  a parameter  in  the 
distribution  of  X obtained  from  a sample.  Notably,  the  sample  mean  (Sec.  25.1) 

1 n 1 

(2)  x = ~ 2 xj  = - (*i  + - • + xn) 

j=t 

is  an  estimate  of  the  mean  /j.  of  X,  and  the  sample  variance  (Sec.  25.1) 

I n I 

(3)  s2  = r 2 (xi  ~ = T K*i  - xf  + • ■ • + (xn  - x)2] 

n — 1 n — 1 

3= 1 

is  an  estimate  of  the  variance  cr2  of  X.  Point  estimation  can  be  done  by  the  basic 
maximum  likelihood  method  (Sec.  25.2). 

Confidence  intervals  (Sec.  25.3)  are  intervals  01  g 9 ^ 02  with  endpoints 
calculated  from  a sample  such  that,  with  a high  probability  y,  we  obtain  an  interval 
that  contains  the  unknown  true  value  of  the  parameter  6 in  the  distribution  of  X. 
Here,  y is  chosen  at  the  beginning,  usually  95%  or  99%.  We  denote  such  an  interval 
by  CONFy  {0!  =i  9 g 02}. 

In  a test  for  a parameter  we  test  a hypothesis  0 = 0o  against  an  alternative  9 = 9t 
and  then,  on  the  basis  of  a sample,  accept  the  hypothesis,  or  we  reject  it  in  favor  of 
the  alternative  (Sec.  25.4).  Like  any  conclusion  about  X from  samples,  this  may 
involve  errors  leading  to  a false  decision.  There  is  a small  probability  a (which  we 
can  choose,  5%  or  1 %,  for  instance)  that  we  reject  a true  hypothesis,  and  there  is  a 
probability  f3  (which  we  can  compute  and  decrease  by  taking  larger  samples)  that 
we  accept  a false  hypothesis,  a is  called  the  significance  level  and  1 — /3  the  power 
of  the  test.  Among  many  other  engineering  applications,  testing  is  used  in  quality 
control  (Sec.  25.5)  and  acceptance  sampling  (Sec.  25.6). 

If  not  merely  a parameter  but  the  kind  of  distribution  of  X is  unknown,  we  can 
use  the  chi-square  test  (Sec.  25.7)  for  testing  the  hypothesis  that  some  function 
F(x)  is  the  unknown  distribution  function  of  X.  This  is  done  by  determining  the 
discrepancy  between  F(x ) and  the  distribution  function  F(x)  of  a given  sample. 

“Distribution-free”  or  nonparametric  tests  are  tests  that  apply  to  any  distribution, 
since  they  are  based  on  combinatorial  ideas.  These  tests  are  usually  very  simple. 
Two  of  them  are  discussed  in  Sec.  25.8. 

The  last  section  deals  with  samples  of  pairs  of  values,  which  arise  in  an  experiment 
when  we  simultaneously  observe  two  quantities.  In  regression  analysis,  one  of  the 
quantities,  x,  is  an  ordinary  variable  and  the  other,  Y,  is  a random  variable  whose 
mean  p depends  on  x,  say,  p(x)  = k0  + Kh<c.  In  correlation  analysis  the  relation 
between  X and  Y in  a two-dimensional  random  variable  (X,  Y)  is  investigated, 
notably  in  terms  of  the  correlation  coefficient  p. 


APPENDIX 


References 

Software  see  at  the  beginning  of  Chaps.  19 
and  24. 

General  References 

[GenRefl]  Abramowitz,  M.  and  I.  A.  Stegun  (eds.), 
Handbook  of  Mathematical  Functions.  10th  printing, 
with  corrections.  Washington,  DC:  National  Bureau 
of  Standards.  1972  (also  New  York:  Dover,  1965).  See 
also  [Wl] 

[GenRef2]  Cajori,  F.,  History  of  Mathematics.  5th  ed. 
Reprinted.  Providence,  RI:  American  Mathematical 
Society,  2002. 

[GenRef3]  Courant,  R.  and  D.  Hilbert,  Methods  of 
Mathematical  Physics.  2 vols.  Hoboken,  NJ:  Wiley, 
1989. 

[GenRef4]  Courant,  R.,  Differential  and  Integral 
Calculus.  2 vols.  Hoboken,  NJ:  Wiley,  1988. 

[GenRef5]  Graham,  R.  L.  et  al.,  Concrete  Mathematics. 
2nd  ed.  Reading,  MA:  Addison-Wesley,  1994. 

[GenRef6]  Ito,  K.  (ed.),  Encyclopedic  Dictionary  of 
Mathematics.  4 vols.  2nd  ed.  Cambridge,  MA:  MIT 
Press,  1993. 

[GenRef7]  Kreyszig,  E.,  Introductory  Functional 
Analysis  with  Applications.  New  York:  Wiley,  1989. 

[GenRef8]  Kreyszig,  E.,  Differential  Geometry.  Mineola, 
NY:  Dover,  1991. 

[GenRef9]  Kreyszig,  E.  Introduction  to  Differential 
Geometry  and  Riemannian  Geometry.  Toronto: 
University  of  Toronto  Press,  1975. 

[GenReflO]  Szego,  G.,  Orthogonal  Polynomials.  4th  ed. 
Reprinted.  New  York:  American  Mathematical  Society, 
2003. 

[GenRefl  1]  Thomas,  G.  et  al.,  Thomas’  Calculus,  Early 
Transcendentals  Update.  10th  ed.  Reading,  MA: 
Addison-Wesley,  2003. 

Part  A.  Ordinary  Differential  Equations 
(ODEs)  (Chaps.  1-6) 

See  also  Part  E:  Numeric  Analysis 

[Al]  Arnold,  V.  I.,  Ordinary  Differential  Equations.  3rd 
ed.  New  York:  Springer,  2006. 

[A2]  Bhatia,  N.  P.  and  G.  P.  Szego,  Stability  Theory  of 
Dynamical  Systems.  New  York:  Springer.  2002. 

[A3]  Birkhoff,  G.  and  G.-C.  Rota,  Ordinary  Differential 
Equations.  4th  ed.  New  York:  Wiley,  1989. 


[A4]  Brauer,  F.  and  J.  A.  Nohel,  Qualitative  Theory  of 
Ordinary  Differential  Equations.  Mineola,  NY : Dover, 
1994. 

[A5]  Churchill,  R.  V.,  Operational  Mathematics.  3rd  ed. 
New  York:  McGraw-Hill,  1972. 

[A6]  Coddington,  E.  A.  and  R.  Carlson,  Linear  Ordinary 
Differential  Equations.  Philadelphia:  SIAM,  1997. 

[A7]  Coddington,  E.  A.  and  N.  Levinson,  Theory  of 
Ordinary  Differential  Equations.  Malabar,  FL:  Krieger, 
1984. 

[A8]  Dong,  T.-R.  et  al..  Qualitative  Theory  of  Differential 
Equations.  Providence,  RI:  American  Mathematical 
Society,  1992. 

[A9]  Erdelyi,  A.  et  al.,  Tables  of  Integral  Transforms. 
2 vols.  New  York:  McGraw-Hill,  1954. 

[A10]  Hartman,  P.,  Ordinary  Differential  Equations.  2nd 
ed.  Philadelphia:  SIAM,  2002. 

[All]  Ince,  E.  L..  Ordinary  Differential  Equations.  New 
York:  Dover,  1956. 

[A12]  Schiff,  J.  L.,  The  Laplace  Transform:  Theory  and 
Applications.  New  York:  Springer,  1999. 

[A13]  Watson,  G.  N.,  A Treatise  on  the  Theory  of  Bessel 
Functions.  2nd  ed.  Reprinted.  New  York:  Cambridge 
University  Press,  1995. 

[A14]  Widder,  D.  V.,  The  Laplace  Transform.  Princeton, 
NJ:  Princeton  University  Press,  1941. 

[A15]  Zwillinger,  D.,  Handbook  of  Differential  Equations. 
3rd  ed.  New  York:  Academic  Press,  1998. 

Part  B.  Linear  Algebra,  Vector  Calculus 
(Chaps.  7-10) 

For  books  on  numeric  linear  algebra,  see  also 

Part  E:  Numeric  Analysis. 

[Bl]  Bellman,  R.,  Introduction  to  Matrix  Analysis.  2nd 
ed.  Philadelphia:  SIAM,  1997. 

[B2]  Chatelin,  F.,  Eigenvalues  of  Matrices.  New  York: 
Wiley-Interscience,  1993. 

[B3]  Gantmacher,  F.  R.,  The  Theory  of  Matrices.  2 vols. 
Providence,  RI:  American  Mathematical  Society,  2000. 

[B4]  Gohberg,  I.  P.  et  al.,  Invariant  Subspaces  of  Matrices 
with  Applications.  New  York:  Wiley,  2006. 

[B5]  Greub,  W.  H.,  Linear  Algebra.  4th  ed.  New  York: 
Springer,  1975. 

[B6]  Herstein,  I.  N.,  Abstract  Algebra.  3rd  ed.  New  York: 
Wiley,  1996. 


Al 


A2 


APP.  1 References 


[B7]  Joshi,  A.  W.,  Matrices  and  Tensors  in  Physics.  3rd 
ed.  New  York:  Wiley,  1995. 

[B8]  Lang,  S.,  Linear  Algebra.  3rd  ed.  New  York: 
Springer,  1996. 

[B9]  Nef,  W.,  Linear  Algebra.  2nd  ed.  New  York:  Dover, 
1988. 

[BIO]  Parlett,  B.,  The  Symmetric  Eigenvalue  Problem. 
Philadelphia:  SIAM,  1998. 

Part  C.  Fourier  Analysis  and  PDEs 
(Chaps.  11-12) 

For  books  on  numerics  for  PDEs  see  also  Part 

E:  Numeric  Analysis. 

[Cl]  Antimirov,  M.  Ya.,  Applied  Integral  Transforms. 
Providence,  RI:  American  Mathematical  Society,  1993. 

[C2]  Bracewell,  R.,  The  Fourier  Transform  and  Its 
Applications.  3rd  ed.  New  York:  McGraw-Hill,  2000. 

[C3]  Carslaw,  H.  S.  and  J.  C.  Jaeger,  Conduction  of  Heat 
in  Solids.  2nd  ed.  Reprinted.  Oxford:  Clarendon,  2000. 

[C4]  Churchill,  R.  V.  and  J.  W.  Brown,  Fourier  Series 
and  Boundary  Value  Problems.  6th  ed.  New  York: 
McGraw-Hill,  2006. 

[C5]  DuChateau,  P.  and  D.  Zachmann,  Applied  Partial 
Differential  Equations.  Mineola,  NY:  Dover,  2002. 

[C6]  Hanna,  J.  R.  and  J.  H.  Rowland,  Fourier  Series, 
Transforms,  and  Boundary  Value  Problems.  2nd  ed. 
New  York:  Wiley,  2008. 

[C7]  Jerri,  A.  J.,  The  Gibbs  Phenomenon  in  Fourier 
Analysis,  Splines,  and  Wavelet  Approximations.  Boston: 
Kluwer,  1998. 

[C8]  John,  F.,  Partial  Differential  Equations.  4th  edition 
New  York:  Springer,  1982. 

[C9]  Tolstov,  G.  P.,  Fourier  Series.  New  York:  Dover,  1976. 

[CIO]  Widder,  D.  V.,  The  Heat  Equation.  New  York: 
Academic  Press,  1975. 

[Cll]  Zauderer,  E.,  Partial  Differential  Equations  of 
Applied  Mathematics.  3rd  ed.  New  York:  Wiley,  2006. 

[Cl 2]  Zygmund,  A.  and  R.  Fefferman,  Trigonometric  Series. 
3rd  ed.  New  York:  Cambridge  University  Press,  2002. 

Part  D.  Complex  Analysis  (Chaps.  13-18) 

[Dl]  Ahlfors,  L.  V.,  Complex  Analysis.  3rd  ed.  New 
York:  McGraw-Hill,  1979. 

[D2]  Bieberbach,  L.,  Conformal  Mapping.  Providence, 
RI:  American  Mathematical  Society,  2000. 

[D3]  Henrici,  P.,  Applied  and  Computational  Complex 
Analysis.  3 vols.  New  York:  Wiley,  1993. 

[D4]  Hille,  E.,  Analytic  Function  Theory.  2 vols.  2nd  ed. 
Providence,  RI:  American  Mathematical  Society, 

Reprint  VI  1983,  V2  2005. 

[D5]  Rnopp,  K.,  Elements  of  the  Theory  of  Functions. 
New  York:  Dover,  1952. 


[D6]  Knopp,  K.,  Theory  of  Functions.  2 parts.  New  York: 
Dover,  Reprinted  1996. 

[D7]  Rrantz,  S.  G.,  Complex  Analysis:  The  Geometric 
Viewpoint.  Washington,  DC:  The  Mathematical 

Association  of  America,  1990. 

[D8]  Lang,  S.,  Complex  Analysis.  4th  ed.  New  York: 
Springer,  1999. 

[D9]  Narasimhan,  R.,  Compact  Riemann  Surfaces.  New 
York:  Springer,  1996. 

[D10]  Nehari,  Z.,  Conformal  Mapping.  Mineola,  NY: 
Dover,  1975. 

[Dll]  Springer,  G.,  Introduction  to  Riemann  Surfaces. 
Providence,  RI:  American  Mathematical  Society,  2001. 

Part  E.  Numeric  Analysis  (Chaps.  19-21) 

[El]  Ames,  W.  F.,  Numerical  Methods  for  Partial 
Differential  Equations.  3rd  ed.  New  York:  Academic 
Press,  1992. 

[E2]  Anderson,  E.,  et  al.,  LAPACK  User’s  Guide.  3rd  ed. 
Philadelphia:  SIAM,  1999. 

[E3]  Bank,  R.  E.,  PLTMG.  A Software  Package  for 
Solving  Elliptic  Partial  Differential  Equations:  Users’ 
Guide  8.0.  Philadelphia:  SIAM,  1998. 

[E4]  Constanda,  C.,  Solution  Techniques  for  Elementary 
Partial  Differential  Equations.  Boca  Raton,  FL:  CRC 
Press,  2002. 

[E5]  Dahlquist,  G.  and  A.  Bjorck,  Numerical  Methods. 
Mineola,  NY:  Dover,  2003. 

[E6]  DeBoor,  C.,  A Practical  Guide  to  Splines.  Reprinted. 
New  York:  Springer,  2001. 

[E7]  Dongarra,  J.  J.  et  al.,  UNPACK  Users  Guide. 
Philadelphia:  SIAM.  1979.  (See  also  at  the  beginning  of 
Chap.  19.) 

[E8]  Garbow,  B.  S.  et  al.,  Matrix  Eigensystem  Routines: 
E1SPACK  Guide  Extension.  Reprinted.  New  York: 
Springer,  1990. 

[E9]  Golub,  G.  H.  and  C.  F.  Van  Loan,  Matrix 
Computations.  3rd  ed.  Baltimore,  MD:  Johns  Hopkins 
University  Press,  1996. 

[E10]  Higham,  N.  J.,  Accuracy  and  Stability  of  Numerical 
Algorithms.  2nd  ed.  Philadelphia:  SIAM,  2002. 

[Ell]  IMSL  (International  Mathematical  and  Statistical 
Libraries),  FORTRAN  Numerical  Library.  Houston,  TX: 
Visual  Numerics,  2002.  (See  also  at  the  beginning  of 
Chap.  19.) 

[E12]  IMSL,  IMSL  for  Java.  Houston,  TX:  Visual 
Numerics,  2002. 

[El 3]  IMSL,  C Library.  Houston,  TX:  Visual  Numerics, 

2002. 

[E14]  Kelley,  C.  T.,  Iterative  Methods  for  Linear  and 
Nonlinear  Equations.  Philadelphia:  SIAM,  1995. 

[E15]  Knabner,  P.  and  L.  Angerman,  Numerical  Methods  for 
Partial  Differential  Equations.  New  York:  Springer,  2003. 


APP.  1 References 


A3 


[E16]  Knuth,  D.  E.,  The  Art  of  Computer  Programming. 
3 vols.  3rd  ed.  Reading,  MA:  Addison-Wesley,  1997- 
2009. 

[E17]  Kreyszig,  E.,  Introductory  Functional  Analysis  with 
Applications.  New  York:  Wiley,  1989. 

[El 8]  Kreyszig,  E.,  On  methods  of  Fourier  analysis  in 
multigrid  theory.  Lecture  Notes  in  Pure  and  Applied 
Mathematics  157.  New  York:  Dekker,  1994,  pp.  225-242. 

[E19]  Kreyszig,  E.,  Basic  ideas  in  modern  numerical 
analysis  and  their  origins.  Proceedings  of  the  Annual 
Conference  of  the  Canadian  Society  for  the  History  and 
Philosophy  of  Mathematics.  1997,  pp.  34-45. 

[E20]  Kreyszig,  E.,  and  J.  Todd,  QR  in  two  dimensions. 
Elemente  der  Mathematik  31  (1976),  pp.  109-114. 

[E21]  Mortensen,  M.  E.,  Geometric  Modeling.  2nd  ed. 
New  York:  Wiley,  1997. 

[E22]  Morton,  K.  W.,  and  D.  F.  Mayers,  Numerical  Solution 
of  Partial  Differential  Equations:  An  Introduction.  New 
York:  Cambridge  University  Press,  1994. 

[E23]  Ortega,  J.  M.,  Introduction  to  Parallel  and  Vector 
Solution  of  Linear  Systems.  New  York:  Plenum  Press, 
1988. 

[E24]  Overton,  M.  L.,  Numerical  Computing  with  IEEE 
Floating  Point  Arithmetic.  Philadelphia:  SIAM,  2004. 

[E25]  Press,  W.  H.  et  al..  Numerical  Recipes  in  C:  The  Art 
of  Scientific  Computing.  2nd  ed.  New  York:  Cambridge 
University  Press,  1992. 

[E26]  Shampine,  L.  F.,  Numerical  Solutions  of  Ordinary 
Differential  Equations.  New  York:  Chapman  and  Hall, 
1994. 

[E27]  Varga,  R.  S.,  Matrix  Iterative  Analysis.  2nd  ed.  New 
York:  Springer,  2000. 

[E28]  Varga,  R.  S.,  Gersgorin  and  His  Circles.  New  York: 
Springer,  2004. 

[E29]  Wilkinson,  J.  H.,  The  Algebraic  Eigenvalue 
Problem.  Oxford:  Oxford  University  Press,  1988. 

Part  F.  Optimization,  Graphs  (Chaps.  22-23) 

[FI]  Bondy,  J.  A.  and  U.S.R.  Murty,  Graph  Theory  with 
Applications.  Hoboken,  NJ:  Wiley-Interscience,  1991. 

[F2]  Cook,  W.  J.  et  al.,  Combinatorial  Optimization.  New 
York:  Wiley,  1997. 

[F3]  Diestel,  R.,  Graph  Theory.  4th  ed.  New  York: 
Springer,  2006. 

[F4]  Diwekar,  U.  M.,  Introduction  to  Applied  Optimization. 
2nd  ed.  New  York:  Springer,  2008. 

[F5]  Gass,  S.  L.,  Linear  Programming.  Method  and 
Applications.  3rd  ed.  New  York:  McGraw-Hill,  1969. 

[F6]  Gross,  J.  T.  and  J.Yellen  (eds.),  Handbook  of  Graph 
Theory  and  Applications.  2nd  ed.  Boca  Raton,  FL:  CRC 
Press,  2006. 

[F7]  Goodrich,  M.  T.,  and  R.  Tamassia,  Algorithm 
Design:  Foundations,  Analysis,  and  Internet  Examples. 
Hoboken,  NJ:  Wiley,  2002. 


[F8]  Harary,  F.,  Graph  Theory.  Reprinted.  Reading,  MA: 
Addison-Wesley,  2000. 

[F9]  Merris,  R.,  Graph  Theory.  Hoboken,  NJ:  Wiley- 
Interscience,  2000. 

[F10]  Ralston,  A.,  and  P.  Rabinowitz,  A First  Course  in 
Numerical  Analysis.  2nded.  Mineola,  NY:  Dover,  2001. 
[FI  1]  Thulasiraman,  K.,  and  M.  N.  S.  Swamy,  Graph 
Theory  and  Algorithms.  New  York:  Wiley-Interscience, 
1992. 

[FI 2]  Tucker,  A.,  Applied  Combinatorics.  5th  ed. 
Hoboken,  NJ:  Wiley,  2007. 

Part  G.  Probability  and  Statistics 
(Chaps.  24-25) 

[Gl]  American  Society  for  Testing  Materials,  Manual  on 
Presentation  of  Data  and  Control  Chart  Analysis.  7th 
ed.  Philadelphia:  ASTM,  2002. 

[G2]  Anderson,  T.  W.,  An  Introduction  to  Multivariate 
Statistical  Analysis.  3rd  ed.  Hoboken,  NJ:  Wiley, 

2003. 

[G3]  Cramer,  H.,  Mathematical  Methods  of  Statistics. 
Reprinted.  Princeton,  NJ:  Princeton  University  Press, 
1999. 

[G4]  Dodge,  Y.,  The  Oxford  Dictionary  of  Statistical 
Terms.  6th  ed.  Oxford:  Oxford  University  Press, 
2006. 

[G5]  Gibbons,  J.  D.  and  S.  Chakraborti,  Nonparametric 
Statistical  Inference.  4th  ed.  New  York:  Dekker,  2003. 
[G6]  Grant,  E.  L.  and  R.  S.  Leavenworth,  Statistical 
Quality  Control.  7th  ed.  New  York:  McGraw-Hill, 
1996. 

[G7]  IMSL,  Fortran  Numerical  Library.  Houston,  TX: 
Visual  Numerics,  2002. 

[G8]  Kreyszig,  E.,  Introductory  Mathematical  Statistics. 

Principles  and  Methods.  New  York:  Wiley,  1970. 

[G9]  O’Hagan,  T.  et  al.,  Kendall’s  Advanced  Theory  of 
Statistics  3-Volume  Set.  Kent,  U.K.:  Hodder  Arnold, 

2004. 

[G10]  Rohatgi,  V.  K.  and  A.  K.  MD.  E.  Saleh,  An 
Introduction  to  Probability  and  Statistics.  2nd  ed. 
Hoboken,  NJ:  Wiley-Interscience,  2001. 

Web  References 

[Wl]  upgraded  version  of  [GenRefl]  online  at 
http://dlmf.nist.gov/.  Hardcopy  and  CD-Rom:  Oliver, 
W.  J.  et  al.  (eds.),  NIST  Handbook  of  Mathematical 
Functions.  Cambridge;  New  York:  Cambridge  University 
Press,  2010. 

[W2]  O’Connor,  J.  and  E.  Robertson,  MacTutor  History 
of  Mathematics  Archive.  St.  Andrews,  Scotland: 
University  of  St.  Andrews,  School  of  Mathematics  and 
Statistics.  Online  at  http://www-history.mcs. st-andrews. 
ac.uk.  (Biographies  of  mathematicians,  etc.). 


E N D I X 2 

Answers  to 
Odd-Numbered 

Problem  Set  1.1,  page  8 

1.  y = — cos  2ttx  + c 3.  y 

77 


Problems 


= ce 


x 


5.  y = 2e  ^(sin  x — cos  x)  + c 


1 

7.  y = sinh  5.13x  + c 

5.13 


9.y=  1.65e_4'T  + 0.35  11.  y = (x  + \)ex 

13.  y = 1/(1  + 3e~x)  15.  y = 0 and  y = 1 because  y'  = 0 for  these  y 

17.  exp(— 1.4  • 10_110  = |,  t = 10n(ln2)/1.4  [sec] 

19.  Integrate  y"  = g twice,  y\t)  = gt  + Uo,  yr(0)  = Vo  = 0 (start  from  rest),  then 
y(t)  = hgt2  + yo,  where  y(0)  = y0  = 0 


Problem  Set  1.2,  page  11 

11.  Straight  lines  parallel  to  the  x-axis  13.  y = x 

15.  mv'  = mg  — bv2,  v'  = 9.8  — vz,  v(0 ) =10,  v'  = 0 gives  the  limit/9.8  = 3.1 
[meter/sec] 

17.  Errors  of  steps  1,  5,  10:  0.0052,  0.0382,  0.1245,  approximately 
19.  x5  = 0.0286  (error  0.0093),  xw  = 0.2196  (error  0.0189) 


Problem  Set  1.3,  page  18 

1.  If  you  add  a constant  later,  you  may  not  get  a solution. 

Example:  y'  = y,  In  |y|  = x + c,  y = ex+c  = cexbut  not  ex  + c (with  c + 0) 

3.  cos2  y dy  = dx,  gy  + 4 sin  2 y + c = x 

2 2 2 
5.  y + 36x  = c,  ellipses  7.  y = x arctan  (x  + c) 

9.  y = x/(c  — x)  11.  y = 24/x,  hyperbola 

13.  <:/>’/ si  11 2 y = dx/ cosh2x,  — coty  = tanhx  + c,  c = 0,  y = — arccot  (tanhx) 

15.  y2  + 4x2  = c = 25  17.  y = x arctan  (x3  — 1) 

19.  y0ekt  = 2 y0,  ek  = 2 (1  week),  e2k  = 22  (2  weeks),  e4fc  = 24 

21.  69.6%  ofyo  23.  PV  = c = const 

25.  T =22  - 17e_0'5306t  = 21.9  [°C]  when  t = 9.68  min 

27.  e~k  l°  = §,  k = jo.  In  2,  e~kt°  = 0.01,  t = (In  100)/*  = 66  [min] 

29.  No.  Use  Newton’s  law  of  cooling. 

31.  y = ax,  y'  = g(y/x)  = a = const,  independent  of  the  point  (x,  y) 

33.  AS  = 0.15SA</>,  dS/d4>  = 0.1551,  S = S0e0154,  = 1000So, 

4>  = (1/0.15)  In  1000  = 7.3  • 2t 7.  Eight  times. 
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Problem  Set  1.4,  page  26 

1.  Exact,  2x  = 2x,  xzy  = c,y  = c/x2  3.  Exact,  y = arccos  (c/cos  x) 

5.  Not  exact,  y = Vx2  + cx  7.  F = ex  , ex  tan  y = c 

9.  Exact,  u = e2xcosy  + k(y),  uy  = —e2x  sin  y + k',  k!  = 0.  A ns.  e2x  cosy  = 1 
11.  F = sinh  x,  sinh2  x cos  y = c 

13.  u = ex  + k(y),  uy  = k'  = — 1 + ey,  k = —y  + ev.  Ans.  ex  — y + ey  = c 
15.  b = k,  ax2  + 2 kxy  + ly2  = c 


Problem  Set  1.5,  page  34 

3.  y = cex  — 5.2  5.  y = (x  + c)e~kx 

7.  y = x\c  + ex)  9.  y = (x  - 2.5/e)ecos  x 

11.  y = 2 + c sinx  13.  Separate,  y — 2.5  = c cosh4  1.5x 

15.  (y!  + y2)'  + p(y1  + y2)  = (y{  + py'i)  + (y2  + py2)  = 0 + 0 = 0 

17.  (>!  + y2)'  + p(yi  + v2)  = (y i + py i)  + (y2  + /%)  = r + 0 = r 

19.  Solution  of  cy { + pc\\  = c(y  i + pyx)  = cr 

21.  y = uy *,  y + py  = uy*  + uy*’  + puy*  = uy*  + u(y*’  + py*)  = uy*  + u • 0 

= r,  u = r/y * = re^pdx,  u = j e^p  dx  r dx  + c.  Thus,  y = uy ^ gives  (4).  We  shall 

see  that  this  method  extends  to  higher-order  ODEs  (Secs.  2.10  and  3.3). 

23.  v2  = 1 + 8<?-x2 

25.  y = 1/m,  m = ce~3  Zx  + 10/3.2 

27.  dx/dy  = 6ey  — 2x,  x = ce~2y  + 2ey 

31.  T = 240efct  + 60,  7/10)  = 200,  k = -0.0539,  f = 102  min 
33.  y = A - ky,  y( 0)  = 0,  y = A{  1 - e~kt)/k 
35.  y = 175(0.0001  - y/450),  y(0)  = 450  • 0.0004  = 0.18, 
y = 0.135e_O'3889t  + 0.045  = 0.18/2, 
e-o.3889t  = (0  09  _ 0.045)/0.135  = 1/3, 
t = (In  3)/0.3889  = 2.82.  Ans.  About  3 years 
37.  y'  = y ~ y2  ~ 0.2 y,  y = 1/(1 .25  - 0.75e_o'8t),  limit  0.8,  limit  1 
39.  y = By 2 — Ay  = By(y  — A/B),A  > 0,  B > 0.  Constant  solutions  y = 0, 
y = A/B,  y'  > 0 if  y > A/B  (unlimited  growth),  y'  < 0 if  0 < y < A/B 
(extinction),  y = A/(ceAt  + B),  y(0)  > A/B  if  c < 0,  y( 0)  < A/B  if  c > 0. 


Problem  Set  1.6,  page  38 

1.  x2/(c2  + 9)  + y2/c 2 —1=0  3.  y — cosh  (x  — c)  — c = 0 

5.  y/x  = c,  y' /x  = y/x2,  y,  = y/x,  y ' = — x/y , y2  + x2  = c,  circles 

7.  2y2  — x2  = c 9.  y = — 2xy,  y ' = l/(2xy),  x = cey 

11.  y = cx 

13.  y'  = — 4x/9y.  Trajectories  y , = 9y/4x,  y = cx9,/4  Cc  > 0). 

Sketch  or  graph  these  curves. 

15.  u = c,  uxdx  + uy  dy  = 0,  y'  = —ux/uy.  Trajectories  y ' = uy/ux.  Now 

v = c,  vxdx  + vydy  = 0,  y = ~vx/vy.  This  agrees  with  the  trajectory  ODE 
in  u if  ux  = vy  (equal  denominators)  and  uy  = —vx  (equal  numerators).  But  these 
are  just  the  Cauchy-Riemann  equations. 
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Problem  Set  1.7,  page  42 

1.  y'  = f(x,y)  = r(x)  — p(x)y\  hence  df/dy  = — p(x)  is  continuous  and  is  thus 
bounded  in  the  closed  interval  \x  — jc0I  = a. 

3.  In  \x  — x0  < a;  just  take  b in  a = b/K  large,  namely,  b = aK. 

5.  R has  sides  2 a and  2b  and  center  (1,1)  since  y(  1 j = 1.  In  R, 
f=2y2^2{b+\f  = K,  a = b/K  = b/(2(b  + l)2),  da/db  = 0 gives  b = 1 , 
and  aopt  = b/K  = g.  Solution  by  dy/y2  = 2 dx,  etc.,  y = 1/(3  — 2x). 

7.  1 1 + /|  g K = 1 + b2,  a = b/K,  da/db  = 0,  b = 1,  a = 

9.  No.  At  a common  point  (xi,  Vi)  they  would  both  satisfy  the  “initial  condition” 
y(x  /)  = j’i,  violating  uniqueness. 


Chapter  1 Review  Questions  and  Problems,  page  43 

11.  y = ce~2x  13.  y = 1 /(ce-4*  + 4) 

15.  y = ce~x  + 0.01  cos  lOx  + 0.1  sin  lOx 
17.  y = ce~2  5x  + 0.640a-  - 0.256 

19.  25y2  — 4x2  = c 21.  F = x,  x3ey  + x2y  = c 

23.  y = sin  (x  + \tt)  25.  3 sin  x + g sin  v = 0 

27.  ek  = 1.25,  (In  2)/ln  1.25  = 3.1,  (In  3)/ln  1.25  = 4.9  [days] 
29.  ek  = 0.9,  6.6  days.  43.7  days  from  ekt  = 0.5,  ekt  = 0.01 


Problem  Set  2.1,  page  53 


1.  F{x,  z,  z)  = 0 
5.  y = (Clx  + c2)-1/2 
7.  ( dz/dy)z  = -z3siny,  —l/z  = 
9.  y2  = x3  In  x 
13.  y(t)  = cpe  ~ t + kt  + c2 
17.  y = — 0.75x3/2  - 2.25x~1/z 


3.  y = C!e  x + c2 

dx/  dy  = cosy  + ci,  x = — sin  y + ciy  + c2 

U2x  . 

,y  = c1e  + c2 
15.  y = 3 cos  2.5x  — sin  2.5x 
19.  y = 15e-x  — sinx 


Problem  Set  2.2,  page  59 


t —2.5x  . 2.5x 

1.  y = c\e  + c2e 

5.  y = (ci  + c2x)e~7TX 

9.  y = Cle-2  (ix  + c2e°-8x 

13.  y = (Cl  + c2x)e5x/3 

17.  y"  + 2V5 y’  + 5y  = 0 

21.  y = 4.6  cos  5x  — 0.24  sin  5x 

25.  y = 2e-x 


'y  —2.8a:  . — 3.2x 

3.  y = c\e  + c2e 

7 . y = d + c2e-4-5x 

11.  y = cie-x/2  + c2e3x/2 

15.  y = e~°-21x  (A  cos  (Vttx)  + B sin  (V/rx)) 

19.  y"  + 4 y + 5y  = 0 

23.  y = 6e2x  + 4e~3x 

27.  y = (4.5  - x)e~7TX 


29.  v = — ' e °"27x  sin  ( V77x)  31.  Independent 

V 77 

33.  C]X2  + c2x2  In  x = 0 with  x = 1 gives  cq  = 0;  then  c2  = 0 for  x = 2,  say. 
Hence  independent 

35.  Dependent  since  sin  2x  = 2 sin  x cos  x 
37.  yi  = e~x,  y2  = 0.001ex  + e_x 
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Problem  Set  2.3,  page  61 

1.  4e2x , —e~x  + 8eZx,  — cosx  — 2 sinx 

3.  0,  0,  (D  - 27)(— 4<?-2x)  = 8e-2x  + 8e~Zx 

5. 0,  5eZx,  0 

7.  (2D  - 7)(2D  + /),  y = Cle0  5x  + c2e-05x 
9.  (D  - 2. 1/)2,  y = (ci  + c2x)e21x 
11.  (D  - 1.6/)(D  - 2.47),  y = Cle16x  + c2e2Ax 

15.  Combine  the  two  conditions  to  get  L(cy  + kw ) = L(cy)  + L(kw ) = cLy  + kLw. 

The  converse  is  simple. 

Problem  Set  2.4,  page  69 

1.  y'  = Vo  cos  co0t  + (vq/coo)  sin  co0t.  At  integer  t (if  <w0  = 77),  because  of  periodicity. 
3.  (i)  Lower  by  a factor  V2,  (ii)  higher  by  V2 

5.0. 3183,  0.4775,  V(£i  + k2)/m/(2TT ) = 0.5738 

7.  mLO”  = —mg  sin  9 ~ —mgO  (tangential  component  of  W = mg), 

9"  + coo*0  = 0,  co0/(2tt)  = V^/L/(2tt) 

9.  my"  = —ayy,  where  m = 1 kg,  ay  = 77  • 0.01 2 ■ 2y  meter3  is  the  volume  of  the 
water  that  causes  the  restoring  force  ayy  with  y = 9800  nt  (=  weight/meter3). 
y"  + co0zy  = 0,  <wo2  = ay/m  = ay  = 0. 0006287.  Frequency  u>q/2tt  = 0.4  [sec-1]. 
13.  y = [y0  + (v0  + ayo)i]e~at,  y = [1  + (v0  + l)f]e-t; 

(ii)  Vo  = —2,  — |,  — — § 

15.  co*  = [co02  - c2/(4;«2)]1/2  = co0[  1 - c2/(4;n£)]1/2  « w0(l  - cz/8mk)  = 2.9583 
17.  The  positive  solutions  of  tan  7=1,  that  is,  77/4  (max),  577/4  (min),  etc 
19.  0.0231  = (In  2)/30  [kg/sec]  from  exp  (—10  • 3c/2m)  = 


Problem  Set  2.5,  page  73 

3.  y = (ci  + c2lnx)x-1'8 

5.  Vx  (ci  cos  (In  x)  + c2  sin  (In  x)) 

7.  y = cjx2  + c2x3 

9.  y = (ci  + c2  In  x)x°'6 

11.  y = x2(ci  cos  ( V6  In  x) 

+ c2  sin  ( V6  In  x)) 

13.  y = x-3/2 

15.  y = (3.6  + 4.0  lnx)/x 

17.  v = cos  (lnx)  + sin  (lnx)  19.  y = — 0.525x5  + 0.625x  3 

Problem  Set  2.6,  page  79 

3.  W=  -2.2e~3x  5.  W = -x4  7.  W = a 

9.  y"  + 25 y = 0,  W = 5,  y = 3 cos  5x  — sin  5x 
11.  y"  + 5y  + 6.34  = 0,  W = 0.3e"5x,  3e-2'5  cos  0.3x 
13.  y"  + 2y'  = 0,  W = ~2e~Zx,  y = 0.5(1  + e~2x) 

15.  y"  - 3.24y  = 0,  W = 1.8,  y = 14.2  cosh  1.8x  + 9.1  sinh  1.8x 

Problem  Set  2.7,  page  84 

1.  y = c\e~x  + c2e-4x  — 5e-3x  3.  y = Cie-2x  + c2e-x  + 6x2  — 18x  + 21 

5.  y = (ci  + c2x) e~2x  + ^e~x  sinx  7.  y = C\e~x^2  + c2e-3x^2  + |ex  + 6x  — 16 
9.  y = Cle4x  + c2e-4x  + 1.2xe4x  - 2ex 
11.  y = cos(V3x)  + 6x2  — 4 
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13.  y = ex/4  - 2ex/2  + h~x  + ex  15.  y = In  x 
17.  y = e~01x  (1.5  cos  0.5x  - sin  0.5x)  + 2e03x 


Problem  Set  2.8,  page  91 

3.  vp  = 1.0625  cos  2 ? + 3.1875  sin  2 ? 

5.  vp  = — 1.28  cos  4.5?  + 0.36  sin  4.5? 

7.  Vp  = 25+5  cos  + sin  3? 

9.  v = e~15t(A  cos  ? + B sin  ?)  + 0.8  cos  ? + 0.4  sin  ? 

11.  y = A cos  V2?  + B sin  V2?  + ?(sin  V2?  — cos  V2?)/(2V2) 

13.  y = A cos  ? + B sin  ? — (cos  a>t)/(co2  — 1) 

15.  y = e~2t(A  cos  2?  + B sin  2?)  + 5 sin  2? 

17.  y = | sin  ? — 515  sin  3?  — 555  sin  5? 

19.  v = e_t(0.4  cos  ? + 0.8  sin  ?)  + e-t/2(— 0.4  cos \t  + 0.8  sin  2?) 

25.  CAS  Experiment.  The  choice  of  w needs  experimentation,  inspection  of  the  curves 
obtained,  and  then  changes  on  a trail-and-error  basis.  It  is  interesting  to  see  how  in 
the  case  of  beats  the  period  gets  increasingly  longer  and  the  maximum  amplitude  gets 
increasingly  larger  as  u>/(2tt)  approaches  the  resonance  frequency. 

Problem  Set  2.9,  page  98 

1.  Rl'  + I/C  = 0,  I = ce_t/(RC) 

3.  LI'  + RI  = E,  I = ( E/R ) + ce~Rt/L  = 4.8  + ce~4()l 

5. 1 = 2 (cos  ? — cos  20?)/399 

7. 10  is  maximum  when  5 = 0;  thus,  C = 1/(<w2L). 

9. 1 = 0 11. 1 = 5.5  cos  10?  + 16.5  sin  10?  A 

13. 1  = e~5t{A  cos  10?  + B sin  10?)  — 400  cos  25?  + 200  sin  25?  A 

15.  R > Rcril  = 2 \Zl/C  is  Case  I,  etc. 

17.  E( 0)  = 600,  /'( 0)  = 600,  / = e-3t(-100  cos  4?  + 75  sin  4?)  + 100  cos  ? 

19./?  = 2 a,  L = 1 H,  C=  J^F,  £ = 4.4  sin  10?  V 

Problem  Set  2.10,  page  102 

1.  y = A cos  3 x + B sin  3x  + jj(cos  3x)  In  | cos  3x  \ + \x  sin  3x 

3.  y = Cjx  + C2X2  — x sin  x 5.  y = A cos  x + B sin  x + |v:(cos  x + sin  x) 

7.  y = (ci  + C2x)e2x  + x~2e2x  9.  y = (ci  + C2x)ex  + 4x1^2 ex 

11.  y = C]X2  + C2.r3  + l/(2x4)  13.  y = c\x  3 + C2X3  + 3x5 


Chapter  2 Review  Questions  and  Problems,  page  102 


n — 4.5x  . —3.5a; 

7 . y = c\e  + c^e 

11.  y = (c1  + c2x)e0  8x 

15.  y = C\e2x  + C2e~x^2  — 3x  + x2 

19.  y = 5 cos  4x  — | sin  4x  + ex 

23. 1 = -0.01093  cos  415?  + 0.05273 


9.  y = e 3t(A  cos  5 x + B sin  5x) 

13.  y = cut-4  + C2X3 
17.  y = (d  + c2x)e15x  + 0.25x2e15x 
21.  y = — 4x  + 2x3  + 1/jc 
sin  415?  A 


App.  2 Answers  to  Odd-Numbered  Problems 
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25. 1 = (50  sin  At  — 110  cos  At)  A 

27.  /?LC-circuit  with  R = 20  O,  L = A H,  C = 0.1  F,  E = —25  cos  At  V 
29.  a)  = 3.1  is  close  to  co0  = \fkjm  = 3,  y = 25  (cos  3 1 — cos  3. It). 


Problem  Set  3.1,  page  111 

9.  Linearly  independent  11.  Linearly  independent 

13.  Linearly  independent  15.  Linearly  dependent 

Problem  Set  3.2,  page  116 

1.  y = ci  + c2  cos  5x  + C3  sin  5x  3.  y = c±  + c2x  + C3  cos  2x  + C4  sin  2x 

5.  y = Ai  cos  x + Bi  sin  x + A2  cos  3x  + sin  3x 

7.  y = 2.398  + e~16x  (1.002  cos  1.5x  - 1.998  sin  1.5x) 

9.  y = Ae~x  + 5e~x^2  cos  3x  11.  y = cosh  5x  — cos  Ax 

13.  y = e0  25x  + A.3e~°'7x  + 12.1  cos  O.lx  - 0.6  sin  O.lx 


Problem  Set  3.3,  page  122 

1.  y = (ci  + c2x  + C3X2)e-x  + \ex  — x + 2 

3.  y = Ci  cos  x + c2  sin  x + C3  cos  3x  + C4  sin  3x  + 0. 1 sinh  2x 

5.  y = c ix  + c2x  + C3X  — 12  x 

7 . y = (ci  + c2x  + c$x2)e3x  — 4 (cos  3x  — sin  3x) 

9.  y = cos  x + 2 sin  Ax  11.  y = e~3x(—l.A  cos  x — sin  x) 

13.  y = 2 — 2 sin  x + cos  x 


Chapter  3 Review  Questions  and  Problems,  page  122 
7.  y = Ci  + e~2x(A  cos  3 x + B sin  3x) 

9.  y = ci  cosh  2x  + c2  sinh  2x  + C3  cos  2x  + C4  sin  2x  + cosh  x 

11.  y = (ci  + c2x  + C3X2)e-1'5'T  13.  y = (ci  + c2x  + c^x2)e~2x  + x2  — 3x  + 3 

15.  y = cix  + c2x1/2  + c3x3/2  - 30  17.  y = 2e~2x  cos  4x  + 0.05  x - 0.06 

19.  y = 4e-4x  + 5e-5x 


Problem  Set  4.1,  page  136 
1.  Yes 

5.  y\  = 0.02(-yi  + y2),  y2  = 0.02(yi  - 2y2  + y3),  y2  = 0.02(y2  - y3) 

7.  Ci  =1,  c2  — —5  9.  ci  = 10,  c2  = 5 

11.  y'\  = y2,  y2  = yi  + + X.V2,  y = ci[l  4]Tc4t  + c2  [1  -i]Tc_t/4 
• y 1 - J2,  = 24^1  - 2^2,  3^1  - C±e  + c2e  - y,  y2  = y 

15.  (a)  For  example,  C = 1000  gives  -2.39993,  -0.000167.  (b)  -2.4,  0. 

(d)  a22  = —4  + 2 v'6.4  = 1.05964  gives  the  critical  case.  C about  0.18506. 
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Problem  Set  4.3,  page  147 


1. 

3. 

5.  3 ! 

y2 

7-yi 

,V2 

>’3 

9.Vi 

y2 

y3 

11.  V! 

>'2 


_ —2 1 i 2 1 o —2t 

- c-^e  + c2e  , y2  - ~3 cxe 
= 2ciezt  + 2 c2,  y2  = c xe2t  ~ c2 
= 5 Cl  + 2 c2c14'5t 

0 i c 14.5 t 
= —2ci  + 5c2e 

= —c2  cos  V2 1 + C3  sin  V2 1 + ci 
= C2  V2  sin  V2t  + C3  cos  V2f 
= C2  cos  V2 1 — C3  sin  V2f  + ci 
= ^Clc-18t  + 2c2e9t  - c3c18t 

— Cie  + c2e  + c2e 

CiC  - 2c2c  - 2C3C 

= -20e‘  + 8e_t/2 
= 4e*  - 4e_t/2 


+ c2e 


2t 


13.  yi  = 2 sinh  t,  y2  = 2 cosh  t 

1C  It 

15.  y 1 = 

1 t 
y2  = 2« 

17.  32  = yl  + y^  32  = >’1  + >’i  = -31  - ,V2  = -yi  - (31  + yi), 

3"  + 2y'i  + 23!  = 0,  31  = e_t(A  cos  t + B sin  t), 

32  = 31  + 31  = e_t{B  cos  t — A sin  t ).  Note  that  r2  = yf  + y\  = e~zt{A2  + B2). 
19.  /1  = Cie_t  + 3c2c_3t,  I2  = — 3cie_t  — C2e_3t 


Problem  Set  4.4,  page  151 

1.  Unstable  improper  node,  31  = cie\  y2  = c2e2t 

3.  Center,  always  stable,  y\  = A cos  3 1 + B sin  3f,  32  = 3 B cos  3 1 — 3 A sin  3 1 
5.  Stable  spiral,  y^  = e~zt(A  cos  2 1 + B sin  2 r),  32  = e~zt(B  cos  2t  — A sin  2 1) 

7.  Saddle  point,  always  unstable,  31  = cie_t  + c2e3t,  y2  = — Cie_t  + C2e3t 
9.  Unstable  node,  31  = Cie6t  + c2ezt,  y2  = 2c±e6t  — 2c2ezt 
11.  3 = e~l  (A  cos  t + B sin  t).  Stable  and  attractive  spirals 
15.  p = 0.2  A 0 (was  0),  A <0,  spiral  point,  unstable. 

17.  For  instance,  (a)  —2,  (b)  —1,  (c)  = — (d)  =1,  (e)  4. 

Problem  Set  4.5,  page  159 

5.  Center  at  (0,  0).  At  (2,  0)  set  31  = 2 + 31.  Then  32  = 3i-  Saddle  point  at  (2,  0). 

7.  (0,  0),  31  = —31  + 32,  32  = —31  — 32,  stable  and  attractive  spiral  point;  (—2,  2), 
3i  = -2  + 31.,  32  = 2 + 32,  31  = -31  ~ 3y2,  y2  = -31  - 32,  saddle  point 
9.  (0,  0)  saddle  point,  (—3,  0)  and  (3,  0)  centers 
11.  (2 77  ± 27177 , 0)  saddle  points;  (— ^77  ± 2«77,  0)  centers. 

Use  —cos  (±2tt  + 3!)  = sin  (±31)  ~ ±3 1. 

13.  (±2/777  , 0)  centers;  y1  = ( 2n  + 1)77  + y[,  (77  ± 2/777,  0)  saddle  points 

f 3 f 

15.  By  multiplication,  3232  = (4yi  — 31)31.  By  integration, 

3!  = 43?  - 231  + c*  = 2(c  + 4 - 3i)(c  - 4 + 31),  where  c*  = \cz  - 8. 

Problem  Set  4.6,  page  163 

? -t,t  -t  i t 3t 

3.  yi  = c\e  + c^e  , y 2 = — c^e  + c^e  — e 

5.  31  = cic5t  + c2ezt  - 0.43 1 - 0.24,  32  = cie5t  - 2c2e2t  + 1.127+  0.53 
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All 


7.  V!  = cy4  + 4c2e2t  — 3?  — 4 — 2e  f,  y2  = — cy*  — 5c2e2t  + 5t  + 7.5  + e 1 

9.  The  formula  for  v shows  that  these  various  choices  differ  by  multiples  of  the  eigen- 

vector for  A = — 2,  which  can  be  absorbed  into,  or  taken  out  of,  ci  in  the  general 
solution  y(W. 

11.  yi  = — | cosh  f — 4 sinh  t + ^ e2t , y2  = — I sinh  t — § cosh  t + |e2t 

13.  yi  = cos  2 1 + sin  2f  + 4 cos  t,  y2  = 2 cos  2t  — 2 sin  2f  + sin  f 

15.  y±  = 4e_t  — 4ef  + e2t,  y2  = —4e~t  + t 
17.  4 = 2cyAlt  + 2c2eAzt  + 100, 

/2  = (1.1  + V0.41)ryA|t  + (1.1  - V(X4T)  c2eAzt, 

Ai  = -0.9  + V04l,  A2  = -0.9  - VO Al 
19.  Cl  = 17.948,  c2  = -67.948 


Chapter  4 Review  Questions  and  Problems,  page  164 

11.  >’i  = <y4t  + c2e~4',  y2  = 2<y4t  — 2c2e-4t.  Saddle  point 

13.  yi  = e~4t(A  cos  t + B sin  t ),  y2  = § e~4t[(B  — 2 A)  cos  t — (A  + 2 B)  sin  f]; 

asymptotically  stable  spiral  point 
15.  \’i  = cy  _5t  + c2e~t,  y2  = cy_5t  — c2e~t.  Stable  node 

17.  yj  = e~l(A  cos  2 1 + B sin  2 1),  y2  = e^iB  cos  2 1 — A sin  2f).  Stable  and  attractive 

spiral  point 


19.  Unstable  spiral  point 

21.  V! 

= Cle~4t  + c2e4t  - 

- 1 - 8r2,  y2 

= -cy_4t  + c2e4t  - 4r 

23.  y! 

= 2cy_t  + 2c2e3t 

+ cos  t — sin  t, 

V2  = -C!?-4  + c2e3t 

25.  ![ 

+ 2.5(7],  - I2)  = 169  sin  t,  2.5(72 

- I[)  + 2572  = 0, 

h 

= (19  + 32.5  t)e~5t 

— 19  cos  t + 62.5  sin  t, 

h 

= (-6  - 32.5f)e_5t  + 6 cos  t + 2.i 

5 sin  t 

27.  (0,  0)  saddle  point;  (—1,  0),  (1,  0)  centers 

29.  (mr,  0)  center  when  n is  even  and  saddle  point  when  n is  odd 


Problem  Set  5.1,  page  174 

3.  V\k\ 

5.  V3/2 

7.  y = a0(  1 - x2  + x4/2!  - x6/3!  + - 


9.  y = a0  + flix  - | 


t 3 , 
6«1*  + 


11.  fl0(l  “ y*4 

13.  flo(l  — h*2  ~ 


“ (m  + l)(m  + 2) 

15.  2 V- V 


2a0x 
1 5 

60  x 

14  , T3  6 , 

9dx  i 790-1 


\ —X 

" ) — aoe 

= ciq  cos  x + ai  sin  x 
) + ai(x  + \x2  + gx3  + z^x4  — 
) + ai(x 


Ir  3 
6x 


1 5 , 

X + 


24 


JL  5 _ 
24* 

— r7 

1008* 


5 3 


m=5 

2 4 


(m  — 4)2 

2 — xm 

(m  - 3)!' 

+ ■?(!)  = 


923 

768 


m= 1 ("*+!)  + 1 
17.  s = 1 + x — j 

19.  j = 4 — x2  — gx3  + go x5,  s(2)  = — |;  but  x = 2 is  too  large  to  give  good 

values.  Exact:  y = (x  — 2)2ex 


Problem  Set  5.2,  page  179 

5.  P6(x)  = 113(23 lx6  - 315x4  + 105x2  - 5), 
Pj(x)  = iii(429x7  - 693x5  + 315x3  - 35x) 
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11.  Set  x = az.  y = C\Pn{xj  a)  + c2<2„(x/a) 

15.  P\  = Vl  ~ x2,  P\  = 3xVl  - x2,  Pf  = 3(1  - x2), 
Pf  = (1  - x2)(105x2  — 15)/2 


Problem  Set  5.3,  page  186 

2 4 

3.  Vi  — 1 1 + 

yi  3!  5! 


sin  x 1 x x 

Vo  — b 

x •7Z  x 2!  4! 


5.  £>0  =1-  c0  = 0 

- ,,12  13 

7.  Vl  = 1 + 2X  - 6 

1 ..4 


r2  = 0, 


2-t 

J2  = x + 6X°  — T2X'*  + 120 x 


y 1 = e 

i 4 _ 

24 x 30 
1 ..5  _J_ 

120 


-x 


4x5  + 


y2  = e 
1 6 
144  X 


In  x 


x6  + 


9.  yiVx,  y2  = 1 + x 
U.yi=ex,  y2  = ex/x 
13.yi  = ex,  y2  = ex  lnx 
15.  y = AF{\,  1,  4;  x)  + Px3/2P(§,  |;  x) 

17.  y = A{  1 - 8x  + f x2)  + Px3/4P(|, 

19.  y = ClP(2,  -2,  t - 2)  + c2(f  - 2)3/2P(l,  -±  |;  t - 2) 


5 7 \ 

4,  4,  X) 


COS  X 


Problem  Set  5.4,  page  195 

3.  Ci70(Vx) 

5.  ci./v(Ax)  + c2/_v( Ax),  v =A  0,  ±1,  ±2,  • • • 

7.  Ci/i/2(2x)  + c2/_i/2(2x)  = x_1/2(ci  sin  2x  + c2  cos  2x) 

9.  x_v(ci7v(x)  4-  c2/_v(x)),  v # 0,  ±1,  ±2,  ■ • • 

13.  Jn(x  j J = Jn(x2)  = 0 implies  x \nJn(x  \ ) = x2‘”7n(x2)  = 0 and 
[x~nJn{x)\  = 0 somewhere  between  Xi  and  x2  by  Rolle’s  theorem. 

Now  use  (21b)  to  get  /n+1(x)  = 0 there.  Conversely,  /n+1(x3)  = /n+i(x4)  = 0, 
thus  X3+1iTO+i(x3)  = X4 +1/n+i(x4)  = 0 implies  Jn(x ) = 0 in  between  by  Rolle’s 
theorem  and  (21a)  with  v = n + 1. 

15.  By  Robe,  j'o  = 0 at  least  once  between  two  zeros  of  J0.  Use  j'0  = —J\  by  (21b) 
with  v = 0.  Together  = 0 at  least  once  between  two  zeros  of  J0.  Also  use 
(xJi)'  = xj()  by  (21a)  with  v = 1 and  Robe. 

19.  Use  (21b)  with  v = 0,  (21a)  with  v = 1,  (21d)  with  v = 2,  respectively. 

21.  Integrate  (21a). 

23.  Use  (21a)  with  v = 1,  partial  integration,  (21b)  with  v = 0,  partial  integration. 

25.  Use  (2 Id)  to  get 


J${x)  dx 


—2J^(x)  + 


Js(x)  dx 


—2J^(x)  — 2 /2(x)  + 


— 2/4(x)  — 2 ./2(x)  — Jq(x)  + c. 


A(x)  dx 


Problem  Set  5.5,  page  200 

1.  C\J$(x)  + C2l4(x) 

3.  Ci72/3(x2)  + c2y2/3(x2) 

5.  ci7o(Vx)  + c21q(Vx) 
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7.  Vt(ci/i/4(|far2)  + c2ii/4(5far2)) 
9.  x3(ci/3(x)  + c2Y3(x )) 

11.  Set  //(1)  = £H(2)  and  use  (10). 

13.  Use  (20)  in  Sec.  5.4. 


Chapter  5 Review  Questions  and  Problems,  page  200 

11.  cos  2x,  sin  2x 

13.  (x  — I )-5,  (x  — l)7;  Euler-Cauchy  with  x — 1 instead  of  x 
15.  (x) , ./_  v,,(x) 

17.  ex,  1 + x 

19.  Vx/i(Vx),  VxFi(Vx) 


Problem  Set  6.1,  page  210 

1.  3/s2  + 12 /s 
5.  l/((5  - 2)2  - 1) 

s 5 

13.  0 - 

s 

19.  Use  eat  = cosh  at  + sinh  at. 
23.  Set  ct  = p.  Then  -£(  f(ct))  = 

25.  0.2  cos  1.8 1 + sin  1.8? 

29.  2t3  - 1.9 r5 
2 


3.  S/(S2  + 7T2) 

7.  (co  cos  9 + s sin  0)/(s2  + <w2) 


11. 


15. 


1 — e 


—bs 


be 


-bs 


s 

e~s  - 1 


2 s 


e-+l- 
2s  s 


Jo 


33. 


(s  + 3)d 


e stf(ct)  dt  = 

27.  — cos 
L2 

31. 


35. 


e (s/c)p/(/?)  dp/c  = F(s/c)/c. 
Jo 

llTTt 

L 

4 3_ 

5 — 2 s + U 

0.5  • 2t t 


= 4e2t  - 3e_t 


37.  Trte~Trt 
41.  e~5lrt  sinh  lit 
45.  (k0  + kit)e~at 


(s  + 4.5)2  + 4772 
39.  lt3e~tV2 
43.  e3t(2  cos  37  + § sin  3 1) 


Problem  Set  6.2,  page  216 

1.  y = 1.25e-5'2t  - 1.25  cos  2 1 + 3.25  sin  2 1 

3.  ( 5 - 3)(5  + 2)  = 1 Is  + 28  - 11  = 1 Is  + 17,  Y = 10/(s  - 3)  + l/(s  + 2), 
y = 10e3t  + e~2t 

5.  (s2  — |)7  = 12s,  y = 12  cosh \t 

'•  y 2e  ' 2e  ' 2e  y.  y = e — e + 2t 

11.  (s  + 1.5)27  = s + 31.5  + 3 + 54/s4  + 64/s, 

7 = l/(s  + 1.5)  + l/(s  + 1.5) 2 + 24/s4  - 32/s3  + 32/s2, 
y = (1  + f)e-1'5t  + 4 13  - 16 12  + 32 1 
13.  t = t - 1,  7 = 4/(s  - 6),  y = 4e6t,  y = 4e6(t+1) 
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15.  t = 7 + 1.5,  (s  - 1)0  + 4 )Y  = 4s  + 17  + 6/0  - 2),  >■  = 3e4 


+ e 


17. 


1 


19. 


2wz 


s(s2  + 4m2) 


(5  + a) 

21.  ^£(/,)  = ££(sinh  2?)  = si£(/)  — 1.  Answer:  0 2 
23.  12(1  - e_4/4) 


2)/  (s3 
25.  (1  — cos  m7)/m2 


4s) 


27.  g(l  + 7 — cos  3?  — g sin  3 7) 

Problem  Set  6.3,  page  223 
3.  £((t  - 2)u(t  - 2))  = e~2s/s2 


29.  —Ae 

a A 


-at 


i)  + - 
7 a 


5.  | e \ 1 — ;/(  ? — —77 


- ( J _ e-rrs/2  + Tr/2 ) 


7. — - — 

S + 77 

9.  e 


(e 


-2(s  + 7r) 


e 

3s/2(  A + A + i 
s3  s2  5 


-4(s  + 7r)\ 


13.  2[1  + u(t  — 77)]  sin  3/ 

17.  e-4  cos  7 (0  < t < 277 ) 

21.  sin  3t  + sin  t (0  < t < 77);  § sin  3 1 ( t > 77) 

sin  t 0 > 1 ) 


11.  (se_  175/2  + e_17S)/(s2  + 1) 

15.  0 - 3)3m(7  - 3)/6 
19.  ^(e4  - l)3e-54 

23.  e4  — sin  t (0  < t < 277),  e4  — | sin  2?  (7  > 277) 

25.  r — sin  f (0  < t < 1),  cos  ( t — 1)  + sin  (r  — 1) 

27.  t = 1 + F,  y"  + Ay  = 8(1  + f)2(l  - u(t  - 4)),  cos  27  + 2 1 

cos  2 1 + 49  cos  (2 1 — 10)  + 10  sin  (2?  — 10)  if  t > 5 
29.  0.1/'  + 25 i = 490e_5t[l  - u(t  - 1)], 

i = 2 0(e  — e ) + 20u(t  — l)[—e  + e J 

31.  Rq  + q/C  = 0,  Q = £(q),  q(0)  = CV0,  i = q\t), 

R(sQ  - CV0)  + Q/C  = 0,  q = CV0e~V™ 


1 if  t < 5, 


I = e 


-2s, 


1 


s + 10/ 


i = 0 if  t < 2 and 


33.  10/  + — / = — e_2s, 

S 2 

1 - e“10('“2)  i ft  > 2 

35.  i = (10  sin  107  + 100  sin  ?)(n(?  — 77)  — u(t  — 377)) 

37.  (0.5s2  + 20)/  = 78s(l  + e_17S)/(s2  + 1), 

7 = 4 cos  7 — 4 cos  V/407  — 4m(7  — 77)[cos  ? + cos  (V40(7  — 77))] 

rt 

39.  i ' +2i  + 2 


7(7)  dr  = 1000(1  - 77(?  - 2)),  I = 1000(1  - e~zs)/{sz  + 2 s + 2), 


7 = 1000<?  4 sin  7 


1000t7(?  - 2)e_t  + 2 sin  (7  - 2) 


Problem  Set  6.4,  page  230 

3.  j = 8 cos  27  + 377(7  — 77)  sin  27 

5.  sin  ? (0  < 7 < 77);  0 (77  t 2 vi ) , sin  t (J/  2tt) 

7.  j = e-4  + 4e-34  sin  + \ u(t  - 2)e_3(4_1/2)  sin  (g7  - |) 

9.  37  = 0.1 [e4  + e~2t{— cos  7 + 7 sin  7)]  + 0.  1m(?  — 10)[—  e-4  + 
g-2t  + 30(cos  (f  _ 10)  _ 7 sin  (f  _ IQ))] 
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11.  y = -e  34  + e 24  + g u(t  - 1)(1  - 3e  2(t  15  + 2e  3(4  1))  + 
u(t  ~ 2)(e_2C4_2)  - e-3(t-2)) 

15.  ke~ps/(s  — se~ps ) ( s > 0) 


Problem  Set  6.5,  page  237 


1. ? 

5 .\t  sin  u>t 

9.  y — 1 * y = 1 , y = e 


13.  y(t)  + 2 


r y(r)  dr  = tet, 


y 


n4t  -1.5t 

. e — e 

21.  (cot  — sin  wt)/(o2 
25.  1.5?  sin  6? 


3.  (e4  — e t)/2  = sinh  ? 
7.  e4  — ? — 1 
11.  y = cos  ? 

sinh  ? 

19.  ? sin  77? 

23.  4.5(cosh  3?  - 1) 


Problem  Set  6.6,  page  241 


7. 


0 + 3)2 
2s3  + 24s 
(s2  - 4)3 


11. 


As 


77 

1_2>.2 


/ 2 I 1„Z\ 

(s  + 577  ) 


5. 

9. 


/ 2 1 2\2 

(i  + (O  ) 
TT(3s2  - 772) 

(,2  + 772)3 


15.  F(i) 


, m = 


17.  In  .v  - In  (5  - 1);  (-1  + e‘)/? 

19.  [In  02  + 1)  - 2 In  0 - 1)]'  = 2s/(sz  + 1)  - 2/0  - 1);  2(-cos  ? + e4)/? 


Problem  Set  6.7,  page  246 

•3  —5 1 1 a 2t  —5 1 . q 2 1 

3.  yi  = — e + 4e  , = e + 3e 

5.  y\  = —cos  ? + sin  ? + 1 + m(?  — 1)[— 1 + cos  (?  — 1)  — sin  (?  — 1)] 
= cos  ? + sin  ? — 1 + m(?  — 1 )[  1 — cos  (?  — 1)  — sin  (?  — 1)] 

7.  >’!  = —e~2t  + 4e4  + |m(?  - l)(-e3-24  + e4), 
y2  ~ ~e  + e + — 1)(— £ + e ) 

9.  >’i  = (3  + 4?)e34,  y2  = (1  - 4?)e34 

nt  . 2t  2t 

. yi  = e ^ , y2  = e 

13.  vi  = — 4e4  4-  sin  10?  + 4 cos  ?,  yi  = 4e4  — sin  10?  + 4 cos  ? 

15.  yi  = e4,  = <?-4,  y3  = et  ~ e-4 

19.  4ij  + 8(?i  — (2)  + 2i[  = 390  cos  ?,  8/2  + 8(?2  — 0)  + 4?2  = 0, 
ii  = — 26e~24  — 16e~84  + 42  cos  ? + 15  sin  ?, 
i 2 = — 26e~24  + 8e~84  +18  cos  ? + 12  sin  ? 


Chapter  6 Review  Questions  and  Problems,  page  251 


s - 4 sz  - 1 
15.  e~3s  + 3/2/(s  - |) 


13.  g(l  — cos  77?),  7TZ/(2s3  + 2772i) 

17.  Sec.  6.6;  2s2/02  + l)2 


05|M 


A16 
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19.  12/Cy2Cj  + 3)) 
23.  sin  ( cot  + 0) 


27.  e 2tO  cos  t — 2 sin  t) 

31.  e-t  + u(t  - 77)[1.2  cos  t - 3.6  sin  t + 2e~t  + 7r  - 0.8e2t-27T] 


21.  tu(t  — 1) 

25.  3 12  + t3 

29.  y = e-2t(13  cos  t + 1 1 sin  f)  + lOf  — 
+ 2e 

2(4-2 } ( t > 2) 


33.  0 (0  g t g 2),  1 - 2e-tt-2)  + e 

35.  — e , y2  = e — e 

37.  = cos  t — u(t  — 7t)  sin  t + 2 u(t  — 2tt)  sin2  \ t, 

y2  = — sin  t — 2 u(t  — tt)  cos  ^ + u(t  — 2tt)  sin  t 
39.  V!  = (1/ VTO)  sin  VlOf,  y2  = -(1/ VTO)  sin  VIO t 
41.  1 - e~U0<t<  4),  (e4  - l)e-t  (t  > 4) 

43.  i(t)  = e-4t(2gcos  3f  — gg  sin  3 1)  — ggcos  lOf  + ggsin  lOt 
45.  5/1  + 20(i!  - /2)  = 60,  30/2  + 20 (/2  - i[)  + 20/2  = 0, 
i1  = — 8e-24  + 5e-0'8t  + 3,  i2  = -4e~2t  + 4e~08t 


Problem  Set  7.1,  page  261 


3. 

3X3,  3X4, 

3X6,  2X2, 

2X3, 

3X2 

5. 

B = §A,  j^jA 

7. 

No,  no,  yes, 

no 

, no 

" 0 6 12" 

0 2.5 

f 

" 0 8.5 

13~ 

9. 

18  15  15 

2.5  1.5 

2 

20.5  16.5 

17 

3 0-9 

-1  2 

-1 

2 2 

-10 

, undefined 


11. 


0 

34 

28 


26 

32 

-10 


same. 


5.4  0.6 
-4.2  2.4 
-0.6  0.6 


same 


13. 


70  28 

-28  56 

14  0 


, same,  — D,  undefined 


15. 


5.5 

33.0 

-11.0 


same,  undefined,  undefined  17. 


-4.5 

-27.0 

9.0 


Problem  Set  7.2,  page  270 
5.  10,  n(n  + l)/2 


l 

O 

1 

"l  l" 

l 

O 

O 

1 

’ 

0 0 

7.  0,  I, 
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10 

-14 

—6 

10 

-5 

-15 

11. 

-5 

7 

-12 

, same, 

-14 

7 

-33 

, same 

-5 

-1 

-4 

-2 

-4 

-4 

1 2 0 

-9  -5 

-9 

13. 

2 13  -6 

’ 

3 -1 

, undefined. 

-5 

_0  -6  4 

4 °_ 

4 

0 


15.  Undefined, 


-4  , 
-3 


[7  -1 


3],  same 


-30 

-18 

22 

17. 

45 

9 

, undefined, 

4 

, undefined 

5 

-7 

-12 

" 10.5" 

7" 

19.  Undefined, 

0 

, 

-3 

, same 

-3 

1 

25.  (d)  AB  = (AB)t  = BtAt  = BA;  etc. 

(e)  Answer.  If  AB  = —BA. 

29.  p = [85  62  30]t,  v = [44,920  30,940]T 


Problem  Set  7.3,  page  280 

1.  x = —2,  y = 0.5  3.  x = 1,  y = 3,  z = — 5 

5.  x = 6,  y = —7  7.  x = —3 1,  y = t arb.,  z = 2t 

9.  x = 3t  — 1,  y = — t + 4,  z = t arb. 

11.  w = 1,  x = u arb.,  y = 2t2  — U,  z = r2  arb. 

13.  w = 4,  x = 0,  y = 2,  z = 6 17.  /i  = 2,  /2  = 6,  I3  = 8 

19.  / ! = (tfx  + R2)E0/(R1R2)  A,  /2  = £0/^i  A,  /3  = £o/«2  A 
21.  x2  = 1600  — jti,  X3  = 600  + xi,  X4  = 1000  — x±.  No 
23.  C:  3xi  — x3  = 0,  H:  8x1  — 2x4  = 0,  0: 2x2  — 2x3  — x4  = 0,  thus 
C3H8  + 502  3C02  + 4H20 

Problem  Set  7.4,  page  287 

1.1;  [2  -1  3];  [2  -1]T  3.3;  {[3  5 0],  [0  3 5],  [0  0 1]} 

5.3;  {[2  -1  4],  [0  1 -46],  [0  0 1]};  {[2  0 1],  [0  3 23], 

[0  0 1]} 


A18 
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4],  [0  2 0] 
1 0] 


7.  2;  [8  0 4 

0], 

[0  2 0 

4];  [8  0 

9.  3;  [9  0 1 

0], 

[0  9 8 

9],  [0  0 

11.  (c)  1 

17.  No 

19.  Yes 

21.  No 

23.  Yes 

25.  Yes 

27.  2,  [-2  0 

29.  No 

1], 

[0  2 1] 

31.  No 

33.  1,  solution  of  the  given  system  c[l  3],  basis  [1  Tp  3] 
35.1,  [4  2 | 1] 


Problem  Set  7.7,  page  300 

7.  cos  (a  + /3) 

11.  40 
15.  -64 
19.  2 

23.  x = 0,  y = 4,  z = ~ 1 

Problem  Set  7.8,  page  308 


1. 


5. 


9. 


1.20 

4.64 

0.50 

3.60 

1 

0 

-2 

1 

3 

-4 

0 

0 

l 

8 

0 

0 

l 

4 

1A-1 

= I, 

>lem  Set  73 

1 0] 

T,  [0 

[1 

11 

0 

0 

1 


-i,-i 


7.  Dimension  2,  basis  xe  x,e~ 


9.  1 

13.  289 
17.  2 


21.x 

= 3.5, 

y = -1.0 

25.  w 

= 3, 

X 

II 

© 

Vi 

II 

54 

0.9 

-3.4~ 

3. 

2 

0.2 

— 0.2 

-30 

-0.5 

2 

7.  A-1  = A 


11.  (A2)-1  = (A-1)2 


3.760  22.272 

2.400  15.280 


A 1 = I.  Multiply  by  A from  the  right. 


-if;  [1  if,  [-1  if 

5.  No 


1 o' 

0 

f 

0 -1 

’ 

0 

0 

’ 

0 

1 


11.  X!  = 5yi  - y2,  x2  = 3yi  - y2 

13.  x1  = 2y±  - 3 y2,  x2  = -10yi  + 16y2  + y3,  x3  = ~7yx  + lly2  + y3 


0 

0 
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15.  V26 
19.  1 

23.  a = [3 
25.  a = [5 


17.  V5 

21.  k = -20 

1 — 4]t,  b = [-4  8 -1]T,  || a + b[|  = VT()7  g 5.099  + 9 

3 2]t,  b = [3  2 -1]T,  90  + 14  = 2(38  + 14) 


Chapter  7 Review  Questions  and  Problems,  page  318 


~ -1  6 f 

1 18  13~ 

11. 

-18  8 -7 

» 

—6  -8  2 

-13  -2  -7 

-17  7 

13.  [21  -8  — 31  |T, 

[21 

-8  31] 

15.  197,  0 

17.  -5,  det  A2  = (det  A)2  = 25,  0 


-2 

-12 

-12 

19. 

-12 

16 

-9 

21.  x = 4,  y = 

-2,  z=8 

-12 

-9 

-14 

23. 

x = 6, 

y = 

2 1 + 2, 

z = t arb. 

25.  x = 0.4,  y = 

-1.3,  2 = 

1.7 

27. 

r = 10, 

y = 

-2 

29.  Ranks  2,  2, 

OO 

31.  Ranks  2, 

2, 

1 

33.  h = 16.5  A, 

h = 11  A, 

h 

35.  h = 4 A,  I2  = 5 A,  I3  = 1 A 


Problem  Set  8.1,  page  329 


1. 3,  [1  0]T;  -0.6,  [0  1]T  3. -4,  [2  9]T;  3,  [1  1]T 

5. -3/,  [1  -;];  3;,  [1  ;|.  ; - V~] 

7.  A2  = 0,  [1  0]T 

9.  0.8  + 0.6;,  [1  — ;]T;  0.8  - 0.6;,  [1  ;]T 

11.  -(A3  - 18A2  + 99A  - 1 62)/ (A  - 3)  = -(A2  - 15A  + 54);  3,  [2  -2 

6,  [1  2 2]t;  9,  [2  1 -2]T 

13.  -(A  - 9)3;  9,  [2  -2  1]T,  defect  2 

15.  (A  + 1)2(A2  + 2A  - 15);  -1,  [1  0 0 0]T,  [0  1 0 0]T; 

-5,  [-3  -3  1 1]T,  3,  [3  -3  1 -1]T 

'o  -l" 


17. 


1 


0 


. Eigenvalues  ;,  — ;.  Corresponding  eigenvectors  are  complex. 


19. 


indicating  that  no  direction  is  preserved  under  a rotation. 

V 

0,  . A point  onto  the  x2-axis  goes  onto  itself. 


0 0 
0 1 


1, 


0 


1] 


T. 


a point  on  the  Xi-axis  onto  the  origin. 

23.  Use  that  real  entries  imply  real  coefficients  of  the  characteristic  polynomial. 


A20 
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Problem  Set  8.2,  page  333 

1.  1.5,  [1  -l]7, -45°;  4.5,  [1  1]T,45° 

3.  1,  [-1/V6  1]T,  112.2°;  8,  [1  1/V6]7  22.2° 

5.  0.5,  [1  — 1]T;  1.5,  [1  1]T;  directions -45°  and  45° 

7.  [5  8]t 
9.  [11  12  16]t 

11.  1.8 

13.  c[10  18  25]t 

15.  x = (I  - A)_1y  = [0.6747  0.7128  0.7543]7 

17.  Axj  = A jXj  ( Xj  ± 0),  (A  — kl )Xj  = AjXj  — kxj  = (A j — k)xj. 

19.  From  Axj  = A.;x;/-  (xj  ± 0)  and  Prob.  18  follows  kpAPxj  = kp\fxj  and 

kqAq\j  = k qXjXj  (p  ± 0,  q 0,  integer).  Adding  on  both  sides,  we  see  that 
kp Ap  + kqAq  has  the  eigenvalue  kv\f  + kqAj . From  this  the  statement  follows. 


Problem  Set  8.3,  page  338 

1.  0.8  ± 0.6;',  [1  ±;']T;  orthogonal 

3.  2 ± 0.8/,  [1  ±/].  Not  skew-symmetric! 

5.  1,  [0  2 1]T;  6,  [1  0 0]T,  [0  1 -2]T;  symmetric 

7. 0,  ±25/,  skew-symmetric 

9.  1,  [0  1 0]T;  /,  [1  0 /]T;  -/,  [1  0 -/]T,  orthogonal 

15.  No  17.  A-1  = (-A7)-1  = — (A-1)7 

19.  No  since  det  A = det  (A7 ) = det  (—A)  = (—  l)3det  (A)  = — det  (A)  = 0. 


Problem  Set  8.4,  page  345 


'-25 

12’ 

’3' 

Y 

’ -2 

’2" 

1. 

-50 

25 

, -5, 

5 

; 5, 

5 

; x = 

4 

1 

’ 3.008 

-0.544' 

’-17' 

’-2’ 

’25' 

10" 

3. 

5.456 

6.992 

, 4, 

31 

; 6, 

11 

; x = 

25 

* 

5 

4 

3 

-9" 

0 

1 

-1 

3 

0 

1 

5. 

0 

-5 

15 

, 0, 

3 

; 4, 

0 

; 10, 

1 

; x = 

0 

1 

1 

, 

-1 

0 

-5 

15 

1 

0 

1 

1 

0 

1 

1 

5 

2" 

5 

A 

"l 

-2’ 

'5 

0' 

2 

5 

1 

5. 

2 

1 

0 

0 

’-2 

l" 

A 

1 

l" 

’2 

0' 

3 

-1 

3 

2 

0 

-5 

11. 
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1 

0 

0" 

1 

0 

0 

4 

0 

0 

13. 

-2 

1 

0 

A 

2 

1 

0 

= 

0 

-2 

0 

1 

-2 

1 

_3 

2 

1 

_0 

0 

1 

1 

3 

1 

3 

1“ 

3 

”l 

-2 

O” 

~10 

0 

0" 

15. 

1 

3 

1 

6 

1 

6 

A 

1 

1 

-1 

= 

0 

1 

0 

0 

1 

2 

1 

2_ 

1 

1 

1 

0 

0 

5 

17.  C = 

’7  3' 

3 7 

’ 

4 y?  + 10y| 

= 200, 

x = 

1 

V2 

1 

-1 

l" 

1 

ellipse 


19.  C 


3 

11 


11 

3 


14 yi  - 8j|  = 0, 


1 

1 


pair  of  straight  lines 


21.  C 


23.  C 


1 -6 


-6 

1 

-11 

42 

42 

24 

7y\  ~ 5v|  = 70,  X 


1 

V2 


-1 


hyperbola 


, 5 2v  1 - 39y|  = 156, 


Vl3 


hyperbola 


Problem  Set  8.5,  page  351 

1.  Hermitian,  5,  [— i 1]T,  7,  [i  1]T 

3.  Unitary,  (1  - iV3)/2,  [-1  1]T;  (1  + iV3)/2,  [1  1]T 

5.  Skew-Hermitian,  unitary,  —i,  [0  —1  1]T,  i,  [1  0 0]T,  [0  1 1]T 

7.  Eigenvalues  — 1,  1;  eigenvectors  [1  — 1 ]T,  [1  1]T;  [1  — i]T,  [1  j]T; 

[0  1]T,  [1  0]T,  resp. 

9.  Hermitian,  16  11.  Skew-Hermitian,  —6; 

13.  (ABC)T  = CTBTAT  = C_1(— B)A 

15.  A = H + S,  H = \ (A  + AT),  S = \ (A  — AT)  (H  Hermitian,  S skew-Hermitian) 

19.  AAt  - AtA  = (H  + S)(H  - S)  - (H  - S)(H  + S)  = 2(-HS  + SH)  = 0 

if  and  only  if  HS  = SH. 


Chapter  8 Review  Questions  and  Problems,  page  352 


11.  3,  [1 
13.  3,  [1 
15.  0,  [2 

17.  -1,  1; 


i]T; 

5]t; 

-2 


A = 


2,  [1  -1]T 

7,  [1  1]T 

1]T;  9/,  [-1+3/  1 + 3 i 4]t;  -9 i,  [-1  - 3 i 


5 

-3" 

’23 

2 

_ 1 

-1 

l" 

-3 

5_ 

39 

1 

~ 8 

63 

1 

1 - 3/  4]t 
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1 

1 

-1 

21.  — 

3 

1 

-1 

0 

0 

1 

1 

4 

12' 

23.  C 

12 

-14 

’3.7 

1.6" 

25.  C 

= 

1.6 

1.3 

r 

’-0.9 

0 

2 

0 

0.6 

1 2 


1 4 


0 


0 


1 -1  1 


0 -20  0 


-1 


1 2 


0 


0 22 


lOy?  - 20y!  = 20, 


hyperbola 


4.5y2  + 0.5)|  = 4.5,  x 


1 

V5 


ellipse 


Problem  Set  9.1,  page  360 


1.5, 1,0;  V26;  [5/ V26,  1/ V26,  0] 

3.  8.5,  -4.0,  1.7;  V9Tl4,  [0.890,  -0.419,  0.178] 
5.  2,  1,  —2;  u = [§,  g,  — §],  position  vector  of  Q 
7.  Q:  (4,  0, 1),  |v|  = VT^25 

11.  [6,4,0],  [|l,0],  [-3, -2,0] 

15.  7[9,  -7,  8]  = [63,  -49,  56] 

21.  [4,  9,  -3],  VT06 

25.  [6,2,  -14]  = 2u,  V236 

29.  v = [Ui,  v2,  3],  Ui,  v2  arbitrary 

33.  |p  + q + u|  g 18.  Nothing 

35.  vB-  vA=  [-19,  0]  - [22/ V2,  22/ V2]  = [-19 

37.  u + v + p = [-£,  0]  + [/,  1]  + [0,  -1000]  = 0, 

0 + l - 1000  = 0,  l = 1000,  k = 1000 


9.  Q : (0,  0,  -8), 
13.  [1,5,  8] 

17.  [12,  8,  0] 

23.  [0,  0,  5],  5 
27.  p = [0,  0,  -5] 
31.  k = 10 

- 22/V2,  -22/V2] 

-k  + l + 0 = 0, 


M = 8 


Problem  Set  9.2,  page  367 

1. 44,  44,  0 3.  V35,  V320,  V86 

5.  | [2,  9,  9]  | = VI66  = 12.88  < V80  + V86  = 18.22 
7.  |— 24|  = 24,  |a||c|  = V35V86  = V30l0  = 54.86;  cf.  (6) 

9.  300;  cf.  (5a)  and  (5b)  13.  Use  (1)  and  |cos  y\  Si  1. 

15.  |a  + b|2  + |a  - b|2  = a*a4-2a,b  + b*b  + (a*a-2a*b  + b*b) 

= 2 1 a | 2 + 2|b|2 
17.  [2,  5,  0]  • [2,  2,  2]  = 14 
19.  [0,  4,  3]  • [-3,  -2,  1]  = -5  is  negative!  Why? 

21.  Yes,  because  W=(p  + q),d  = p*d  + q*d.  23.  arccos  0.5976  = 53.3° 
27.  j3  — a is  the  angle  between  the  unit  vectors  a and  b.  Use  (2). 

29.  y = arccos  (1 2/(6  Vl3))  = 0.9828  = 56.3°  and  123.7° 

31.«i=-f  33.  ±[|,-|] 

35.  (a  + b)  • (a  - b)  = | a | 2 - | b | 2 = 0,  |a|  = |b|.  A square. 

37.  0.  Why? 

39.  If  | a | = |b|  or  if  a and  b are  orthogonal. 


App.  2 Answers  to  Odd-Numbered  Problems 
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Problem  Set  9.3,  page  374 

5.  — m instead  of  m,  tendency  to  rotate  in  the  opposite  sense. 

7.  | v | = | [0,  20,  0]  x [8,  6,  0]|  = | [0,  0,  — 160]  | = 160 
9.  Zero  volume  in  Fig.  191,  which  can  happen  in  several  ways. 

11.  [0,  0,  7],  [0,  0,  -7],  -4  13.  [6,  2,  7],  [-6,  -2,  -7] 

15.  0 17.  [-32,  -58,  34],  [-42,  -63,  19] 

19.  1,  -1 

21.  [-48,  -72,  -168],  12\/248  = 189.0,  189.0 

23. 0,  0,  13 

25.  m = [—2,  —2,  0]  x [2,  3,  0]  = [0,  0,  —10],  m = 10  clockwise 
27.  [6,2,0]  x [1,2,0]  = [0,0,  10]  29.  \ | [-12,  2,  6]  | = V46 

31.  3x  + 2y  - z = 5 33.  474/6  = 79 


Problem  Set  9.4,  page  380 
1.  Hyperbolas 

3.  Parallel  straight  lines  (planes  in  space)  y = \x  + c 
5.  Circles,  centers  on  the  y-axis 

7.  Ellipses  9.  Parallel  planes 

11.  Elliptic  cylinders  13.  Paraboloids 


Problem  Set  9.5,  page  390 


1.  Circle,  center  (3,  0),  radius  2 3.  Cubic  parabola  x = 0,  z — y 

5.  Ellipse  7.  Helix 

9.  A “Lissajous  curve”  11.  r = [3  + Vl3  cos  7,  2 + VT3  sin  ?,  1] 

13.  r = [2  + t,  1 + It,  3]  15.  r = [t,  4 1 - 1,  5 1] 

17.  r = [V2  cos  t,  sin  t,  sin  7]  19.  r = [cosh  t,  (V3/2)  sinh  t,  —2] 

21.  Use  sin  (—a)  = —sin  a. 

25.  u = [ — sin  t,  0,  cos  ?].  At  P,  r'  = [—8,  0,  6].  q(vv)  = [6  — 8w,  i,  8 + 6w]. 

27.  q(w)  = [2  + w,  g - \w,  0]  29.  Vr'  • r'  = cosh  t,  l = sinh  / = 1.175 

31.  Vr'  • r*  = a,  l = 077/2  33.  Start  from  r(7)  = [?, fit)]. 

35.  v = r'  = [1,  2f,  0],  | v|  = Vl  + 4r2,  a = [0,  2,  0] 

37.  v(0)  = (w  + 1)  Ri,  a(0)  = ~co2Rj 

39.  v = [—sin  t — 2 sin  2 1,  cos  t — 2 cos  2t\,  |v|2  = 5 — 4 cos  3 1, 

, „ „ • „ . „ n 6 sin  3t 

a = [—cos  t — 4 cos  2 1,  — sin  t + 4 sin  27],  and  atan  = ^ — v. 


4 cos  3t 


v|2  = 4 + sin2  t, 


41.  v = [—sin  t,  2 cos  2 1,  —2  sin  2 1], 

a = [—cos  r,  —4  sin  2 1,  —4  cos  2 1],  and  atan  = 


\ sin  2 1 
4 + sin2  t 


- v. 


43.  1 year  = 365  • 86,400  sec,  R = 30  • 365  • 86,400/277  = 151  • 106  [km], 

| a | = cj2R  = |v| 2/R  = 5.98  ■ 10-6  [km/sec2] 

45.  R = 3960  + 80  mi  = 2.133  • 107ft,  g = \a\  = cozR  = \\\Z/R,  |v|  = VgR  = 

V6.61  • 108  = 25,700  [ft/sec]  = 17,500  [mph] 

49.  r (t)  = [7,  y(7),  0],  r'  = [1,  y , 0]  r • = 1 + y'2,  etc. 


A24 


App.  2 Answers  to  Odd-Numbered  Problems 


51  r/r  _ dr  Ids  d2 r _ d2 r //  rfc\2  r/3r  _ t/3r  //"  t/s\3  + 

ds  dt  / dt  ’ rfc2  dt2  / \dt ) ’ ds 3 dt3/\dtj 

53.  3/(1  + 9/2  + 9r4) 


Problem  Set  9.7,  page  402 

1.  [2y  -1,2 x + 2] 


5.  [4x3,  4y3] 


3.  l-y/xz,  1/x] 

7.  Use  the  chain  rule. 
9.  Apply  the  quotient  rule  to  each  component  and  collect  terms. 

11.  [y,x],  [5,-4] 

[0.16,  0.12] 


13.  [2x/(x2  + y2),  2y/(x2  + y2)], 

15.  [8x,  18y,  2z\,  [40,-18,-22] 

19.  [-1.25,0] 

23.  Points  with  y = 0,  ±77,  ±27 T, 

31.  V f=  [32x,  -2 v],  Yf(P)  = [160,  -2] 

35.  [-2x,  -2y,  1],  [-6, -8,  1] 

39.  [1,  1,  1]  • [-3/125,  0,  — 4/125]/ V3  = - 7/(125  V3) 

41.  V8/3  43./  = xyz 

45. / = Jvi  dx  + fv2  dy  + Ju3  dz 


17.  For  P on  the  x-  and  y-axes. 

21.  [0,  -e] 

25.  -V7(P)  = [0,4,  -1] 

33.  [12x,4y,  2z],  [60,20,10] 

37.  [2,  1]  • [1,  — 1]/V5  = 1/V5 


Problem  Set  9.8,  page  405 

1.  2x  + 8y  + 1 8z;  7 3.  0,  after  simplification;  solenoidal 

5.9 xVz2;  1296  7. -2ex(cosy)z 

9.  (b)  (fv i)x  + (/u2)»  + (/U3)Z  =/[(ltl)a;  + (v2)y  + 73>2]  +fxV  1 + fyV2  + fzV 3,  etc. 
11.  [Ui,  1>2,  U3]  = r'  = [x,  y,  Z ] = [y,  0,  0],  z!  = 0,  z = c3,  y'  = 0,  y = c2,  and 

x = y = c2,  x = c2t  + cy.  Hence  as  r increases  from  0 to  1,  this  “shear  flow” 

transforms  the  cube  into  a parallelepiped  of  volume  1. 

13.  div  (w  X r)  = 0 because  v \ , v2,  u3  do  not  depend  on  x,  y,  z,  respectively. 

15.  —2  cos  2x  + 2 cos  2y  17.  0 

19.  2/(x2  + y2  + z2)2 


Problem  Set  9.9,  page  408 

3.  Use  the  definitions  and  direct  calculation. 

5.  [x(z2  - y2),  y(x2  - z2),  z(y2  - x2)]  7.  e_x[cos  y,  sin  y,  0] 

9.  curl  v = [— 6z,  0,  0]  incompressible,  v = 7 = [x\  y' , z;]  = [0,  3z2,  0],  x = ci, 
z = c3,  y = 3z2  = 3c|,  y = 3 c2t  + c2 
11.  curl  v = [0,  0,  —3],  incompressible,  x = y,  y,  = — 2x,  2xx,  + yy,  = 0, 

*2  + b2  = c,  Z = c3 

13.  curl  v = 0,  irrotational,  div  v = 1,  compressible,  r = [c]eL,  c2et,  c3e_t].  Sketch  it. 
15.  [—1,  —1,  —1],  same  (why?) 

17.  — yz  - zx  ~ xy,  0 (why?),  -y  — z - x 
19.  [— 2z  — y,  — 2x  — z,  — 2y  — x],  same  (why?) 


App.  2 Answers  to  Odd-Numbered  Problems 
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Chapter  9 Review  Questions  and  Problems,  page  409 

11.-10,  1080,  1080,  65 

13.  [-10, -30, 0],  [10,30,0],  0,  40 

15.  [-1260,  -1830,  -300],  [-210,  120,  -540],  undefined 

17.-125,  125,  -125 

19.  [70,  -40,  -50],  0,  V352  + 202  + 252  = V2250 
21.  [-2,  -6,  -13] 

23.  y!  = arccos  (— 10/V65  • 40)  = 1.7682  = -101.3°,  yz  = 23.7° 

25.  [5,  2,  0]  • [4  - 1,  3 - 1,  0]  = 19  27.  v • w/|w|  = 22/V8  = 7.78 

29.  [0,  0,  — 14],  tendency  of  clockwise  rotation  31.  4 
33.  1,  -2y 

35.  0,  same  (why?),  2(y2  + x2  — xz ) 

37.  [0,-2,  0]  39.  9/V225  =| 


Problem  Set  10.1,  page  418 
3.4 

5.  r = [2  cos  t,  2 sin  t],  0 S / § 77/2;  § 

7.  “Exponential  helix,”  (efi7r  - l)/3  9.  23.5,  0 

11.  2e~l  + 2 te~e,  -2e~2  - e-4  + 3 15.  18tt,  f (4t7)3,  1877 

17.  [4  cost,  + sin  t,  sin  t,  4 cos  t],  [2,  2,  0]  19.  144t4,  1843.2 


Problem  Set  10.2,  page  425 

3.  sin  gx  cos  2y,  1 — 1/V2  = 0.293  5.  exy  sin  z,  e — 0 

7.  cosh  1 - 2 = -0.457 

9.  ex  cosh  y + ez  sinh  y,  e — (cosh  1 + sinh  1)  = 0 

13.  ea  cos  2b  15.  Dependent,  x2  # — 4y2,  etc. 

17.  Dependent,  4 A 0,  etc.  19.  sin  (a2  + 2 b2  + c2) 


Problem  Set  10.3,  page  432 

3.  8y3/3,  54 


5. 


1 

[x  — X3  — (x2  — x5)]  dx 


Jo 


7.  cosh  2x  — cosh  x,  | sinh  4 — sinh  2 9.  36  + 27y2,  144 

11.  z = 1 — r2,  dx  dy  = rdrd9.  Answer : 7t/2 
13.  x = 26/3,  y = h/3  15. 3c  = 0,  y = 4r/37r 

17.  Ix  = bh3/\2,  Iy  = b3h/4 
19.  Ix  = (a  + b)h3/24,  Iy  = h{a 4 - h4)/(48(a  - b)) 


I 

12 


Problem  Set  10.4,  page  438 

1.  (-1  - 1)  • 77/4  = -77/2  3.  9(e2  - 1)  - |(e3  - 1) 

5.  2x  — 2y,  2x(l  - x2)  - (2  - x2)2  + 1,  x=-l-l,  -ff 
7.  0.  Why?  9. 

13.  V2w  = cosh  x,  y = x/2  • ■ ■ 2,  \ cosh  4 — \ 


A26 


App.  2 Answers  to  Odd-Numbered  Problems 


15.  V2vr  = 6xy,  3jc(10  - x2)2  - 3x,  486  17.  V2n>  = 6x  - 6y,  - 38.4 

19.  | grad  w \ 2 = e2x,  f(e4  - 1) 

Problem  Set  10.5,  page  442 
1.  Straight  lines,  k 

3.  z = c\/ x2  + y2,  circles,  straight  lines,  [— cu  cos  v,  —cu  sin  v,  u] 

5.  z = x2  + y2,  circles,  parabolas,  [—2m2  cos  v,  —2m2  sin  v,  u] 

7.  x2/a2  + y2/b2  + zZ/c2  = 1,  [be  cos2  v cos  u,  ac  cos2  v sin  n,  ab  sin  v cos  u], 
ellipses 

11.  [m,  v,  u2,  + v2],  N = [-2m,  -2v,  1] 

13.  Set  x = u and  y = v. 

15.  [2  + 5 cos  m,  — 1 + 5 sin  u,  u],  [5  cos  n,  5 sin  n,  0] 

17.  [a  cos  v cos  u,  —2.8  + a cos  v sin  n,  3.2  + a sin  u],  a = 1.5; 

p p p p p 

[a  cos  v cos  m,  a cos  v sin  n,  a cos  v sin  u] 

19.  [cosh  m,  sinh  n,  u],  [cosh  n,  — sinh  n,  0] 


0]  • [-3,  2,  1]  = 3 m2  + 2u2,  29.5 


Problem  Set  10.6,  page  450 

1.  F(r)  • N = [-m2,  v2, 

3.  F(r)  • N = cos3  v cos  n sin  n from  (3),  Sec.  10.5.  Answer:  | 

5.  F(r)  • N = -m3,  -12877 

7.  F • N = [0,  sin  u,  cos  v]  • [1,  -2m,  0],  4 + (-2  + 772/16  - 7r/2)V2  = -0.1775 
9.  r = [2  cos  u,  2 sin  n,  i>],  0^m§  77/4,  0 i r g 5.  Integrate  2 sinh  v sin  it  to 

get  2(1  - 1/ V2)(cosh  5 - 1)  = 42.885. 

13.  7t73/V6  = 88.6 

15.  G(r)  = (1  + 9m4)3/2,  |N|  = (1  + 9m4)1/2.  Answer : 54.4 


21.  L 


[h  (x  ~ y )2  + z2]  v dA 


23.  [m  cos  v,  it  sin  v,  m], 


2tt 


rh 


‘'O  *^0 


m2  • m V2  du  dv  = — — /i4 

V2 


25.  [cos  m cos  u,  cos  m sin  u,  sin  m],  dA  = (cos  u)  du  dv,  B the  z-axis,  /g  = 87t/3, 
Ik  = h + l2  • 477  = 20.9. 


Problem  Set  10.7,  page  457 
1.  224 

3.  — e-1-z  + e-y~\  — 2e_1_z  + e“z, 
5.  |(sin  2x)  (1  — cos  2x),  g,  | 

7.  [r  cos  m cos  u,  cos  u sin  v,  rsinn], 
9.  div  F = 2x  + 2 z,  48 
13.  div  F = — sin  z,  0 
17.  /i477/2 

21.  (a4/4)  • 277  • h = ha4Tr/2 
25.  Do  Prob.  20  as  the  last  one. 


2e-3  - e~2  - 2e-1  + 1 

2 2 3/ 

dV  = r cos  u dr  du  dv,  cr  = v,  277  a /3 
11.  12(e  - 1/e)  = 24  sinh  1 
15.  1/77  + ^ = 0.5266 
19.  8 abc(b2  + c2)/ 3 
23.  /7577/10 


App.  2 Answers  to  Odd-Numbered  Problems 
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Problem  Set  10.8,  page  462 

1.  x = 0,  y = 0,  z = 0,  no  contributions.  x = a:  df/dn  = df/dx  = —2x  = —2a,  etc. 

Integrals  x = a:  (—2 a)bc,  y = b:  (—2 b)ac,  z = c:  (4c)  ab.  Sum  0 

3.  The  volume  integral  of  By2  + [0,  8v]  • [2x,  0]  = 8v2  is  8v:i/3  = |.  The  surface 
integral  of  fdg/dn  = f • 2x  = 2/  = 8y2  over  x = 1 is  8y3/3  = |.  Others  0. 

5.  The  volume  integral  of  6 y2  • 4 — 2x2  • 12  is  0;  8(jc  = 1),  — 8(v  = 1),  others  0. 

7.  F = [x,  0,  0],  divF  = 1,  use  (2*),  Sec.  10.7,  etc. 

9.  z = 0 and  z = \/a2  — x2  — y2  = \/ a2  — r2,  dx  dy  = r dr  cl6, 

1/  2 2x3/2  2 \a  2 3 

277  • 2\a  r)  * 3 1 0 3 77(7 

11.  r = a,  cj)  = 0,  cos  <f>  = 1,  v = • (47ra2) 


Problem  Set  10.9,  page  468 

U:z  = y(0grgl,0§yS4),  [0,  2z,  ~2z\  • [0, -1,  1],  ±20 

3.  [2e~z  cos  y,  ~e~z,  0]  • [0,  —y,  1 \=ye~z,  ±(2  - 2/Ve) 

5.  [0,  2z,i]-[0,  0,1]  = |,  ±la2 

7.  [-ez,  —ex,  - ev ] • [~2x,  0,  1],  ±(e4  - 2e  + 1) 

9.  The  sides  contribute  a,  3az/2,  —a,  0. 

11.  -2tt;  curl  F = 0 13.  5k,  80tt 

15.  [0,  -l,2x  - 2 y]  • [0,  0,  1],| 

17.  r = [cos  u,  sin  u,  v],  [— 3u2,  0,  0]  • [cos  u,  sin  n,  0],  — 1 
19.  r = [u  cos  v,  u sin  v,  u],  0§«§  1,  Oitg  77/  2, 

[— ez,  1,  0]  • [— h cos  u,  — Msinu,  it].  Answer  1/2 


Chapter  10  Review  Questions  and  Problems,  page  469 


11.  r = [4  - 10/,  2 + 8/],  F(r)  • dr  = [2(4  - 
—4528/3.  Or  using  exactness. 

13.  Not  exact,  curl  F = (5  cosx)k,  ±10 
17.  By  Stokes,  ±1877 
21.  M = 8,  x = |,  y 


16 

5 


23.  M = fj,  x = ®=  1.14,  y 


118 

49 


= 2.41 


_5_ 

16’ 


y 


25.M  = 4k/l5,  x = 

29.  div  F = 20  + 6z2.  Answer.  21 
33.  Direct  integration,  ^ 


10/)  , -4(2 / + 8/)2]  • [-10,  8]  df. 


15.  0 since  curl  F = 0 
19.  F = grad  (y2  + xz),  277 


27.  288(a  + b + c)tt 
31.  24  sinh  1 = 28.205 
35.  7277 


Problem  Set  11.1,  page  482 

1.  277,  277,  77,  77,  1,  1,  \ 


5.  There  is  no  smallest  p > 0. 


13.  — (cos  x + — cos  3x  H — — cos  5x  + ■■■)  + 2 (sin  x + — sin  3x  4-  — sin  5x  + 

77  9 25  3 5 

15.  §772  + 4 (cos  x + | cos  2x  + e;  cos  3x  + ■ ■ ■ ) — 477  (sin  x + \ sin  2*  + 

1 sin  3x  + • • • ) 

77  4 / 1 1 

17. 1 cos  x 3 — cos  3x  H cos  5x  + ■ ■ 

2 77  V 9 25 


A28 


App.  2 Answers  to  Odd-Numbered  Problems 


19. 


77 


77 


CoS*  + icos3*  + Acos5*  + 


+ sin  x sin  2x  + 

2 


— sin  3x  — + • • 
3 


21.  2 (sin  x + \ sin  2x  + | sin  3x  + \ sin  4x  + § sin  5x  + • • • ) 


Problem  Set  11.2,  page  490 


1.  Neither,  even,  odd,  odd,  neither 
9.  Odd,  L = 2,  4 


3.  Even 


5.  Even 


77 

1 


. 77X  1 . 377X  1 . 577X 

sin 1 — sin 1 — sin b ■ • 

2 3 2 5 2 


77 


1 


11.  Even,  L = 1, 2 1 cos  7TX cos  277x  H — cos  377x  — + 


13.  Rectifier,  L = 


1 

2’  8 


1 


77 


9 COS  277X  -I COS  677X  H COS  1077X  + • 

2'  9 25 


77 


— sin  2ttx  — — sin  477X  + — sin  677x  — — sin  877x  -I 

2 4 6 8 


1 


15.  Odd,  L = 77,  — I sin  x sin  3x  H sin  5x  — + 


25 


77 


1 


17.  Even,  L = 1,  — H 2 ( cos  7Tx  + A cos  377x  + — cos  577x  + 


25 


19. 1 +5  cos  2x  + A cos  4x 


23.  L = 4,  (a)  1,  (b) 


77 


77  4 

25.  L = 77,  (a) 1 

2 77 


. 77X  1 . 377X  1 . 57TX 

sin 1 — sin 1 — sin h 

4 3 4 5 4 

cos  x + — cos  3x  + — - cos  5x  + ■ 

9 25 


(b)  2 (sin  x + \ sin  2x  + | sin  3x  + \ sin  4x  + • • • ) 

27.  L = 77,  (a)  + — ( cos  x — — cos  2x  + — cos  3x  H — cos  5x  — 


77 


' cos  6x  + cos  lx  + cos  9x 
18  49  81 


9 


25 


— cos  lOx  H — cos  llx  + 

50  121 


( , 2 ' 

\ . 1 . „ | 

(\ 

2 N 

\ . „ 1 

1 + — 

sin  x -1 — sin  2x  + 

sin  3x  H — 

V Tty 

1 2 ' 

V3 

977  j 

1 4 

1 2 
5 + 2577 


sin  5x  + — sin  6x  + 
6 


29.  Rectifier,  L = 77, 

(a)  — — f T~7 

77  77  \ 1 • 3 


cos  x + 


3 • 5 


■ cos  3x  + 


5 • 7 


■ cos  5x  + 


(b)  sinx 


Problem  Set  11.3,  page  494 

3.  The  output  becomes  a pure  cosine  series. 

5.  For  An  this  is  similar  to  Fig.  54  in  Sec.  2.8,  whereas  for  the  phase  shift  Bn 
the  sense  is  the  same  for  all  n. 


App.  2 Answers  to  Odd-Numbered  Problems 
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7.  y = C i cos  cot  + C2  sin  cot  + a(co ) sin  t,  a(co)  = l/(&>2  — 1)  = — 1.33, 
—5.26,  4.76,  0.8,  0.01.  Note  the  change  of  sign. 


11.  y = C 1 cos  cot  + C2  sin  cot  + 


77  \co' 


1 1 . „ 

sin  t 0 sin  3 1 + 

-9  ai  - 49 


1 


■ sin  5?  + 


o>  - 121 

N 

13.  y = 2 (An  cos  nt  + Bn  sin  ”/)>  An  = [(1  — n2)an  — nbrlc]/Dri 

n= 1 

Bn=  [(1  - n2)bn  + ncan\/Dn,  Dn  = (1  - n2)2  + n2 c2 


15.  bn  = (— l)m+1  • 12 /n3  (n  odd),  y = 2( Ancosnt  + Bnsinnt), 


n=  1 

An  = (—  l)n  ■ 12 nc/n3Dn,  Bn  = (—  l)n+1  • 12(1  — n2)/(n3Dn)  with  Dn  as  in 
Prob.  13. 

17.  / = 50  + Ai  cos  t + B\  sin  t + A3  cos  3t  + B3  sin  3f  + • • • , An  = (10  — ri2)an/Dn 
Bn  = 1 Qrian/Dn,  an  = -400/(n27r),  Dn  = ( n 2 - 10)2  + 100n2 


19. 1(t)  = ^ (An  cos  nt  + Bn  sin  nt),  An  = (—  l)?l 

n=  1 


j 2400(10  - n2) 


2n 
n D y, 


J.1  24,000 

fln  = (-l)  , Dn  = (10  - «2)2  + 100/72 


III),, 


Section  11.4,  page  498 


_ „ 77  4 

3.  F = — - 

2 77 


cos  x + ^ cos  3x  + ^ cos  5x  + 


, E*  = 0.0748, 


0.0748,  0.01 19,  0.01 19,  0.0037 

5.  F = — ( sin  x + — sin  3x  + — sin  5x  + 
77 


,E*  = 1.1902,  1.1902,0.6243,  0.6243, 


3 5 

0.4206  (0. 1 272  when  N = 20) 

7.  F = 2 [(772  — 6)  sin  x — g (4772  — 6)  sin  2x  + 27  (9772  — 6)  sin  3x  — +••■]; 
E*  = 674.8,  454.7,  336.4,  265.6,  219.0.  Why  is  E*  so  large? 


Section  11.5,  page  503 

3.  Set  x = ct  + k.  5.  x = cos  6,  dx  = —sin  6 dO , etc. 

7.  Am  = (77777/ 10)2,  m = 1,  2,  ■ • ■ ; ym  = sin  (t?777x/10) 

9.  A = [(2/77  + 1)77/(2L)]2, 777  = 0,  1,  • • • ,ym  = sin  ((2m  + \)ttx/(2L)) 
11.  Am  = 772 2, 777  = 1,  2,  • • ■ , ym  = x sin  (m  In  |x|) 

13.  p = e8x,  q = 0,  r = e8x,  Am  = m2,  ym  = e_4x  sin  mx,  m = 1,  2,  • • • 


Section  11.6,  page  509 

1.  8(Pi(x)  - P3(x)  + P5(x)) 

3.  tP0(x)  - fp2(x)  - iP4(-t) 

9.  -0.4775Pi(x)  - 0.6908P3(x)  + 1.844P5(x)  - 0.8236 Pw(x)  + 0.1658P9(x)  + 
772 0 = 9.  Rounding  seems  to  have  considerable  influence  in  Probs.  8-13. 
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11.  0.7854Po(x)  - 0. 3540/2 (x)  + 0.0830P4(x) ,mQ  = 4 

13.  0.1212PoW  - 0.7955P2W  + 0.9600P4W  - 0.3360P6(x)  + • ■ ■ , m0  = 8 

15.  (c) 

Q-m  (2/  J i(a0  ,m  ))  a0,m)  2/ {<Xq  mJ i(cro,m)) 


Section  11.7,  page  517 
l./(x)  = ve~x(x  > 0)  gives  A = 


e v cos  wv  dv  = 


1 


1 + w2 


-,B  = 


1 + w' 


(see  Example  3),  etc. 


3.  Use  (11);  B = 


v J 


0 


77  . 1 — COS  77W 

— sin  wv  dv  = 

2 w 


5.  B(w)  = — 
77  J 


1 


1 . sin  w — w cos  w 

— 77i)  sin  vvi)  dv  = o 

2 w 


7. 


77 


sin  w cos  xw 
w 


dw 


9.  A (w)  = — 

77 


cos  wv 


dv  = e w (w  > 0) 


11. 


TT 


COS  77W  + 1 

1 - w2 


cos  xw  dw 


15.  For  n = 1,  2,  11,  12,  31,  32,  49,  50  the  value  of  Si(n77)  - 77/2  equals  0.28,  -0.15, 
0.029,  -0.026,  0.0103,  -0.0099,  0.0065,  -0.0064  (rounded). 


17.— 

77 


1 — cos  w . 
sin  xw  dw 


Jo 


19. 


77 


°w  — e(w  cos  w — sin  w) 
1 + w2 


sin  xw  dw 


Section  11.8,  page  522 

1 -fc(yv)  = V(2/77)  (2  sin  w — sin  2w)/w 

3. /c(w)  = V(2/77)  (cos  2w  + 2w  sin  2w  — l)/w2 

- / 2 (w2  — 2)  sin  w + 2vr  cos  w 

5./cM  = J- 3 

V 77  W 

7.  Yes.  No 

11.  V2/77  ((2  — w2)  cos  w + 2w  sin  w — 2 )/w3 


13.9?s(e-x)  = 


9.  V2/77  w/(a2  + w2) 


_L 

w2  + 1 


f2~  w 
V 77  tv2  + 1 


Problem  Set  11.9,  page  533 

3.  i(e~lbw  — e~mw)/(wV2TT)  if  a < b',0  otherwise 
5_  [ea-iw)a  _ e-(l-^)a]/(V277(1  - w)) 

7.  (e-Mm,(l  + taw)  — 1)/(V277w2)  9.  V2/7r(cos  w + w sin  w — 1 )/w2 

11.  1 V2/77  (cos  w — l)/w  13.  e_w’  by  formula  9 
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17.  No,  the  assumptions  in  Theorem  3 are  not  satisfied. 


21. 


./1  + h + h + A.  A - ' 

1 1 

A 

A +/2 

1 -1 

A. 

1 

> 

1 

Al 


Chapter  11  Review  Questions  and  Problems,  page  537 


...  . 4 ( . ttx  1 . 3ttx  1 . 57 TX 

11.  1 H sin 1 — sin I — sin h 

77  V 2 3 2 5 2 


1 


77 


1 


1 


13. o COS  TTX  + cos  3ttx  3 cos  5ttx  + ■ 


25 


sin  77x  — — sin  2ttx  + — sin  377x  — + ■ 
77  \ 2 3 

15.  coshx,  sinh  x (—5  < x < 5),  respectively 

1 4 / 1 

19.  — 9 I COS  77X  + — COS  377X  + • 


17.  Cf.  Sec.  11.1. 


77 


— | sin  ttx 

77 


21.  y = C i cos  cot  + C2  sin  cot  + 


77 


12 


cos  t 

2 1 

CO  — 1 


1 

4 


— sin  2ttx  + 
2 

cos  2 1 1 

9 


2 A 

co  — 4 


cos  3 1 

2 n 

co  — 9 


H 


1 cos  At 
16  co2  - 16 
23.  0.82,  0.50,  0.36,  0.28,  0.23 
25.  0.0076,  0.0076,  0.0012,  0.0012,  0.0004 


27.— 

77 


(cos  w + w sin  w — l)cos  wx  + (sin  w — w cos  w)sin  wx 


dw 


29.  V2/77  (cos  ciw  — cos  w + aw  sin  aw  — w sin  w)/w2 


Problem  Set  12.1,  page  542 

1.  L(C\U\  + C2«2 ) = C\L(u\)  + C2.L(K2)  = Cl  • 0 + C2  • 0 = 0 
3.  c = 2 5.  c = a/b 

7.  Any  c and  co  9.  c = 77/25 

15.  u = 110  — (110/ln  100)  In  (x2  + y2)  17.  u = a(y ) cos  477jc  + b(y)  sin477x 

19.  u = c(x)  e~u'/?‘ 

21.  u = e~3y(a(x ) cos  2y  + b(x ) sin  2y)  + 0.1e3y 

23.  u = ci( y)x  + C2(y)/x2  (Euler-Cauchy) 

25.  u(x,  y)  = axy  + bx  + cy  + k\  a , b , c,  k arbitrary  constants 


Problem  Set  12.3,  page  551 


5.  k cos  3777  sin  377x 


8 k 


77 

08 

772 


1 


7.  — ^ COS  777  sin  77X  H COS  3777  sin  377*  + 


27 


125 


■ cos  5777  sin  5ttx  + 


cos  77 ? sin  ttx  — ^ cos  3777  sin  377.x  + ^ cos  5777  sin  577x  — + 
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2 ( 1 
11.  — 2 ( (2  — V2)  cos  777  sin  ttx  — — (2  + V2)  cos  3777  sin  377x 


+ ^ (2  + V2)  cos  577f  sin  577x  — + ■ 


13.  — 3 I (4  — 77)  cos  7Tt  sin  77x  + cos  27 77  sin  2vx  + 

7 7 ' 


4 + 377 
27 


cos  3777  sin  3ttx 


4 — 577  _ . 

H : — cos  5777  sin  577*  + 


17.  u = 


125 
8 L2 


77 


COS 


2 n 
7 


sin  - 


TTX 


. No  terms  with  n = 4,  8,  12, 

2 i 


1 


: COS 


377 

L 


. 3irx 

sin h 

L 


19.  (a)  m( 0, 7)  = 0,  (b)  u(L,  t)  = 0,  (c)  ux( 0,  ?)  = 0,  (d)  ux(L,  t)  = 0.  C = —A,  D = —B 
from  (a),  (c).  Insert  this.  The  coefficient  determinant  resulting  from  (b),  (d)  must  be 
zero  to  have  a nontrivial  solution.  This  gives  (22). 


Problem  Set  12.4,  page  556 

3.  c2  = 300/[0.9/(2  • 9.80)]  = 80.832  [m2/sec2] 

9.  Elliptic,  u = fi(y  + 2 ix)  + f2(y  — 2 ix) 

11.  Parabolic,  u = x/1(x  — v)  + f2(x  — y) 

13.  Hyperbolic,  u = f\( y — 4x)  + f2(y  — x) 

15.  Hyperbolic,  xy'2  + yy'  = 0,  y = v,  xy  = w,  uw  = z,  u = ,/i (xy)  + f^iy) 

17.  Elliptic,  u = fi(y  — (2  — i)x)  + f2(y  — (2  + i)x).  Real  or  imaginary  parts  of  any 
function  u of  this  form  are  solutions.  Why? 


Problem  Set  12.6,  page  566 

3.  Mi  = sin  x e-t,  u2  = sin  2x  e-4t,  M3  = sin  3x  e~9t  differ  in  rapidity  of  decay. 
5.«  = sin0.l77xe-1-752^t/100 


800 


77 


7.  u = — ^ ( sin  0.1 77X  g-0.01752^  + 4 sin  037TX  e 


-0.01752(377)  t 


+ 


9 . u = uj  + Uu,  where  Mn  = u — u\  satisfies  the  boundary  conditions  of  the  text, 

■ L 


that  Mn  = ^Bn  sii 


nTfx  _ 
e 


(cmr/Lrt 


n=  1 


>Bn~  T 


flTTX 

[f(x)  — Mj(x)]  sin— — dx. 


11.  F = A cos  px  + B sin  px,  F'{ 0)  = Bp  = 0,  B = 0,  F'{L)  = —Ap  sin  pL  = 0, 
p = mr/L,  etc. 

13.  m = 1 


15-  2 + t4 


+ 1 „ Q f 1 _ 9^7 

cos  x e + — cos  3x  e + — cos  5x  e + ■ ■ 


9 


25 


17.  nBne-^ 

n = 1 


19.  m = 1000  (sin  \rnx  sinh  \ 77y)/sinh  77 


71  100  V 

21.  M = 2j 

n=  1 


1 


(2 n — 1)  sinh  (2 n — 1)77 


. (2n  — l)77x  . , (2m  — l)77y 

- sin — sinh — 


77 


24 


24 
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~ sinh  (mtx/24)  mry 

23.  u = Aqx  + Zj  An — cos 

sinh  nrr 

n=  1 


24  ’ 


An  — 


242 


,24 


f(y)  dy,  An  = 


12 


,24 


,,  , mry 

f(y)  cos cfy 

24 


^ x ■ n7TX  ■ 1 n7Tib  - y)  2 

25.  sin sinh ,An  = — r-r— ■ 

a a a sinh  (nirb/a) 

n—  1 


a 

f(x)  sin 
■t) 


nJTx 

ax 

a 


Problem  Set  12.7,  page  574 


3.  A = — 

77  J 


COS  pv  . 2 7T 

ci  v = — • — e , u = 


1 + V * 


77  2 


e v c v f cos  pX  dp 


r 1 


5.  A = — 

77 


. 2 cos  p + p sin  p — 1 

v cos  pv  dv  = — • 2 , etc. 

77  p 


7.  A 


2_ 

77 


sin  t; 


cos  pv  dv 


2 77 

77  2 


1 if  0 < p < 1 and  0 if  p > 1 , 


u 


1 

_ 2 2x 

cos  /?x  e c p dp 


Jo 

9.  Set  w = — u in  (21)  to  get  erf  (— x)  = —erf  x. 

13.  In  (12)  the  argument  x + 2cz\ft  is  0 (the  point  where /jumps)  when  z 
This  gives  the  lower  limit  of  integration. 

15.  Set  w = s/y/2  in  (21). 


— x/(2cVf). 


Problem  Set  12.9,  page  584 


1.  (a),  (b)  It  is  multiplied  by  V2.  (c)  Half 

5.  Bmn  = (—  l)w  + 18/(mn772)  if  ???  odd,  0 if  m even 

7.  Bmn  = (—  1 )m+n4ab/(mmr2) 

11.  u = 0. 1 cos  V20t  sin  2x  sin  4y 


GO  OO 


13.  — ^ ^ 3 cos  2 + , 2)  sin  mx  sin  ny 


Q Q 

77“  , ; m n 

m= 1 n= 1 


m,n  odd 

17.  c77  V260  (corresponding  eigenfunctions  F4jig  and  /’  i e,  1 4),  etc. 


19.  cos  777 


36 


2 ,2 

fl  o 


4 \ . 677x  . 477y 


sin sin 

a b 


Problem  Set  12.10,  page  591 

5.  110  + ( r cos  0 — — r3  cos  3 6 + — r5  cos  56 (-•■•) 

77  3 5 

440  1 o l _ 

7.  5577  ( r cos  6 H — r cos  30  H r cos  50  + • ■ ■ ) 

77  9 25 


A34 
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11.  Solve  the  problem  in  the  disk  r < a subject  to  u0  (given)  on  the  upper  semicircle 
and  — m0  on  the  lower  semicircle. 


Problem  Set  12.11,  page  598 

5.  A4  = A6  = A8  = A10  = 0,  A5  = 605/16,  A7  = -4125/128,  A9  = 7315/256 
9.  V2//  = u"  + 2u  /r  = 0,  u"/u'  = —2 /r.  In  \u  = —2  In  |r|  + c4, 


13.  u = 320/ r + 60  is  smaller  than  the  potential  in  Prob.  12  for  2 < r < 4. 

17.  u = 1 

19.  cos  2(f>  = 2 cos2  <f>  — 1,  2vv2  — 1 = § P2(w)  — u = §/-2P2(cos  (f>)  — g 
25.  Set  1 /r  = p.  Then  u(p,  9,  cf>)  = w(r,  9,  <f>),  up  = (v  + rvr)(—  1/p2), 

upp  = (2vr  + wrr)(l/pA)  + (v  + rvr)(2/p3),  upp  + (2 /p)up  = r5(urr  + (2/r)vr). 
Substitute  this  and  u ^ = ru ^ etc.  into  (7)  [written  in  terms  of  p]  and  divide  by  r5. 

Problem  Set  12.12,  page  602 

5.  IT  = — + , W( 0,  s)  = 0,  c(s)  = 0,  w(x,  t)=x(t-  1 + e_t) 

xs  ^2(i  + 1) 

7.  w = f(x)g(t),  xf'g  + fg  = xt,  tak e/(x)  = x to  get  g = ce-t  + t — 1 and  c = 1 from 
w(x,  0)  = ^(c  — 1)  = 0. 

11.  Set  jc2/ (4 c2t)  = z2.  Use  z as  a new  variable  of  integration.  Use  erf(°°)  = 1. 

Chapter  12  Review  Questions  and  Problems,  page  603 

17.  u = Ci(x)e~3v  + c2(x)e2y  — 3 19.  Hyperbolic,  fc(x)  +/2(y  + x) 

21.  Hyperbolic,  f4(y  + 2x)  + /2(y  — 2x)  23.  | cos  2t  sin  x — \ cos  6t  sin  3x 


13.  Increase  by  a factor  V2 

17.  No 


15.  T = 6.826 pR2f\ 

25.  an/(277)  = 0.6098;  See  Table  A1  in  App.  5. 


u'  = c/rz,  u = c/r  + k 


25.  sin  0.0177X  e 


— 0.001143t 


29.  100  cos  2x  e_4t 

39.  u = («!  - M0)(ln  r)/ln  (r4/r0)  + (w0ln  r4  - ux  In  r0)/ln  (r4/r0) 


Problem  Set  13.1,  page  612 


1.  1/f  = i/i2  = —i,  l/i3  = if  ;4  = i 


3.  4.8  - 1.4* 

9.  -117,  4 
13.  -120  - 40/ 


5.  x — iy  = — (x  + iy),  x = 0 

11.  -8  - 6 i 
15.  3 - i 

19.  (x2  - y2)/ (x2  + y2),  2xy/(x2  + y2) 


17.  -4x2y 


.2,2 


Problem  Set  13.2,  page  618 

1.  V2  (cos  \ tt  + i sin  | tt) 

3.  2(cos277  + /sin277),  2(005  2 77  — / sin  2 77) 


App.  2 Answers  to  Odd-Numbered  Problems 
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5.  2(cos  + i sin  77) 
9.  37r/4 

13.  — 1 024.  Answer:  77 
17.  2 + 2 / 

23.  6,  -3  ± 3 V3Z 


7.  \/ 1 + ^772  (cos  arctan  \ 77  + / sin  arctan  \ it) 
11.  ± arctan  (| ) = ±0.9273 

15.  -3 / 

21.  ^2  (cos  ^77  + /sin  ^77),  k = 1,9,  17 


25.  cos  (^ 77  + \kjr)  + / sin  (^77  + \kTT),  k = 0,  1,  2,  3 

27.  cos  5 77  ± / sin  § 77,  cos  §77  ± / sin  1 77,  — 1 

29.  i,  -1  - Z 31.  ±(1  - /),  ±(2  + 2i) 

33.  |zi  + 72 12  = (zi  + 72)(zi  + Z2)  = (zi  + z2)(zi  + z2).  Multiply  out  and  use 
Re  7i72  = IZ1Z2I  (Prob.  34). 

Z1Z1  + ZiZ2  + Z2Z1  + Z2Z2  = |zil2  + 2 Re  Z1Z2  + \z2\2  ^ Izil2 
+ 2|z1||z2I  + |z2|2  = (Izil  + lz2l)2-  Hence  | zi  + Zi|2  ^ (Izil  + IZ2I)2-  Taking 
square  roots  gives  (6). 

35.  [(X!  + x2)2  + (yi  + V2)2]  + [(*1  - ^2)2  + (n  - .V2)2]  = 2(x?  + y2  + x2  + yl) 


Problem  Set  13.3,  page  624 

1.  Closed  disk,  center  — 1 + 5/,  radius  § 

3.  Annulus  (circular  ring),  center  4 — 2/,  radii  77  and  377 

5.  Domain  between  the  bisecting  straight  lines  of  the  first  quadrant  and  the  fourth 

quadrant. 

7.  Half-plane  extending  from  the  vertical  straight  line  x = — 1 to  the  right. 

11.  n(x,  y)  = (1  — x)/((l  — x)2  + y2),  u(  1,  — 1)  = 0, 
v(x,y)  = y((l  - xf  + y2),  f (1,  — 1)  = — 1 
15.  Yes,  since  Im(|z|2/z)  = Im(|z|2z/(zz))  = Im  7 = ~r  sin  0 — > 0. 

17.  Yes,  because  Re  z = r cos  0 — * 0 and  1 — |z|  — > 1 as  r — > 0. 

19.  f\z)  = 8(7  - 4i)7.  Now  7 - 4/  = 3,  hence/' (3  + 4/)  = 8 • 37  = 17,496. 

21.  n(l  - 7)  “n_1Z,  ni  23.  3iz2/(z  + if,  -3//16 


Problem  Set  13.4,  page  629 

1.  rx  = x/r  = cos  0,  ry  = sin  9,  9X  = —(sin  6)/r,  9y  = (cos  9)/r 

(a)  0 = ux  — vy  = ur  cos  6 + ug{—  sin  9)/r  — vr  sin  9 — vg(cos  9)/r 

(b)  0 = uy  + vx  = ur  sin  9 + ug( cos  9)/r  + vrcos  9 + ug(— sin  9)/r 


Multiply  (a)  by  cos  9,  (b)  by  sin 
3.  Yes 

7.  Yes,  when  7 ¥=  0.  Use  (7). 

11.  Yes 

15.  f(z)  = 1/z  + c ( c real) 

19.  No 

23.  a = 0,  v = 2b(yz  — x2)  + c 
29.  Use  (4),  (5),  and  (1). 

Problem  Set  13.5,  page  632 

3.  e2™e~2rr  = e-2lT  = 0.001867 
7.  eVzi  = 4.113/ 

11.  6.3 e™ 


>,  and  add.  Etc.  _ 

5.  No,  /(z)  = (z2) 

9.  Yes,  when  7 # 0,  —277/,  277/ 

13.  f(z)  = —2i(z2  + c),  c real 
17.  f(z)  = z2  + z + c (c  real) 

21.  a = 77,  v = e"x  sin  77 y 
27./=  u + iv  implies  if  = —v  + iu. 


5.  e2(—  1)  = -7.389 

g 5^2  arctan  (3/4)  _ ^0.644i 

13.  Vie™1* 
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15.  exp  (x2  — v2)  cos  2 xy,  exp  ( x 2 — y2 ) sin  2 xy 
17.  Re  (exp  (z3))  = exp  (x3  — 3xv2)  cos  (3x2y  — v3) 

19.  z = 2mri,  n = 0,  1, ■ • ■ 

Problem  Set  13.6,  page  636 

1.  Use  (11),  then  (5)  for  elv,  and  simplify.  7.  cosh  1 = 1.543,  / sinh  1 = 1.175/ 

9.  Both  -0.642  - 1.069/.  Why?  11.  i sinh  7 t = 11.55/,  both 

15.  Insert  the  definitions  on  the  left,  multiply  out,  and  simplify. 

17.  z = ±(2 n + l)i/2  19.  z = ±mri 

Problem  Set  13.7,  page  640 

5.  In  11  + 7 Ti  7.  gin  32  - 7T//4  = 1.733  - 0.785/ 

9.  Z arctan  (0.8/0. 6)  = 0.927/  11.  In  e + 77//2  = 1 + 7T//2 

13.  ±272777,  22  = 0,  1,  • • • 

15.  In  |e*|  + / arctan  S'n  ± 2/277/  = 0 + / + 2mri,  n = 0,  1,  • ■ • 
cos  1 

17.  In  (/2)  = In  (—1)  = (1  ± 2/2)777,  2 In  / = (1  ± 4/2)77/,  n = 0,  1,  • ■ • 

19.  e4-3i  = e4  (cos  3 - / sin  3)  = -54.05  - 7.70/ 

21.  e°-6e0Ai  = e0  6 (cos  0.4  + / sin  0.4)  = 1.678  + 0.710/ 

2^  ^(1— z)  Ln  (l+i)  __  ^lnV2  + 7rz/4— i lnV2  + 7t/4  __  ^ 3Q79  -j-  | 3179/ 

25.  e(3_ixln3+7ri)  = 27e7r(cos  (3tt  - In  3)  + / sin  (3tt  - In  3))  = -284.2  + 556.4/ 
27.  e<2— *>Ln(-D  = = 23.14 


Chapter  13  Review  Questions  and  Problems,  page  641 

1.  2 - 3/  3.  27.46<?°  "29\ 

11.-5  + 12/  13.  0.16  — 0.12/ 

15.  i 

19.  15e-17i/2 
23.  (±1  ± /)/V2 
27.  f(z)  = e-2z 

31.  cos  3 cosh  1 + / sin  3 sinh  1 = —1.528 
33.  i tanh  1 = 0.7616/ 

35.  cosh  77  cos  77  + / sinh  77  sin  77  = — 1 1.592 


7.616e 


1.9762 


17.  4V2e-3lri/4 
21.  ±3,  ±3/ 

25. /(z)  = — /z2/  2 
2 9./(z)  = e“z2/2 
+ 0.166/ 


Problem  Set  14.1,  page  651 

1.  Straight  segment  from  (2,  1)  to  (5,  2.5). 

3.  Parabola  y = x2  from  (1,  2)  to  (2,  8). 

5.  Circle  through  (0,  0),  center  (3,  —1),  radius  VT6,  oriented  clockwise. 
7.  Semicircle,  center  2,  radius  4. 

9.  Cubic  parabola  v = x3  ( 2 ± x ± 2) 

11.  z(f)  = / + (2  + f)/  (-1  g f g 1) 

13.  z(f)  = 2 - / + 2ert  (0  g / g 77) 


App.  2 Answers  to  Odd-Numbered  Problems 
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15.  z{t)  = 2 cosh  t + / sinh  t ( — °°  < t < 00 ) 

17.  Circle  z(r)  = —a  — ib  + re~lt  (0  § 277) 

19.  z(f)  = t + (1  - |t2)/  (—2  Si  f Si  2) 

21.  z(f)  = (1  + i)t  (1  Si  7 Si  3),  Re  z = 7,  z;  (f)  = 1 + /.  Answer:  4 + 4/ 

23.  e2™  - e77*  = 1 - (-1)  = 2 

25.  lexpz2!^  = \{e~1  — e 1)  = —sinh  1 

27.  tan  \ Tri  — tan  5 = / tanh  5 — 1 

29.  Im  z2  = 2xy  = 0 on  the  axes,  z = 1 + (—  1 + i)t  (0  =§  t g 1), 

(Imz2)  z = 2(1  — t)y(  — 1 + i)  integrated:  (— 1 + ;')/ 3. 

35.  | Re  z|  = U|  g3=MonC,L  = V8 


Problem  Set  14.2,  page  659 


1.  Use  (12),  Sec.  14.1,  with  m = 2.  3.  Yes  5.  5 

7.  (a)  Yes.  (b)  No,  we  would  have  to  move  the  contour  across  ±2 i. 


9.  0,  yes 
13.  0,  yes 
17.  0,  no 
21.  277/ 

25.  0 (Why?) 
29.  0 


11.  77 i,  no 
15.  —77,  no 
19.  0,  yes 

23.  1/z  + l/(z  — 1),  hence  277/  + 277 / = 477/. 
27.  0 (Why?) 


Problem  Set  14.3,  page  663 

1.  277/z2/(z  - l)|z=_i  = -77 / 

5.  277/(cos  3z)/6|z  = q = 77// 3 


11.  277/ 


1 

z + 2/ 


z = 2i 


77 

2 


3. 0 

7.  277/(Z/2)3/2  = 77/8 
13.  277/(z  + 2)1  z = 2 = 877/ 


15.  277/  cosh  (— 772  — 77/)  = —277/  cosh  772  = —60,739/  since  cosh  77/  = cos  77  = —1 
and  sinh  77/  = / sin  77  = 0. 


17.  277/  - 


Ln  (z  + 1 ) 


z + *'  *=* 

19.  27Tiezi/(2i)  = ire2i 


Ln  (1  + i) 

= 277/ = 77(lnV2  + /77 / 4)  = 1.089  + 2.467/ 

2/ 


Problem  Set  14.4,  page  667 
1.  (2t7//3!)(-cos0)  = — 77Z/3 


3.  (277// (n  - l)!)e° 


5.  ^-^(cosh  2 z)'"  = — • 8 sinh  1 = 9.845/ 
3!  3 


7.  (277//(2«)!)  (cos  z)1 


(2n)| 


= 0 = (277//(2n)!)(— 1)”  cos  0 = (- l)n277//(2«)! 


9.  — 277/(tan  77z)' 


2 = 0 


— 277/  • 77 
COS2  77Z 


= —277  / 


2 = 0 


Onri 

11.— ((1  + z)sinz)' 


2 = 1/2 


= 577/(sinz  + (1  + z)  COS  z)|z  = 1/2 

= 2 77/(sin  2 + 2 COS  2 ) 

= 2.821/ 
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13.  277 -i 


= 7 77 


15.  0.  Why? 


17.  0 by  Cauchy’s  integral  theorem  for  a doubly  connected  domain;  see  (6)  in  Sec.  14.2. 
7ri/4  = -977(1  + 0/(64 V2) 


19.  (277iy2!)4_3(e3z)"|z 


Chapter  14  Review  Questions  and  Problems,  page  668 
21. 1 cosh  (-3  772)  - l = 2.469 

23.  277i(ez)(4)|z:  = o = iez/ 1 2 |z  0 = iri/ 1 2 by  Cauchy’s  integral  formula. 
25.  — 277/(tan  77z)'lz  = 1 = — 2772//cos2  77zlz  = 1 = — 27727 
27.  0 since  z2  + z — 2 = 2(x2  — y2)  and  y = x 

29.  —4777 


Problem  Set  15.1,  page  679 

1.  zn  = ( 2//2)n;  bounded,  divergent,  ±1,  ±i 

3.  zn  = — 2 -77i/ ( 1 + 2 /(«/))  by  algebra;  convergent  to  —Trill 

5.  Bounded,  divergent,  ±1  + 10/ 

7.  Unbounded,  hence  divergent 
9.  Convergent  to  0,  hence  bounded 

17.  Divergent;  use  1/lnn  > 1/n.  19.  Convergent;  use  Sl/«2. 

21.  Convergent  23.  Convergent 

25.  Divergent 

29.  By  absolute  convergence  and  Cauchy’s  convergence  principle,  for  given  e > 0 we 
have  for  every  n > N(e ) and  p = 1,  2,  ■ ■ ■ 

l^-R+ll  * * ' d-  |Zn+pl  ^ C 

hence  |zn+i  + ■ ■ • + zn+p I < e by  (6*),  Sec.  13.2,  hence  convergence  by  Cauchy’s 
principle. 


Problem  Set  15.2,  page  684 

1.  No!  Nonnegative  integer  powers  of  z (or  z — zo)  only! 

3.  At  the  center,  in  a disk,  in  the  whole  plane 

5.  Zanz2n  = Sa„(z2)TC,  I z2 1 < R = lim  \an/an+1\\  hence  |z|  < VR. 
1.  77/2,  oo  9.  i,V3  11.  0,V% 

13.  -/,  \ 15.  2 /,  1 17.  1/ V2 

Problem  Set  15.3,  page  689 

3./  = /// 7 . Apply  l’Hopital’s  rule  to  In/  = (In  ri)/n. 

5.  2 7.  V3  9.  1/V2 

11.  VI  13.  1 15. 1 


Problem  Set  15.4,  page  697 


3.  2z2  - 


/o  2\3 

(2z  ) 


+ 


_ r,  2 „6  , „10 

“2z  3*  +I5z 


3! 


+ • • • , R = ™ 
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S.i-h'  + k - i,z12  + &Z1’ - + R-li 


„ 1 1 , 1 

7.  — I — cos  z = 1 

2 2 2-2! 


z2  + 


1 


9. 


1 2 , 1 4 

-r  + -r 

2 8 


+ 


2 • 4! 

dt  = z 


z4  - 


2 • 6! 


zD+  - 


I 3 , J_  5 

6 + 40  ~ 


+ ■ 


R = °° 
R = OD 


11.  z3/(1!3)  — z7/(3!7)  + zn/(5!ll)  — = °o 

13.  (2/VttXz  - z3/3  + z5/(2!5)  - z7/(3!7)  +•••),  = oo 

17.  Team  Project,  (a)  (Ln  (1  + z)/  = 1 — z + z2  — +•••=  1/(1  + z). 

(c)  Use  that  the  terms  of  (sin  iy)/(iy)  are  all  positive,  so  that  the  sum  cannot  be  zero. 

19.|  + h + Uz  ~ i ) + (4  + 4 0(z  - 02  - |(z  - 03  + R = V 2 


23.  - | t'(z  - 0 + _ 02  + mKz  ~ if  ~ m(z  ~ if  + ■■■ , R = 2 


25.  2 


R = co 


Problem  Set  15.5,  page  704 


3.  |z  + i\  g V3  - S, 

S>0 

5.  z + 2*1  = j-S, 

8 > 0 

7.  Nowhere 

9.  |z  - 2*|  ==  2 - 5, 

8 > 0 

11.  \zn\  S 1 and  2 1/n2  converges.  Use  Theorem  5. 

13.  |sinfl  z 1 1 = I for  all  z,  and  Sl/n2  converges.  Use  Theorem  5. 
15.  R = 4 by  Theorem  2 in  Sec.  15.2;  use  Theorem  1. 

17.  R = I/Vtt  > 0.56;  use  Theorem  1. 


Chapter  15  Review  Questions  and  Problems,  page  706 

11.  1 13.  3 

15.  2 17.  oo,  e2z 


19.  oo,  cosh  Vz 

oo  4 n 

2i.  y — — , 

^ (2 n + 1)! 


/?  = 00 


n = 0 


ii  i * (—i)” 

234  + ^cos  2z  = 1 + ^ 2 UTTU-  (2z)2n,  /?  = 


2 ^ (2n)! 

n=l  v f 


25.2 


(-D 


n+1 


n! 


Z2^-2,  = 00 


n=l 

27.  cos  [(Z  - 277)  + 277]  = -(z  - |t7)  + g(z  - 27r)3  “ 
29.  ln  3 + |(z  - 3)  - y^(z  - 3)2  + (z  - 3)3  - 


+ • • • = -sin  (z  - 277) 
+ •••,  /?  = 3 


A40 
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Problem  Set  16.1,  page  714 

1.  z-4  - \z~2  + M - vkz2  + . 0 < |z|  < 00 

3.  z-3  + z_1  + \z  + j^3  + ±Z5  + ■ ■ ■ , 0 < Izl  < 00 
5.  z~2  + z_1  + 1 + z + z2  + ■ ■ ■ , 0 < Izl  < 1 
7.  Z3  + 2.Z  + 24 Z 4-  720O3  + ' ' ' > 0 < |zl  < 00 

9.  exp  [1  + (z  - 1 )] (z  — ir2  = e-[(z-  D“2  + (z  - D_1  + 2 + g(z  “ D + ■"], 

0 < |z  - 1 1 < oo 

n [77/  + (Z  ~ 77/)]2  = (TTif  + 277/  + 1 

(Z  - 77/)4  (z  - 77/)4  (z  - 77/)3  (z  - 77/)2 

f 3^/_3_”(z  - /)n_2  = Kz  ~ i)~2 


— 3(z  - 

0_1  - 6/  + 10(z  - 

/)  + ■■• 

, 0 < 

z — / < 1 

(-cos  (z 

- 77))(Z  - 77) 

-2  _ 

: ~(z  - 

77)_ 2 + 

\ ~ 24 (Z  - TZr 

0 < z - 

■ 77  < 00 

00 

v 2n 

2j  z ’ 

Izl  <1,  - 

00 

2 

1 

2n  + 2’ 

Izl  > 

1 

n=0 

n=0 

z 

_(z  + \ 

77)_ 1 COS  (z  + 

\tt) 

= -(z  4 

- 277)_1 

± 2 (z  + 277)  - 

\z  + 27Z 

1 >0 

z + z 

+ z16±---, 

Izl 

< 1,  - 

-z4  - 1 

1 

CO 

1 

In; 

1 

1 

1 

25.  - —2  H : + / + (z  - i) 

(z~  1)  Z-  I 

Section  16.2,  page  719 

1.  0 ± 277,  ±477,  ■ ■ ■ , fourth  order  3.  —81/,  fourth  order 

5.  ±1,  ±2,  • • • , second  order  7.  ±(2  + 2/),  ±Z,  simple 

9. 1 sin  4z,  z = 0,  ±77/4,  ±77/2,  • • • , simple 
11  -f(z)  = (z  ~ z0)ng(z),  g(z 0)  =£  0,  hence /2(z)  = (z  - z0fng2(z). 

13.  Second-order  poles  at  / and  —2/ 

15.  Simple  pole  at  oo;  essential  singularity  at  1 + / 

17.  Fourth-order  poles  at  ±n77/,  n = 0,  essential  singularity  at  °° 

19.  ez(  1 — ez)  = 0,  ez  = 1,  z = ±2mri  simple  zeros.  Answer:  simple  poles  at  ±2/777/, 
essential  singularity  at  °° 

21.  1,  oo  essential  singularities,  ±2// 77/,  n = 0,  1,  ■ ■ • , simple  poles 

Section  16.3,  page  725 

3.  ^ at  0 5.  ±4/  at  + / 

7.  I/77  at  0,  ± 1,  ■ • • 9.  — 1 at  ±2//77/' 

11.  (ez)" /2\\z=7ri  = -l  at  z = 77/ 

15.  Simple  pole  at  5 inside  C,  residue  —1/(277).  Answer:  — i 
17.  Simple  poles  at  77/2,  residue  e7r//2/(  — sin  77/2),  and  at  —77/2,  residue 
e_7I’/2/Sin  77/2  = e-17/2.  Answer:  —Airi  sinh  77/2 
19.  277/  (sinh  I /)/2  = —77  sin  2 

21.  z-5  cos  77z  = • • • + 77 4/ (4!z)  — + ■ • • . Answer:  2775i/24 


13. 


1 + 


(z  - O' 


= 2 

n= 0 
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23.  Residues  2 at  7 = 2,2  at  7 = 5.  Answer:  5vi 

25.  Simple  poles  inside  C at  2 i,  —2 i,  3 i,  —3 i,  residues  (2 / cosh  2i)/(4z3  + 26z)\z=2i  = 
IT) > IT) > IT) > IT) > respectively.  Answer:  277/  • jg 


Problem  Set  16.4,  page  733 

1.  277/ V/t2  - 1 
5.  577/12 

9.  0.  Why?  (Make  a sketch.) 
13.  0.  Why? 

17.  0.  Why? 

19.  Simple  poles  at  ± 1,  i (and 


3.  77/V2  

7.  2fl77/Vfl2  - 1 

11. 77/2 
15.  77/3 

-/);  277/  • 4 /'  + 77/(  — 4 + 4 ) = 2 77 


77 


21.  Simple  poles  at  1 and  ±277/,  residues  i and  — /.  Answer:  — (cos  1 — e ) 


23.  -77/2  25.  0 

27.  Let  <7(7)  = (z  — a±)(z  — a2)  - ■■  (z  — up)-  Use  (4)  in  Sec.  16.3  to  form  the  sum  of 
the  residues  1 /q'{ci\)  + • ■ ■ + I / q'(cip)  and  show  that  this  sum  is  0;  here  k > 1. 


Chapter  16  Review  Questions  and  Problems,  page  733 


11.  677/ 

15.  2t7/(25z2)'|z  = 5 = 50077/ 
19.  77/6 
23.  0.  Why? 


13.  2t7/(-10  - 10) 

17.  0 ( n even),  (- l)(n_  1)/22t7//(//  - 1)!  ( n odd) 
21.  77/6O 

25.  Res  elz/{zZ  + 1)  = 1/(2 ie).  Answer:  Trie. 

z=i 


Problem  Set  17.1,  page  741 

5.  Only  in  size 

7.  x = c,  w = —y  + ic;  y = k,  w = —k  + ix 

9.  Parallel  displacement;  each  point  is  moved  2 to  the  right  and  1 up. 
11.  |w|  = \,  —77/4  < Arg  w < 77/4  13.  -5  g Re  z g —2 

15.  w = 1 17.  Annulus  2 = |w|  = 4 

19.  0 < u < In  4,  77/4  < u g 377/4 

21.  7 3 + azz  + bz  + c,  7 = — |(a  ± Va2  — 3b) 

23.7  = (-1  ± V3)/2 

25.  sinh  7 = 0 at  7 = 0,  ±77/,  ±277/,  • • • 

29.  M = I7I  = 1 on  the  unit  circle,  J = I7I2 
31.  | | = 1/ 1 7 1 2 = 1 on  the  unit  circle,  J = 1/ 1 7 1 4 

33.  M = ex  = 1 for  x = 0,  the  y-axis,  J = e2x 
35.  M = I/I7I  = 1 on  the  unit  circle,  J = l/|z|2 


Problem  Set  17.2,  page  745 


7.  7 = 


w + / 
2w 


11.7  = 0,  1 /(a  + ib) 


„ 4w  + / 

9.  7 = 

— 3/w  + 1 

13.7  = 0,  ±g,  ± = ±i/2 


A42 
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15.  z = i,  2 i 


17.  w 


az 

cz  + a 


19.  w 


az  + b 
—bz  + a 


Problem  Set  17.3,  page  750 

3.  Apply  the  inverse  g of/ on  both  sides  of  Z!  = f(z i)  to  get  g(zi)  = g(f(zi))  = Zi- 

9.  w = iz,  a rotation.  Sketch  to  see.  11.  w = (z  + i)/(z  — i) 

13.  w = 1/z,  almost  by  inspection  15.  w = 1/z  — 1 

17.  w = (2z  - i)/(~iz  ~ 2)  19.  w = (z4  - i)(-iz4  + 1) 


Problem  Set  17.4,  page  754 

1.  Circle  |w|  = ec  3.  Annulus  l/Ve  = |w|  Si  Ve 

5.  w-plane  without  w = 0 7.  1 < |w|  < e,  v > 0 

9.  ±(2n  + 1)77/2,  n = 0,  1,  • • ■ 

11.  n2/cosh2  2 + u2/sinh2  2 < 1,  u > 0,  v > 0 

13.  Elliptic  annulus  bounded  by  w2/cosh2  1 + i>2/sinh2  1 = 1 and 
w2/cosh2  3 + v2/ sinh2  3 = 1 

15.  cosh  z = cos  iz  = sin  (iz  + \ tt) 

17.  0 < Im  t < 77  is  the  image  of  R under  t = z2/ 2.  Answer:  e = ez  ^2. 

2/2  2/2  2 2 

19.  Hyperbolas  u / cos  c — v /sin  c = cosh  c — sinh  c = 1 when  c A 0,  7T,  and 
u = ± cosh  y (thus  |w|  =g  1),  v = 0 when  c = 0,  77. 

21.  Interior  of  «2/cosh2  2 + u2/sinh2  2 = 1 in  the  fourth  quadrant,  or  map 
7t/2  < x < 77,  0 < y < 2 by  w = sin  z (why?). 

23.  v < 0 

25.  The  images  of  the  five  points  in  the  figure  can  be  obtained  directly  from  the 
function  w. 


Problem  Set  17.5,  page  756 

1.  w moves  once  around  the  circle  | w | = |. 
3.  Four  sheets,  branch  point  at  z = — 1 
5.  —i/4,  three  sheets 
7.  z o,  n sheets 

9.  Vz(z  — i)(z  + i),  0,  ±i,  two  sheets 


Chapter  17  Review  Questions  and  Problems,  page  756 


11.  1 < | w | <4,  |arg  w\  < 7 t/A 

h 


15.  u = 1 — \ v2,  same  (why?) 


19.  h < 
23.  w = 


\w\  <|, 
lOz  + 5 i 
z + 2 i 


v < 0 


27.  w = 1/z 

31.  z = 2 ± V6 
35.  w = e4z 
39.  w = z2/ (2c) 


13.  Horizontal  strip  — 8 < v < 8 
17.  | w | > 1 

21.  w = 1 + iv,  v < 0 
25.  Rotation  w = iz 

29.  z = 0 

33.  z = 0,  ±i,  ±3 i 
37.  w = iz2  + 1 
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Problem  Set  18.1,  page  762 

1.  2.5  mm  = 0.25  cm;  $ = Re  110(1  + (Lnz)/ln4) 

5.  $ (x)  = Re  (375  + 25 z) 

7.  <P  (r)  = Re  (32  - z) 

13.  Use  Fig.  391  in  Sec.  17.4  with  the  z-  and  vr-p lanes  interchanged  and 
cos  z = sin  ( z + h77)- 
15.  <J>  = 220  (x3  - 3xy2)  = Re  (220z3) 


Problem  Set  18.2,  page  766 

3.  w = iz2  maps  R onto  the  strip  —2  g u S 0;  and  d>*  = U2  + (U\  — 1/2) (1  + 2 u)  = 
U2  + (Ui  - U2){\  - xy). 

- ,,  O - 2)(2x  - 1)  + 2y2  ...  2 2 X 

5.  (a) = c,  (b)  x — y = c,  xy  = c,  e cos  y = c 

(x  - 2)2  + y2 

7.  See  Fig.  392  in  Sec.  17.4.  = Re  (sin2  z),  sin2x  (y  = 0),  sin2  x cosh2  1 — cos2  x 

sinh2  1 (y  = 1),  — sinh2  y (x  = 0,  7 r). 

2 o 2 2 2 77* 

9.  d>(x,  y)  = cos  xcosh  y — sin  x sinh  y;  cosh  y (x  = 0),  — sinhy(x  = 2O, 
cos  x(y  = 0),  cos  xcosh  1 — sin  x sinh  1 (y  = 1) 

13.  Corresponding  rays  in  the  vv-plane  make  equal  angles,  and  the  mapping  is  conformal. 
15.  Apply  w = z2. 

17.  z = (2Z  — i)/(-iZ  - 2)  by  (3)  in  Sec.  17.3. 

19.  $ = |-Arg(z  - 2),  F=-^Ln(z-2) 


Problem  Set  18.3,  page  769 

1.  (80 /d)y  + 20.  Rotate  through  77/2. 

80; 


e 80  7 ri 

5.  — arctan  = Re 

77  X 


77 


Ln  z 


7.  Tx  + ^ (T2  - 71)  arctan  = Re  ( 7i 


2; 

77 


(T2  - Ti)  Ln  z 


it 

100 


9.  — arctan  - 
77 1 


b 


arctan  - 


y 


a 


n . iT\  z ~ a 

= Re  — Ln t 

77  z ~ b 


11.  '7T'J  (Arg  (z  - 1)  - Arg  (z  + 1))  = Re  Ln 

100  o o 2 

13.  -^r  [Arg  (z2  — 1)  — Arg  (z2  +1)]  from  w = z2  and  Prob.  11. 

15.  -20  + (320/77)  Arg  z = Re  f -20  - ^ Ln  z 
17.  Re  F(z)  = 100  + (200/77)  Re  (arcsin  z) 


Problem  Set  18.4,  page  776 

1.  V(z)  continuously  differentiable. 

3.  F'(iy)  = 1 4-  1/y2,  |y|  = 1,  is  maximum  at  y = ±1,  namely,  2. 
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5.  Calculate  or  note  that  V2  = div  grad  and  curl  grad  is  the  zero  vector;  see  Sec.  9.8  and 
Problem  Set  9.7. 

7.  Horizontal  parallel  flow  to  the  right. 

9.  F(z)  = z 4 

11.  Uniform  parallel  flow  upward,  V = F'  = iK,  V\  = 0,  V2  = K 

13.  F(z)  = z3 

15.  F(z)  = z/r0  + r0/z 

17.  Use  that  w = arccos  z gives  z = cos  vv  and  interchanging  the  roles  of  the  z-  and 
w-planes. 

19.  y/(x2  + y2)  = c or  x2  + (y  — k )2  = k2 


Problem  Set  18.5,  page  781 

5.  3>  = f r3  sin  36 

7.  + ^ar8  cos  8 9 

9.  <1>  = 3 — 4r2  cos  26  + r4  cos  46 

1 O . ..  . 1 


11.  O = — ( r sin  0 — — r sin  20  + — r sin  30 b 

77 

2 1 2 1 

13.  = — r sin  6 H — r2  sin  26 —r3  sin  36 r4  sin  4 6 + 3 — 

77  2 977  4 


I5.d,=i+A 

2 77 


1 o 1 = 

r cos  6 r cos  36  H — r cos  56 h ■ ■ 

3 5 


17.  <!>='- 


77 


1 


1 


7 cos  6 r cos  26  3 — r cos  3 6 b 

4 9 


Problem  Set  18.6,  page  784 

1.  Use  (2).  F(z o + eia ) = (I  + eiaf,  etc.  F( |)  = ^ 
3.  Use  (2).  F(z0  + ew)  = (2  + 3e”)2,  etc.  F( 4)  = 100 
5.  No,  because  |z|  is  not  analytic. 


7. 


3>(2,  -2) 


rl  r2n 

•'o  •'o 


(1  + rcos  a)(- 3 + r sin  a)r  dr  da 


77  J 


,2  TV 


(—3  r + • • • ) dr  da 


o Jo 


• 277 


9.  5>(1,  1)  = 3 = 


r27J- 


(3  + r cos  a + r sin  a + rz  cos  a sin  a)r  dr  da 


Jo 

_ 3 _ 

77  2 


277 


13.  |F(z)|  = [cos2x  + sinh2y]1/2,  z = ±i,  Max  = [1  + sinh2  1]1/2  = 1.543 
15.  | F(z)  I = sinh2  2x  cos2  2 y + cosh2  2x  sin2  2 y = sinh2  2x  + 1 • sin2  2y,  z = 1, 
Max  = sinh  2 = 3.627 

17.  | F(z) | 2 = 4(2-2  cos  26),  z = 77/2,  377/2,  Max  = 4 

19.  No.  Make  up  a counterexample. 
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Chapter  18  Review  Questions  and  Problems,  page  785 

11.  $ = 10(1  - x + y),  F = 10  - 10(1  + i)z 

220 

13.  $ = Re  (220  - 95.54  Ln  z)  = 220  In  r = 220  - 95.54  In  r. 

In  10 

17.  2(1  - (2/ 7t)  Arg  z) 

19.  30(1  - (2/77)  Arg  (7  - 1))  

21.  $ = x + y = const,  V = F'(z)  = 1 — i,  parallel  flow 
23.  T(x,  y)  = x{2 y + 1)  = const 
25.  F'(z)  = z + 1 = x + 1 - iy 


Problem  Set  19.1,  page  796 


1.  0.84175  • 102,  -0.52868  • 103,  0.92414  • 10-3,  -0.36201  • 106 
3.  6.3698,  6.794,  8.15,  impossible 
5.  Add  first,  then  round. 


7.  29.9667,  0.0335;  29.9667,  0.0333704  (6S-exact) 
9.29.97,0.035;  29.97,0.03337;  30,0.0;  30,0.033 


11.  |e|  = \x  + y - (x  + 50 1 = l(*  ~ x)  + (y  - y)|  = 


se  e. 


+ \ey\  - Px  + Pij 


CL\  1 + 61 

13.—  = 


02 


0 2 + e2 


Q,\  6 1 

02 


2 

1 H o "t"  ' 

02  02 


ex  + e^l 


01*1 


^2 


e2 

fl2 


1 

^2 


hence 


f «1 

--V 

U\ 

el 

^2 

V«2 

3 2 )! 

a2 

«1 

a2 

I e /•  1 1 "h  | Cr2 1 - firl  "h  fir2 


15.  (a)  1.38629  - 1.38604  = 0.00025,  (b)  ln  1.00025  = 0.000249969  is  6S-exact. 

19.  In  the  present  case,  (b)  is  slightly  more  accurate  than  (a)  (which  may  produce 
nonsensical  results;  cf.  Prob.  20). 

21.  c4  • 24  + ■ ■ • + c0  • 2°  = (1  0 1 1 l.)2,  NOT  (1110  l.)2 
23.  The  algorithm  in  Prob.  22  repeats  0011  infinitely  often. 

25.  n = 26.  The  beginning  is  0.09375  (n  = 1). 

27.  /14  = 0.1812  (0.1705  4S-exact),  /13  = 0.1812  (0.1820),  I12  = 0.1951  (0.1951), 
7n  = 0.2102  (0.2103),  etc. 

29.  -0.126  • 10-2,  -0.402  • 10-3;  -0.266  • 10-6,  -0.847  • 10-7 


Problem  Set  19.2,  page  807 

3.  g = 0.5  cos  x,  x = 0.450184  (=  xio,  exact  to  6S) 

5.  Convergence  to  4.7  for  all  these  starting  values. 

7.  x = x/(ex  sin  .7);  0.5,  0.63256,  • ■ • converges  to  0.58853  (5S-exact)  in  14  steps. 

9.x  = x4  - 0.12;  x0  = 0,x3  = -0.1 19794  (6S-exact) 

11.  g = 4/x  + x3/16  - x5/576;  x0  = 2,  xn  = 2.39165  (n  ^ 6),  2.405  4S-exact 
13.  This  follows  from  the  intermediate  value  theorem  of  calculus. 

15.  x3  = 0.450184 

17.  Convergence  to  x = 4.7,  4.7,  0.8,  —0.5,  respectively.  Reason  seen  easily  from  the 
graph  off. 
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19. 0.5,  0.375,  0.377968,  0.377964;  (b)  1/ VI 

21.  1.834243  (=  x4),  0.656620  (=  x4),  -2.49086  (=  x4) 

23.  x0  = 4.5,  x4  = 4.73004  (6S-exact) 

25.  (a)  ALGORITHM  BISECT  (/,  a0,  ^0,  e,  N)  Bisection  Method 

This  algorithm  computes  the  solution  c of  fix ) = 0 (/  continuous)  within  the 
tolerance  e,  given  an  initial  interval  [up,  £>o]  such  that/(ao)/(7>o)  < 0. 

INPUT:  Continuous  function/,  initial  interval  [c/(>,  Z?0],  tolerance  e,  maximum 

number  of  iterations  N. 

OUTPUT:  A solution  c (within  the  tolerance  e),  or  a message  of  failure. 

For  n = 0,  1,  ■ • ■ , N — 1 do: 
c = 2 (an  + bn) 

If  /(c)  = 0 then  OUTPUT  c Stop.  [Procedure  completed] 

Else  if  f(an)f(bn)  < 0 then  set  an+i  = an  and  bn  + 1 = c. 

Else  set  an+ 1 = c,  and  bn  + i = bn. 

If  + i — bn+ 1|  < e|c|  then  OUTPUT  c.  Stop.  [Procedure  completed] 

End 

OUTPUT  [c/;y,  bN]  and  a message  “Failure”.  Stop. 

[Unsuccessful  completion;  N iterations  did  not  give  an  interval  of  length  not 
exceeding  the  tolerance.] 

End  BISECT 

Note  that  [ojv,  bN]  gives  (ajv  + bN)/2  as  an  approximation  of  the  zero  and  ( bN  — aN)/2 
as  a corresponding  error  bound. 

(b)  0.739085;  (c)  1.30980,  0.429494 
27.  x2  = 1.5,  x3  = 1.76471,  • • ■ , x7  = 1.83424  (6S-exact) 

29.  0.904557  (6S-exact) 


Problem  Set  19.3,  page  819 


1.  L0(x)  = -2x  + 19,  Lxix)  = 2x-  18,  Pl( 9.3)  = L„(9.3)  • f0  + ^(9.3)  ■ h 
= 0.1086  • 9.3  + 1.230  = 2.2297 


3.  p2(x)  = 


(x  - 1.02)(x  - 1.04) 


+ 


(-0.02)(-0.04) 
(x  - l)(.r  - 1.02) 


1.0000  + 


(x  - l)(x  - 1.04) 


0.9888 


0.04  • 0.02 


• 0.9784  = 


0.02  (-0.02) 

2.580x  + 2.580;  0.9943,  0.9835 


5.  0.8033  (error  -0.0245),  0.4872  (error  -0.0148);  quadratic:  0.7839  (-0.0051), 
0.4678  (0.0046) 

7.  p2(x)  = 1.1640*  - 0.3357x2;  -0.5089  (error  0.1262),  0.4053  (-0.0226), 

0.9053  (0.0186),  0.9911  (-0.0672) 

9.  p2(x)  = -0.44304x2  + 1.30896x  - 0.023220,  p2( 0.75)  = 0.70929 
(5S-exact  0.71116) 

11.  L0  = — g(x  — l)(x  — 2)(x  — 3),  Li  = |x(x  — 2)(x  — 3),  L2  = — 2x(x  — l)(x  — 3), 
L3  = gx(x  - l)(x  - 2);  p3(x)  = 1 + 0.039740x  - 0.335 187x2  + 0.060645*3; 
p2( 0.5)  = 0.943654,^3(1.5)  = 0.510116,  /?3(2.5)  = -0.047991 
13.  2x2  — 4x  + 2 

15.  p3(x)  = 2.1972  + (x  - 9)  • 0.1082  + (x  - 9)(x  - 9.5)  • 0.005235 

17.  r = —1.5,  p2(0.3)  = 0.6039  + (-1.5)  ■ 0.1755  + 2(-1.5)(-0.5)  ■ (-0.0302) 

= 0.3293 
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Problem  Set  19.4,  page  826 

9.  [— 1.39(x  - 5)2  + 0.58  (x  - 5)3]"  = 0.004  at  x = 5.8  (due  to  roundoff; 
should  be  0). 

11.  1 - lx2  4-  jx4 

13.  1 - x2,  — 2(x  - 1)  - (x  - l)2  4-  2(x  - l)3,  -1  4 2(x-  2)  4 5(x  - 2)2 

- 6(x  - 2)3 

15.  4 4-  x2  - x3,  — 8(x  - 2)  - 5(x  - 2)2  4-  5(x  - 2)3, 

4 4-  32(x  - 4)  + 25 (x  - 4)2  - 1 l(x  - 4)3 
17.  Use  the  fact  that  the  third  derivative  of  a cubic  polynomial  is  constant,  so  that  g" 
is  piecewise  constant,  hence  constant  throughout  under  the  present  assumption. 
Now  integrate  three  times. 

19.  Curvature  f"/(l  4 -/'2)3/2  —/"  if  \f'\  is  small. 


Problem  Set  19.5,  page  839 

1.  0.747131,  which  is  larger  than  0.746824.  Why? 

3. 0.5,  0.375,  0.34375,  0.335  (exact) 

5.  e0.5  « 0.03452  (e0.5  = 0.03307),  e0.25  = 0.00829  (e0.25  = 0.00820) 

7.  0.693254  (6S-exact  0.693147) 

9.  0.073930  (6S-exact  0.073928) 

11.  0.785392  (6S-exact  0.785398) 

13.  (0.785398126  - 0.785392156)/15  = 0.39792  • 10-6 

15.  (a)  Mz  = 2,  \KM2\  = 2/(12 nz)  = 10_5/2 ,n  = 183.  (b)/iv  = 24/x5,  M4  = 24, 
|CM4|  = 24/(180  • (2m)4)  = 10_5/2,  2m  = 12.8,  hence  14. 

17.  0.94614588,  0.94608693  (8S-exact  0.94608307) 

19.  0.9460831  (7S-exact) 

21.  0.9774586  (7S-exact  0.9774377) 

23.  Set  x = \{t  + 1),  0.2642411177  (lOS-exact),  1 - 2/e 

25.  x = \[t  4-1),  dx  = \dt,  0.746824127  (9S-exact  0.746824133) 

27.0.08,  0.32,  0.176,  0.256  (exact) 

29.  5(0.1040  - I • 0.1760  4-  | • 0.1344  - \ • 0.0384)  = 0.256 


Chapter  19  Review  Questions  and  Problems,  page  841 

17. 4.375,  4.50,  6.0,  impossible 
19.  44.885  §sg  44.995 
21.  The  same  as  that  of  a. 

23.  x = 20  ± V398  = 20.00  ± 19.95,  X!  = 39.95,  x2  = 0.05,  x2  = 2/39.95 
= 0.05006  (error  less  than  1 unit  of  the  last  digit) 

25.  x = x4  — 0.1,  -0.1,  -0.999,  -0.99900399 

27.  0.824 

29.  -x  4-  x3,  2(x  - 1)  4-  3(x  - l)2  - (x  - l)3 
31.  0.26,  M2  = 6,  M2  = 0,  -0.02  gegO,  0.01 
33.  0.90443,  0.90452  (5S-exact  0.90452) 

35.  (a)  (0.43  - 2 • 0.23  4-  0)/0.04  = 1.2,  (b)  (0.33  - 2 • 0.23  4-  0.13)/0.01  = 1.2  (exact) 
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Problem  Set  20.1,  page  851 

l.Xi  = 7.3,  x2  = —3.2 

-3  6 -9  -46.725 

0 9 -13  -51.223 

0 0 -2.88889  -7.38689^ 

= 3.908,  x2  = -1.998,  x3  = 2.557 

178.54 

137.86 


3.  No  solution 


7. 


9. 


11. 


13. 


Xl 

13  -8  0 

0 6 13 

0 0 -16  —253.12^ 

xi  = 6.78,  x2  = —11.3,  x3  = 15.82 

3.4  -6.12  -2.72  0 

0 0 4.32  0 

L0  0 0 0j 

xi  = ti  arbitrary,  x2  = (3.4/6. 12)fi,  X3  = 0 

5 0 6 -0.329193 

0 -4  -3.6  -2.143144 

0 0 2.3  -0.4 

x ! = 0.142856,  x 2 = 0.692307,  x3  = -0.173912 

-8.7 


15. 


-1  -3.1  2.5  0 

0 2.2  1.5  -3.3 

0 0 -1.493182  -0.825 

0 0 0 6.13826 

xi  = 4.2,  x2  = 0,  X3  = —1.8,  X4  : 

Problem  Set  20.2,  page  857 

xi  = —4 
x2  = 6 


-9.3 

1.03773 

12.2765 

2.0 


1. 


"1 

o' 

’4 

5' 

3 

1 

0 

-1 

3. 


0 

1 

5 

0 

1 

9 


4 
1 
0 
9 

0 -6 


0 


1 
2 
3 
6 
3 

0 -3 


xi  = 0.4 
x2  = 0.8 
x3  = 1.6 

Xi 


X 

15 


r - A 

x2  — 15 


5.  xi  = 2,  x2 


*3  = 


5. 
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11. 


13. 


15. 


17. 


3 
2 

4 

0.1 

0 

0.3 

1 

-1 

3 

2 


0 

3 

1 

0 

0.4 

0.2 

0 

2 

-1 

0 


0 

0 

0.1 


0 

0 

3 

-1 


2 

3 

0 

0.1 

0 

0 

1 

0 

0 

0 


4 

1 

3_ 

0 

0.4 

0 

-1 


xi  = 0.6 

x2  = 1.2 
x3  = 0.4 

0.3 
0.2 
0.1 


3 2 

2-1  0 
0 3-1 

0 0 4 


X\  = 2 

x2  = -11 
x3=  4 

X\=  2 

x2  = -3 
x3  = 4 

X4  = ~ 1 


No,  since  xT(— A)x  = — xTAx  < 0;  yes;  yes;  no 
-3.5  1.25’ 

3.0  -1.0 

584  104  -66 

20  -12 
12  9 

-6  -14 


36 


19.— 


104 

—66 

21 

-6 

-14 

6 


36  -12  -4 


-12 

-4 


20 

-4 


Problem  Set  20.3,  page  863 

5.  Exact  0.5,  0.5,  0.5  7.  x\  = 2,  x2  = —4,  x3  = 8 

9.  Exact  2,  1,  4 

11.  (a)  x(3)T  = [0.49983  0.50001  0.500017], 

(b)  x(3)T  = [0.50333  0.49985  0.49968] 

13.8,  —16,  43,  86  steps;  spectral  radius  0.09,  0.35,  0.72,  0.85,  approximately 

15.[1.99934  1.00043  3.99684]T  (Jacobi,  Step  5);  [2.00004  0.998059  4.00072]T 

(Gauss-Seidel) 

19.  V306  = 17.49,  12,  12 


Problem  Set  20.4,  page  871 

1.  18,  VTTO  = 10.49,  8,  [0.125  -0.375  1 0 -0.75  0] 
3.5.9,  VT18T  = 3.716,  3,  ^[0.2  0.6  -2.1  3.0] 

5.5,  V5,  1,  [1  1 1 1 1]  7.  ab  + bc  + ca  = 0 
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9.  k = 5 • l = 2.5  11.  k = (5  + V5)(l  + 1/V5)  = 6 + 2V5 

13.  k = 19  ■ 13  = 247;  ill-conditioned 
15.  k = 20  • 20  = 400;  ill-conditioned 
17.  167  g 21  • 15  = 315 

19.  [—2  4]t,  [—144.0  184. 0]T,  k = 25,921,  extremely  ill-conditioned 

21.  Small  residual  [0.145  0.120],  but  large  deviation  of  x. 

23. 27,  748,  28,375,  943,656,  29,070,279 

Problem  Set  20.5,  page  875 

1.  1.846  - 1.038*  3.  1.48  + 0.09* 

5.  s = 90t  - 675,  vav  = 90  km/hr  9.  -11.36  + 5.45*  - 0.589*2 
11.  1.89  - 0.739*  + 0.207*2 

13.  2.552  + 16.23*,  -4.114  4-  13.73*  4-  2.500*2,  2.730  + 1.466* 

- 1.778*2  + 2.852*3 

Problem  Set  20.7,  page  884 

1.  5,  0,  7;  radii  6,  4,  6.  Spectrum  { — 1,  4,  9} 

3.  Centers  0;  radii  0.5,  0.7,  0.4.  Skew-symmetric,  hence  A = i/r.,  —0.7  g /r,  0.7. 

5.  2,  3,  8;  radii  1 + V2,  1,  V2;  actually  (4S)  1.163,  3.511,  8.326 
7.  til  = 100,  1 22  = *33  = 1 

9.  They  lie  in  the  intervals  with  endpoints  a:n  ± (n  — 1)  • 10-5.  Why? 

11.  p( A)  Row  sum  norm  ||  A 1 1„  = max  ^ \ajk\  = maxda^l  + Gerschgorin  radius) 

j k j 

13.  VT22  = 11.05 

15.  V052  = 0.7211 

17.  Show  that  AAT  = ATA. 

19.  0 lies  in  no  Gerschgorin  disk,  by  (3)  with  >;  hence  det  A = Ai  • • • An  A 0. 

Problem  Set  20.8,  page  887 

1.  q = 10,  10.9908,  10.9999;  |e|  S 3,  0.3028,  0.0275 

3.  q ± 8 = 4 ± 1.633,  4.786  ± 0.619,  4.917  ± 0.398 

5.  Same  answer  as  in  Prob.  3,  possibly  except  for  small  roundoff  errors. 

7.  q = 5.5,  5.5738,  5.6018;  |e|  g 0.5,  0.3115,  0.1899;  eigenvalues  (4S)  1.697, 

3.382,  5.303,5.618 

9.  y = Ax  = Ax,  yTx  = AxTx,  yTy  = A2xTx, 
e2  g yTy/xTx  - (yTx/xTx)2  = A2  - A2  = 0 
11.  q = 1,  • ■ • , —2.8993  approximates  —3  (0  of  the  given  matrix), 
e ts  1.633,  •••,0.7024  (Step  8) 

Problem  Set  20.9,  page  896 

0.98  -0.4418  0 

1.  -0.4418  0.8702  0.3718 

0 0.3718  0.4898 
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3. 


7 -3.6056 

-3.6056  13.462 


5. 


0 

3 

-67.59 

0 

0 


3.6923 

-67.59 

143.5 

45.35 

0 


7.  Eigenvalues  16,  6,  2 


0 

3.6923 

3.5385_ 

0 0 

45.35  0 

23.34  3.126 

3.126  -33.87 


’ 11.2903 

-5.0173 

0 

’ 14.9028 

-3.1265 

0 

" 15.8299 

-1.2932 

0 

-5.0173 

10.6144 

0.7499 

-3.1265 

7.0883 

0.1966 

-1.2932 

6.1692 

0.0625 

0 

0.7499 

2.0952_ 

0 

0.1966 

2.0089 

0 

0.0625 

2.0010 

9.  Eigenvalues  (4S)  141.4,  68.64,  -30.04 


141.1 

4.926 

0 

141.3 

2.400 

0 

141.4 

1.166 

0 

4.926 

68.97 

0.8691 

2.400 

68.72 

0.3797 

> 

1.166 

68.66 

0.1661 

0 

0.8691 

-30.03 

0 

0.3797 

-30.04 

0 

0.1661 

-30.04 

Chapter  20  Review  Questions  and  Problems,  page  896 


15.  [3.9  4.3  1.8]T 

17.  [-2  0 5]t 

0.28193  -0.15904  -0.00482" 

19.  -0.15904  0.12048  -0.00241 

-0.00482  -0.00241  0.01205 


5.750 

6.400 

6.390 

21. 

3.600 

, 

3.559 

3.600 

0.838 

1.000 

0.997 

Exact:  [6.4 

3.6  1.0]1 

~1.700~ 

1 .986 

_2.000" 

23. 

El  80 

0.999 

> 

1.000 

4.043 

4.002 

4.000 

Exact:  [2  1 4] 1 

25.42,  V674  = 25.96,  21 

29.  5 


27.  30 

31.  115  • 0.4458  = 51.27 
35.  1.514  + 1.1 29x  - 0.2 14x2 
37.  Centers  15,  35,  90;  radii  30,  35,  25,  respectively.  Eigenvalues  (3S)  2.63,  40.8,  96.6 
39.  Centers  0,  —1,  —4;  radii  9,  6,  7,  respectively;  eigenvalues  0,  4.446,  —9.446 


33.  5 


21 

63 
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Problem  Set  21.1,  page  910 

1.  y = 5e~0Zx,  0.00458,  0.00830  (errors  of  y5,  y10) 

3.  y = x — tanhx  (set  y — x = u ),  0.00929,  0.01885  (errors  of  y5,  y10) 

5.y=  ex,  0.0013,  0.0042  (errors  of  y5,  y10) 

7.  y = 1/(1  - xz/2),  0.00029,  0.01 187  (errors  of  y5,  y10) 

9.  Errors  0.03547  and  0.28715  of  ys  and  yio  much  larger 
ll.y  = 1/(1  - x2/2);  error -10-8,  -4  • 10-8,  • • • , -6  • 10-7,  +9  ■ 10-6; 

e = 0.0002/15  = 1.3  • 10-5  (use  RK  with  h = 0.2) 

13.  y = tan  x;  error  0.83  • 10-7,  0.16  • 10-6,  • ■ ■ , -0.56  • 10-6,  +0.13  • 10-5 
15.  y = 3 cos  x — 2 cos2  x;  error  • 107:  0.18,  0.74,  1.73,  3.28,  5.59,  9.04,  14.3,  22.8, 
36.8,  61.4 

17.  y'  = 1/(2  - x4);  error  • 109:  0.2,  3.1,  10.7,  23.2,  28.5,  -32.3,  -376,  -1656, 
-3489,  +80444 

19.  Errors  for  Euler-Cauchy  0.02002,  0.06286,  0.05074;  for  improved  Euler-Cauchy 
-0.000455,  0.012086,  0.009601;  for  Runge-Kutta.  0.0000011,  0.000016,  0.000536 

Problem  Set  21.2,  page  915 

l.y  = ex,  y%  = 1.648717,  y5  = 1.648722,  e5  = -3.8  ■ 10-8, 
y*o  = 2.718276,  y10  = 2.718284,  e10  = -1.8  • 10-6 
3.  y = tanx,  y4,  • ■ • ,yi0  (error  • 105)  0.422798  (-0.49),  0.546315  (-1.2), 

0.684161  (-2.4),  0.842332  (-4.4),  1.029714  (-7.5),  1.260288  (-13), 

1.557626  (-22) 

5.  RK  error  smaller  in  absolute  value,  error  • 105  = 0.4,  0.3,  0.2,  5.6 
(for  x = 0.4,  0.6,  0.8,  1.0) 

7.  y = 1/(4  + e~3x),  y4,  ■ • • , yio  (error  • 105)  0.232490  (0.34),  0.236787  (0.44), 
0.240075  (0.42),  0.242570  (0.35),  0.244453  (0.25),  0.245867  (0.16),  0.246926  (0.09) 
9.  y = exp  (x3)  - 1,  v4,  • • • , y10  (error  • 107)  0.008032  (-4),  0.015749  (- 10), 

0.027370  (-17),  0.043810  (-26),  0.066096  (-39),  0.095411  (-54), 

0.133156  (-74) 

13.  y = exp  (x2).  Errors  • 105fromx  = 0.3  to  0.7:  —5,  —11,  —19,  —31,  —41 
15.  (a)  0,  0.02,  0.0884,  0.215848,  y4  = 0.417818,  y5  = 0.708887  (poor) 

(b)  By  30-50% 

Problem  Set  21.3,  page  922 

1.  y4  = — e~Zx  + 4ex,  y 2 = — e~2x  + ex\  errors  of  y4  (of  y2)  from  0.002  to  0.5 
(from  —0.01  to  0.1),  monotone 

3.  >’1  = y2,  y2  = — 4y4,  y = y4  = 1,  0.99,  0.97,  0.94,  0.9005,  error 
—0.005,  —0.01,  —0.015,  —0.02,  —0.0229;  exact  y = cos  2x 
5-  Vi  = y2,  y2  = yi  + x,  y4(0)  = 1,  y2(0)  = -2,  y = y4  = e~x  - x,  y = 0.8 
(error  0.005),  0.61  (0.01),  0.429  (0.012),  0.2561  (0.0142),  0.0905  (0.0160) 

7.  By  about  a factor  105.  em(y4)  • 106  = —0.082,  • • ■ , —0.27, 
en(j2)  • 106  = 0.08,  ••■,0.27 

9.  Errors  of  y4  (of  y2)  from  0.3  • 10-5  to  1.3  ■ 10-5  (from  0.3  ■ 10-5  to  0.6  ■ 10-5) 
11.  (yi,  y2)  = (0,  1),  (0.20,  0.98),  (0.39,  0.92),  • • • , (-0.23,  -0.97),  (-0.42,  -0.91), 
(—0.59),  (—0.81);  continuation  will  give  an  “ellipse.” 
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Problem  Set  21.4,  page  930 

3.  — 3nn  + m12  = —200,  Mn  — 3mi2  = —100 

5.  105,  155,  105,  115;  Step  5:  104.94,  154.97,  104.97,  114.98 

7.  0,  0,  0,  0.  All  equipotential  lines  meet  at  the  corners  (why?). 

Step  5:  0.29298,  0.14649,  0.14649,  0.073245 
9.  0.108253,  0.108253,  0.324760,  0.324760;  Step  10:  0.108538,  0.108396, 
0.324902,  0.324831 

11.  (a)  m ii  = — M12  = —66.  (b)  Reduce  to  4 equations  by  symmetry. 

Mn  = M3i  = — Mis  = — W35  = —92.92,  U21  = — M25  = —87.45, 

m 12  — m 32  — m 14  — M34  — 64.22,  M22  — u24  — 53.98, 

«13  = m23  = M33  = 0 

13.  M12  — M32  — 31.25,  M21  — M23  — 18.75,  Mjk  — 25  at  the  others 

15.  m2i  = M23  = 0.25,  m 12  = m32  = —0.25,  = 0 otherwise 

17.  V3,  Mn  = M2i  = 0.0849,  M12  = m22  = 0.3170.  (0.1083,  0.3248  are  4S-values 
of  the  solution  of  the  linear  system  of  the  problem.) 


Problem  Set  21.5,  page  935 

5.  Mu  — 0.766,  u 2 1 — 1.109,  M12  — 1.957,  W22  — 3.293 
7.  A,  as  in  Example  1,  right  sides  —220,  —220,  —220,  —220. 

Solution  it  11  — u 2 1 — 125.7,  u 2 1 — M22  — 157.1 
13.  — 4mh  + m2i  + m 12  = —3,  Mu  4m21  + m2 2 = —12,  «n  — 4Mi2  + m22  ==  0, 
2i<2i  + 2mi2  — 12m22  = —14,  «n  = m22  = 2,  m2i  = 4,  u12  = 1. 

Here  — ^ = — 1(1  + 2.5)  with  § from  the  stencil. 

15.  b = [-200,  -100,  -100,  0]T;  ulx  = 73.68,  u21  = u12  = 4131,  u22  = 15.79  (4S) 


Problem  Set  21.6,  page  941 

5.  0,  0.6625,  1.25,  1.7125,  2,  2.1,  2,  1.7125,  1.25,  0.6625,  0 
7.  Substantially  less  accurate,  0.15,  0.25  (f  = 0.04),  0.100,  0.163  ( t = 0.08) 

9.  Step  5 gives  0,  0.06279,  0.09336,  0.08364,  0.04707,  0. 

11.  Step  2:  0 (exact  0),  0.0453  (0.0422),  0.0672  (0.0658),  0.0671  (0.0628),  0.0394 
(0.0373),  0 (0) 

13.  0.3301,  0.5706,  0.4522,  0.2380  (f  = 0.04),  0.06538,  0.10603,  0.10565,  0.6543 
(, t = 0.20) 

15.  0.1018,  0.1673,  0.1673,  0.1018  (r  = 0.04),  0.0219,  0.0355,  ■ ■ • (r  = 0.20) 


Problem  Set  21.7,  page  944 

1.  u(x,  1)  = 0,  -0.05,  -0.10,  -0.15,  -0.20,  0 

3.  For  x = 0.2,  0.4  we  obtain  0.24,  0.40  ( t = 0.2),  0.08,  0.16  (f  = 0.4), 

-0.08,  —0.16  (t  = 0.6),  etc. 

5.  0,  0.354,  0.766,  1.271,  1.679,  1.834,  ■■■(?=  0.1);  0,  0.575,  0.935,  1.135,  1.296, 
1.357,- ■■  ( t = 0.2) 

7.  0.190,  0.308,  0.308,  0.190,  (3S-exact:  0.178,  0.288,  0.288,  0.178) 
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17.  y = ex,  0.038,  0.125  (errors  of  y5  and  y10) 

19.  y = tanx;  0 (0),  0.10050  (-0.00017),  0.20304  (-0.00033),  0.30981  (-0.00048), 
0.42341  (-0.00062),  0.54702  (-0.00072),  0.68490  (-0.00076), 

0.84295  (-0.00066),  1.0299  (-0.0002),  1.2593  (0.0009),  1.5538(0.0036) 
21.0.1003346(0.8  • 10-7)  0.2027099  (1.6  • 10-7),  0.3093360(2.1  • 10-7), 
0.4227930(2.3  • 10-7),  0.5463023  (1.8  • 10-7) 

23.  y = sinx,  Vo.8  = 0.717366,  >’i  o = 0.841496  (errors  —1.0  ■ 10-5, 

-2.5  • 10-5) 

25.  y[  = >'2,  y2  = *zyi,  y = yi  = l,  l,  l,  l.oooi,  1.0006, 1.002 

27.  y{  = y2,  y2  = 2ex  — y\,  y = ex  — cos  x,  y = y\  = 0,  0.241,  0.571,  ■ • • ; 

errors  between  10~6  and  10-5 
29.  3.93,  15.71,  58.93 

31.  0,  0.04,  0.08,  0.12,  0.15,  0.16,  0.15,  0.12,  0.08,  0.04,  0 (t  = 0.3.  3 time  steps) 

33.  u(Pu)  = u (P31)  = 270,  u(P2 1)  = u(P13)  = u(P23 ) = u(P33 ) = 30, 
u(P12)  = u{P3Z)  = 90,  u(P2 2)  = 60 

35.  0.043330,  0.077321,  0.089952,  0.058488  (t  = 0.04),  0.010956,  0.017720,  0.017747, 
0.010964  (t  = 0.20) 


Problem  Set  22.1,  page  953 

3.  /(x)  = 2(Xl  - 1 f + (x2  + 2 f - 6;  Step  3:  (1.037,  -1.926),  value  -5.992 

9.  Step  5:  (0.11247,  -0.00012),  value  0.000016 

Problem  Set  22.2,  page  957 
7.  No 

9.  x3 , X4  is  the  unused  time  on  M 1,  M2,  respectively. 

11.  /( 2.5,  2.5)  = 100 
13.  /Hf>f)  = 198  g 
15.  /( 9,  6)  = 360 

17.  0.5x4  + 0.75x2  = 45  (copper),  0.5xi  + 0.25x2  = 30,/=  120x4  + 100x2, 

/max  =/( 45,30)  = 8400 

19.  / = x 4 + x2,  2x4  + 3x2  g 1200,  4x4  + 2x2  ^ 1600,  /max  = /( 300,  200)  = 500 
21.  xi/3  + x2/2  g 100,  xi/3  + x2/6  g 80,/=  150xi  + 100x2,/max  =/(210,  60)  = 
37,500 


Problem  Set  22.3,  page  961 

3.  /( 120/1 1,60/11)  = 480/11 

5.  Eliminate  in  Column  3,  so  that  20  goes./mjn  =/( 0,  2)  = —10. 

« /■  r,60  a 1500  a\  2200 

••  /max  — J\ 21 » u’  105  > — 7 

9-  /max  = 6 on  the  segment  from  (3,  0,  0)  to  (0,  0,  2) 

11.  We  minimize!  The  augmented  matrix  is 


1 

1.8 

2.1 

0 

0 

0 

To  = 

0 

15 

30 

1 

0 

150 

0 

600 

500 

0 

1 

3900 
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The  pivot  is  600.  The  calculation  gives 


1 

0 

6 

10 

o - 

3 

1000 

117 

10 

Row  1 

600  R°w  3 

Ti  = 

0 

0 

35 

2 

i 

1 

40 

105 

2 

Row  2 

- TOO  Row  3 

0 

600 

500 

0 

1 

3900_ 

Row  3 

The  next  pivot  is 

The  calculation 

gives 

1 

0 

0 

6 

175 

3 

1400 

27 

2 

Row  1 

— Row  2 

t2  = 

0 

0 

35 

2 

1 

1 

40 

105 

2 

Row  2 

0 

600 

0 

200 

7 

12 

7 

2400 

Row  3 

1000  r, 

35  Row 

Hence  —/has  the  maximum  value  —13.5,  so  that/has  the  minimum  value  13.5,  at 
the  point 


(x1,x2) 

13.  /max  =/( 5,4,  6)  = 478 


2400  105/2\ 
600’  35/2/ 


(4,  3). 


Problem  Set  22.4,  page  968 

1.  /( 6,  3)  = 84 
3.  /( 20,  20)  = 40 
5.  /(10,  5)  = 5500 
7./(l,  1,0)  = 13 
9.  /( 4,  0,  h)  = 9 


Chapter  22  Review  Questions  and  Problems,  page  968 

9.  Step  5:  [0.353  -0.028]7  Slower.  Why? 

11.  Of  course!  Step  5:  [-1.003  1.897]7 

17.  /( 2,  4)  = 100 
19.  /( 3,  6)  = -54 


Problem  Set  23.1,  page  974 


13. 


0 

0 

1 

0 

0 

1 


1 

0 

0 

1 

0 

1 


0 111 
0 0 0 0 

11. 

10  0 0 
_0  0 0 0 

15.©  © 


® © 
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17.  If  G is  complete. 

Edge 


19. 


x 

o 

tn 

<D 

> 


el 

«2 

<?3 

e4 

"-1 

-1 

1 

-f 

1 

0 

0 

0 

0 

1 

-1 

0 

0 

0 

0 

1 

Problem  Set  23.2,  page  979 
1.  5 3.  4 

5.  The  idea  is  to  go  backward.  There  is  a adjacent  to  Vp  and  labeled  fc  — 1,  etc. 
Now  the  only  vertex  labeled  0 is  s.  Hence  A(u0)  = 0 implies  u0  = s,  so  that 
Vo  — i>i  — ■ • • — — Ufe  is  a path  s^vp  that  has  length  k. 

15.  Delete  the  edge  (2,  4). 

17.  No 


Problem  Set  23.3,  page  983 

1.  (1,  2),  (2,  4),  (4,  3);  L2  = 12,  L3  = 36,  L4  = 28 

5.  (1,  2),  (2,  4),  (3,  4),  (3,  5);  L2  = 2,  L3  = 4,  L4  = 3,  L5  = 6 

7.  (1,  2),  (2,  4),  (3,  4);  L2  = 10,  L3  = 15,  L4  = 13 

9.  (1,  5),  (2,  3),  (2,  6),  (3,  4),  (3,  5);  L2  = 9,L3  = 7,  L4  = 8,  L5  = 4,  L6  = 14 

Problem  Set  23.4,  page  987 

2\ 

1.  ;4  - 3 - 5 L=  10 

r 

3.  5 - 3 - 6 ( L = 17 

2-4 

2 

5.  1 ^ 3 L=12 

xa/ 


9.  Yes 

2 

11.  1 - 3 - 4 ( L = 38 

5-6 

13.  New  York-Washington-Chicago-Dalles-Denver-Los  Angeles 
15.  G is  connected.  If  G were  not  a tree,  it  would  have  a cycle,  but  this  cycle  would 
provide  two  paths  between  any  pair  of  its  vertices,  contradicting  the  uniqueness. 
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19.  If  we  add  an  edge  (w,  u)  to  T,  then  since  T is  connected,  there  is  a path  u —>  v in  T 
which,  together  with  (u,  v),  forms  a cycle. 

Problem  Set  23.5,  page  990 

1.  If  G is  a tree. 

3.  A shortest  spanning  tree  of  the  largest  connected  graph  that  contains  vertex  1 . 

7.  (1,  4),  (1,  3),  (1,  2),  (2,  6),  (3,  5);  L = 32 
9.  (1,4),  (4,  3),  (4,  2),  (3,  5);  L = 20 
11.  (1,4),  (4,  3),  (4,  5),  (1,2);  L = 12 

Problem  Set  23.6,  page  997 

1.  {3,6},  11  + 3 = 14 

3.  {4,5,6},  10  + 5 + 13  = 28 

5.  {3,6,7},  8 + 4 + 4 = 16 
7.  S = {1,4},  8 + 6 = 14 

9.  One  is  interested  in  flows  from  s to  t,  not  in  the  opposite  direction. 

13.  A12  = 5,  A24  = 8,  A45  = 2;  A12  = 5,  A25  = 3;  A13  = 4,  A35  = 9 

Pi:  1 — 2 « 4 — 5,  A/  = 2;  P2:  1 - 2 - 5,  A f =3;  P3:  1 - 3 - 5,  A f = 4 

15.  1 - 2 - 5,  A/  = 2;  1 - 4 - 2 - 5,  A/  = 2,  etc. 

17./l3=/35  = 8,  /l4=/45  = 5,  /l2  = /24  = /46  = 4,  /56  = 13,  / = 4 + 13  = 17, 
/ = 17  is  unique. 

19.  For  instance,  f12  = 10,  /24  =/45  = 7,  /i3  = /25  = 5,  /3s  = 3,  /32  = 2, 

/ = 3 + 5 + 7 = 15,  /=  15  is  unique. 

Problem  Set  23.7,  page  1000 

3.  (2,  3)  and  (5,  6) 

5.  By  considering  only  edges  with  one  labeled  end  and  one  unlabeled  end 

7.  1-2  — 5,  At  = 2;  1 - 4 - 2 - 5,  At  = 1;  / = 6 + 2+1=  9,  where  6 is 

the  given  flow 

9.  1 — 2 — 4 — 6,  At  = 2;  1 - 3 - 5 - 6,  At  = 1;  /=4  + 2+  l=7,  where  4 

is  the  given  flow 

15.  5 = {1,2,4,  5},  T={3,  6},  cap (S,  T)  = 14 

Problem  Set  23.8,  page  1005 

1.  No  3.  No 

5.  Yes,  S = {1,4,  5,  8} 

7.  Yes,  5=  {1,3,  5}  11.1-2-3-7-5-4 

13.  1 — 2 — 3 — 7 — 5 — 4 is  augmenting  and  gives  1— 2 — 3 — 7 — 5— 4 and  (1,  2), 
(3,  7),  (5,  4)  is  of  maximum  cardinality. 

15.  1— 4 — 3 — 6 — 7 — 8 is  augmenting  and  gives  1— 4 — 3 — 6 — 7 — 8 and 
(1,4),  (3,  6),  (7,  8)  is  of  maximum  cardinality. 

19.  3 21.  2 

23.  3 25. 
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"o  0 1 f 


11. 


0 

1 

1 


13.  To  vertex  1 
From  vertex 


2 

1 

0 

1 


3 

0 

1 

0 


Vertex 

Incident  Edges 

1 

(1,2),  (1,4) 

2 

(2,  1),  (2,  4) 

3 

(3,  4) 

4 

(4,  1),  (4,  2),  (4,  3) 

0 


19.  (1,  2),  (1.  4).  (2,  3);  L2  = 2,  L3  = 5,  L4  = 5 
23.  (1,  6),  (4,  5),  (2,  3),  (7,  8) 


Problem  Set  24.1,  page  1015 

1.  ~ 19,  cjm  — 20,  qu  — 20.5  3.  qu  — 138,  q^  — 144,  qu  — 154 

5.  qu  ~ 199,  ~ 201,  qu  — 201  7.  qu  — 1.3,  qM  — 1.4,  qu  — 1.45 

9.  qL  = 89.9,  qM  = 91.0,  qu  = 91.8  11.  x = 19.875,  ,y  = 0.835,  IQR  = 1.5 
13.  i = 144.67,  5 = 8.9735,  IQR  = 16  15.  x = 1.355,  ^ = 0.136,  IQR  = 0.15 
17.  3.54,  1.29 


Problem  Set  24.2,  page  1017 

1.  23  outcomes:  RRR,  RRL , RLR,  LRR,  RLL,  LRL,  LLR,  LLL 

3.  62  = 36  outcomes  (1,  1),  (1,  2),  • • • , (6,  6),  first  number  (second  number)  referring 
to  the  first  die  (second  die) 

5.  Infinitely  many  outcomes  H TH  TTH  TTTH  • ■ ■ {H  = Head,  T = Tail ) 

7.  The  space  of  ordered  pairs  of  numbers 

9.  10  outcomes:  D ND  NND  ■ ■ ■ NNNNNNNNND 

11.  Yes 

17.  A U B = B implies  A Q B by  the  definition  of  union.  Conversely.  A C B implies 
that  A U B = B because  always  B C A U B,  and  if  ,4  C B,  we  must  have  equality 
in  the  previous  relation. 
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Problem  Set  24.3,  page  1024 


1.  1 - 4/216  = 98.15%,  by  Theorem  1 

3.  (a)  0.93  = 72.9%,  (b)  ^ •§§•§§  = 72.65% 

= 8 


7.  Small  sample  from  a large  population  containing  many  items  in  each  class  we  are 


interested  in  (defectives  and  nondefectives,  etc.) 


Q 498  497  496  495  494  __ 

500  ' 499  ‘ 498  * 497  ' 496  ~ 


0.98008 


11-  W 200  ’ m99  ~~  24.874%,  (b)  200 
(a)  + (b)  + (c)  = 1.  Why? 

13.  1 - 0.963  = 11.5% 


too  , too  too 
199  "r  200  ' 199 


50.25%,  (c)  same  as  (a). 


15.  1 - 0.8754  = 0.4138  < 1 - 0.752  = 0.4375  <0.5  (c  < b < a) 

17.  A = B U (An/?c),  hence  P(A)  = P(B ) 4-  P(Ar\Bc)  §g  P(B)  by  disjointedness  of  B 
and  ADBC 


Problem  Set  24.4,  page  1028 
1.  In  10!  = 3,628,800  ways 

*>2  1 4 3 2 1 4^3  2 1 2 ^ 4!2!  _ 2 1 _ J_ 

■^•6*5*4*3’2*1  6 * 5 ' 4 ' 3 * 2 ' 1 6!  — 6 ’ 5 — 15 

5.  ( '3°)  (2)  (2)  = 18,000  7.  210,  70,  112,  28 

9.  In  6!/6  = 120  ways  11.  9 • 8 = 72 

13.  (b)  1/ (12m) 

15.  P (No  two  people  have  a birthday  in  common)  = 365  • 364  ■ • - 346/36520  = 0.59. 
Answer:  41%,  which  is  surprisingly  large. 

Problem  Set  24.5,  page  1034 

1-  k = ,4  by  (6) 

3.  k = \ by  (10),  P( 0 g X g 2)  = | 

5.  No,  because  of  (6) 

7.  k = jqo  because  of  (6)  and  1 + 8 + 27  + 64  = 100 
9.  k = 5;  50% 

11.  0.53  = 12.5% 

13.  Fix)  = 0 if  x < -1,  F(x)  = \(x  + l)2  if  -1  g x < 0 
F(x)  = 1 - \(x  - l)2  if  0 g x < 1,  F(x)  = 1 if  x =s  1 
Answer:  500  cans,  P = 0.125,  0 
15.  X > b,  X g b,  X < c,  X g c,  etc. 


Problem  Set  24.6,  page  1038 

1 , _ 1 4 2 2 

1.  k — 2’  B ~ 3’°"  — 9 

r _ 1 2 _ 1 

5.  p-  — 4 , cr  — i6 
9. 750,  1,  0.002 

13.  $643.50 


3.  p = 77,  cr2  = 772/3;  cf.  Example  2 
7.  C = |,  p.  = 2,  o-2  = 4 
11.  c = 0.073 
15.|,i,(X-i)V20 


17.  X = Product  of  the  2 numbers.  E(X)  = 12.25,  12  cents 
19.  (0  + 1 • 3 + 3 • 8 + 1 • 27)/ 8 = 54/8  = 6-75 


A60 


App.  2 Answers  to  Odd-Numbered  Problems 


Problem  Set  24.7,  page  1044 

3.  38% 

5.  (5)  0.55,  0.03125,  0.15625,  1 -/( 0)  = 0.96875,  0.96875 
7.  0.265 

9.  f{x)  = Q.5xe~03/x\,  /(0)  +/(1)  = e_a5(1.0  + 0.5)  = 0.91.  Answer:  9% 

11.  13|% 

13.  42%,  47.2%,  10.5%,  0.3% 

15.  1 - e~02  = 18% 

Problem  Set  24.8,  page  1050 

1.  0.1587,  0.5,0.6915,0.6247 
5.  15.9% 

9.  About  58% 

13.  About  683  (Fig.  521a) 

Problem  Set  24.9,  page  1059 
, 1 JL  3 ^ 2 1 1 

A-  8’  16’  8 9’  9’  2 

5./2OO  = 1/032  - a2)  if  a2  < y < fi2 
7.  27.45  mm,  0.38  mm 

11.  25.26  cm,  0.0078  cm  13.  50% 

15.  The  distributions  in  Prob.  17  and  Example  1 

17.  No 

Chapter  24  Review  Questions  and  Problems,  page  1060 

11.  Ql=  110,  0m  = 112,01/=  115 
13.  x = 111.9,  s = 4.0125,  s2  = 16.1 
21.  xmin  Si  Xj  Si  xmax.  Sum  over  j from  1. 

17. 3c  = 6,  5 = 3.65 

19.  fix)  = (5v°)0.03a:0.975O_x  = \.5xe~15/x\ 

21. fix)  = 2~x,x  = 1,2,  •••  23.1,| 

25.0.1587,  0.6306,  0.5,  0.4950 

Problem  Set  25.2,  page  1067 

n 

1.  In  Example  1,  /r  = 0 so  ^ xj  = 0.  c)  In  i/di  = 0 and  cr2  is  as  before. 

j=i 

3.  i = e~n>J‘iJLUl+  +Xn>/ix  1!  • • ■ xn\),  <5  In  i/dfji  = —n  + (a_i  + • ■ • + xn)/ [jl  = 0, 
n\x  = nx,  ft  = x = 15.3 

5.  / = pki  1 — p)n~k,  p = k/n , k = number  of  successes  in  n trails 
7.  7/12 

9.  I = f = pi\  — p)x~1,  etc.,  p = \/x 
11.  9 = n/2  x,j  = 1/x 

13 .6  = 1 

15.  Variability  larger  than  perhaps  expected 


3.  45.065,  56.978,  2.022 
7.  31.1%,  95.4% 

11.  t = 1084  hours 
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Problem  Set  25.3,  page  1077 
3.  Shorter  by  a factor  V2 


5.  4,  16 


7.  c = 1.96,  x = 126,  s2  = 126  • 674/800  = 106.155,  k = cs/Vn  = 0.714, 
CONF0.95{  125.3  126.7},  CONF0.95{  0.1566  0.1583} 

9.  CONF0.99{  63.72  66.28} 

11.  n - 1 = 5,  F(c ) = 0.995,  c = 4.03,  x = 9533.33,  s2  = 49,666.67, 
k = 366.66  (Table  25.2),  CONF0.99{9166.7  g/ig  9900} 

13.  CONFo.95{0.023  g o-2  g 0.085} 

15.  n — 1 = 99  degrees  of  freedom.  F(ci)  = 0.025,  Ci  = 74.2,  U(c2)  = 0.975, 
c2  = 129.6.  Hence  kx  = 12.41,  yt2  = 7.10.  CONF0.95  {7.10  g a2  g 12.41}. 

17.  CONF0 95{0.74  g o-2  g 5.19} 

19.  Z = X + K is  normal  with  mean  105  and  variance  1.25. 

Answer:  F(104  g Z g 106)  = 63% 

Problem  Set  25.4,  page  1086 

3.  t = (0.286  - 0)/(4.31/V7)  = 0.18  < c = 1.94;  accept  the  hypothesis. 

5.  c = 6090  > 6019:  do  not  reject  the  hypothesis. 

7.  cr2/n  = 1.8,  c = 57.8,  accept  the  hypothesis. 

9.  jju  < 58.69  or  /a  > 61.31 

11.  Alternative  /a  A 5000,  t = (4990  - 5000)/(20/ V50)  = -3.54  < c = -2.01 
(Table  A9,  Appendix  5).  Reject  the  hypothesis  /a  = 5000  g. 

13.  Two-sided,  t = (0.55  - 0)/V0.546/8  = 2.11  < c = 2.37  (Table  A9,  Appendix  5), 
no  difference 

15.  19  • 1.02/0.82  = 29.69  < c = 30.14  (Table  A10.  Appendix  5),  accept  the 


17.  By  (12),  1 0 = vT6(20.2  — 19.6)/V0.16  + 0.36  > c = 1.70.  Assert  that  B is  better. 

Problem  Set  25.5,  page  1091 

1.  LCL  = 1 - 2.58  • 0.02/2  = 0.974,  UCL  = 1.026 
3.  27 

5.  Choose  4 times  the  original  sample  size 
9.  2. 58  V0. 0004/ V2  = 0.036,  LCL  = 3.464,  UCL  = 3.536 
11.  LCL  = np  — 3 Vnp(l  — p ),  CL  = np,  UCL  = np  + 3Vnp(l  — p) 

13.  In  about  30%  (5%)  of  the  cases 

15.  LCL  = /a  — 3 V/a  is  negative  in  (b)  and  we  set  LCL  = 0,  CL  = /a  = 3.6, 

UCL  = /a  + 3 V/a  = 9.3. 

Problem  Set  25.6,  page  1095 

1.  0.9825,  0.9384,  0.4060  3.  0.8187,  0.6703,  0.1353 

5.  e-25e(i  + 250),  P(A\  1.5)  = 94.5,  a = 5.5%  7.  19.5%,  14.7% 

9.  (1  - 0)n  + n0{  1 - 0)”-1  11.  (1  - ^)3  + 3 • |(1  - |)2  = 


hypothesis 


15.  (1  - Of,  [0(1  - 0)5-1]'  = 0,  0 = g,  AOQL  = 6.7% 
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Problem  Set  25.7,  page  1099 

3.  Xo  = (40  - 50)2/ 50  + (60  - 50)2/50  = 4 > c = 3.84;  no 

5.  xo  = is  > 11-07;  yes 

7.  xo  = 10.264  < 11.07;  yes 

9.  42  even  digits,  accept. 

, (355  - 358. 5)2  (123  - 119.5)2 

13'  *>  = 3581 + TK5 ’ 0137  < C ’ 3 84  (1  degf“  °f 

freedom,  95%) 

15.  Combining  the  last  three  nonzero  values,  we  have  K — r — 1 = 9 (r  = 1 since  we 
estimated  the  mean,  '^eoa4  ~ 3.87).  xo  = 12.8  < c = 16.92.  Accept  the  hypothesis. 

Problem  Set  25.8,  page  1102 

3.  (|)8  + 8 ■ (g)8  = 3.5%  is  the  probability  that  7 cases  in  8 trials  favor  A under  the 
hypothesis  that  A and  B are  equally  good.  Reject. 

5.  (|)18(1  + 18  + 153  + 816)  = 0.0038 

1.x  = 9.61,  s = 11.87.  t0  = 9. 67/(1 1.87/ Vl5)  = 3.16  > c = 1.76  (a  = 5%). 
Hypothesis  rejected. 

9.  Hypothesis  jl  = 0.  Alternative  jl  > 0,  x = 1.58, 

t = VTO  • 1.58/1.23  = 4.06  > c = 1.83  (a  = 5%).  Hypothesis  rejected. 

11.  Consider  w = x.;  — jl0. 

13.  n = 8;  4 transpositions,  I AT  g4)  = 0.007.  Assert  that  fertilizing  increases  yield. 

15.  P(T  §2)  = 2.8%.  Assert  that  there  is  an  increase. 

Problem  Set  25.9,  page  1111 

1.  y = 0.98  + 0.495.V  3.  y = -11,457.9  + 43 .2x 

5.  y = -10  + 0.55;t  7.  y = 0.5932  + 0.1138 x,R  = 1/0.1138 

9.  y = 0.32923  + 0.00032x,  v(66)  = 0.35035 

13.  c = 3.18  (Table  A9),  k±  = 43.2,  q0  = 54,878,  K = 1.502, 

CONF0.95!41.7  gKjg  44.7}. 

15.  y - 1.875  = 0.067(jc  - 25),  3s2  = 500,  q0  = 0.023,  K = 0.021, 

CONFo.95{0.046  g Kl  g 0.088} 
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15.  jl  = 20.325,  a2  = ( l)s 2 = 3.982  17.  CONF0.99{27.94  g/ig  34.81} 

19.  c = 14.74  > 14.5,  reject  ^0;  $((14.74  - 14.50)/  VO025)  = 0.9353 
21.  2.58  • V0. 00024/ V2  = 0.028,  LCL  = 2.722,  UCL  = 2.778 
23.  a = 1 - (1  - Of  = 5.85%,  when  0 = 0.01.  For  0 = 15%  we  obtain 
/3  = (1  — 0)6  = 37.7%.  If  n increases,  so  does  a,  whereas  /3  decreases. 

25.  y = 3.4  - 1.85x 


APPENDIX  3 
Auxiliary  Material 


A3.1  Formulas  for  Special  Functions 


For  tables  of  numeric  values,  see  Appendix  5. 


Exponential  function  ex  (Fig.  545) 


e = 2.71828  18284  59045  23536  02874  71353 

(1)  exev  = ex+v,  ex/ev  = ex~v,  ( ex)v  = exv 

Natural  logarithm  (Fig.  546) 

(2)  In  ( xy ) = In  x + In  y.  In  (x/y)  = In  x — In  y,  In  (x“)  = a lnx 

In  x is  the  inverse  of  ex,  and  eln  x = x,  e~ln  x = eln  (1/x)  = l/x. 

Logarithm  of  base  ten  log10x  or  simply  log  x 

(3)  log  x = M lnx,  M = log  e = 0.43429  44819  03251  82765  11289  18917 

1 1 

(4)  lnx  = — log x,  — = In  10  = 2.30258  50929  94045  68401  79914  54684 

M M 

log  x is  the  inverse  of  10x,  and  10Iog  x = x,  10_log  x = l/x. 

Sine  and  cosine  functions  (Figs.  547,  548).  In  calculus,  angles  are  measured  in  radians, 
so  that  sin  x and  cos  x have  period  2tt. 

sinx  is  odd,  sin  (—  x)  = — sinx,  and  cosx  is  even,  cos  (— x)  = cosx. 


Fig.  545.  Exponential  function  ex 
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Fig.  547.  sin  x Fig.  548.  cos  x 


(5) 

(6) 

(7) 

(8) 

(9) 

(10) 

(11) 


(12) 


1°  = 0.01745  32925  19943  radian 
1 radian  = 57°  17'  44.80625" 

= 57.29577  95131° 
sin2x  + cos2x  = 1 
sin  (x  + y)  = sin  x cos  y + cos  x sin  y 

sin  (x  — y)  = sin  x cos  y — cos  x sin  y 

cos  (x  + y)  = cos  x cos  y — sin  x sin  y 

cos  (x  — y)  = cos  x cos  y + sin  x sin  y 

sin  2x  = 2 sin  x cos  x,  cos  2x  = cos2  x — sin2  x 


' 

/ TT  \ 

sinx  = cos 

(‘-2, 

7t\ 

| = COS 

(2'*) 

1 TT  \ 

cos  x = sin  | 

rv 

= sin  | 

J-*) 

sin  (tt  — x)  = sin  x. 


cos  (tt  — x)  = —cos  x 


cos2x  = |(1  + cos  2x),  sin2x  = g(l  — cos  2x) 

sinx  sin  y = |[— cos  (x  + y)  + cos  (x  — y)] 

cos  x cos  y = \ [cos  (x  + y)  + cos  (x  — y)] 

sin  x cos  y = |[sin  (x  + y)  + sin  (x  — y)] 

u + v u — v 
sin  u + sin  v = 2 sin cos  ■ 


u + v 

cos  n + cos  v = 2 cos cos 


u + v 

cos  v — cos  n = 2 sin sin 


2 

u — v 
2 

u — v 


, ^ sin  8 

(13)  A cosx  + B sinx  = V42  + B2  cos  (x  ± 8),  tan  8 = 

(14)  A cos  x + B sin  x = Va2  + B2  sin  (x  ± 5),  tan  8 = 


cos  8 
sin  8 


cos  8 


B 

A 

A 

B 
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y 


i 


2k  x 


Fig.  549.  tan  x 


Tangent,  cotangent,  secant,  cosecant  (Figs.  549,  550) 


(15)  tanx  = 


sin  x 


cot  x = 


cos  x 


secx  = 


esc  x = 


cos  x 


sin  x 


cos  x 


sin  x 


(16) 


tan  (x  + y)  = 


tan  x + tan  v 


tan  (x  — >’)  = 


tan  x — tan  y 


1 - tan  x tan  y ’ JJ  1 + tan  x tan  y 

Hyperbolic  functions  (hyperbolic  sine  sinhx,  etc.;  Figs.  551,  552) 


(17) 

(18) 

(19) 

(20) 
(21) 


sinh.t  = |(ex  — e x). 


sinh  x 

tanhx  = , 

cosh  x 


coshx  = \{ex  + e x) 


cosh  x 

cothx  = 

sinh  x 


coshx  + sinhx  = ex. 


coshx  — sinhx  = e 


cosh2  x — sinh2  x = 1 


sinh2x  = 5 (cosh  2x  — 1),  cosh2x  = |(cosh2x  + 1) 
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(22) 


(23) 


(sinh  (x  ± y)  = sinh  x cosh  y ± cosh  x sinh  y 
cosh  (x  ± y)  = cosh  x cosh  y ± sinh  x sinh  y 


tanh  (x  ± y) 


tanhx  ± tanh  y 
1 ± tanh  x tanh  y 


Gamma  function  (Fig.  553  and  Table  A2  in  App.  5).  The  gamma  function  r(a)  is  defined 
by  the  integral 


OO 

dt  (a  > 0), 

which  is  meaningful  only  if  a > 0 (or,  if  we  consider  complex  a,  for  those  a whose  real 
part  is  positive).  Integration  by  parts  gives  the  important  functional  relation  of  the  gamma 
function, 

(25)  T(a  + 1)  = aT(a). 

From  (24)  we  readily  have  F(  1 ) = 1;  hence  if  a is  a positive  integer,  say  k,  then  by 
repeated  application  of  (25)  we  obtain 

(26)  r (k  + 1)  = k\  (k  = 0,  1,  ■ ■ ■)■ 


(24) 


r(«)  = [ 


This  shows  that  the  gamma  function  can  be  regarded  as  a generalization  of  the  elementary 
factorial  function.  [Sometimes  the  notation  (a  — 1)!  is  used  for  F(a),  even  for  noninteger 
values  of  a,  and  the  gamma  function  is  also  known  as  the  factorial  function.] 

By  repeated  application  of  (25)  we  obtain 


r(«) 


r(a  + 1) 

a 


T(a  + 2) 
a(a  + 1) 


r(a  + k + 1) 

a(a  + l)(a  + 2)  • ■ • (a  + k) 


Fig.  553.  Gamma  function 
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and  we  may  use  this  relation 


(27) 


n«) 


r(a  + k + 1) 

a(a  + 1)  • • ■ (cr  + k) 


C a A 0,  -1,  -2,  ■ ■ •), 


for  defining  the  gamma  function  for  negative  a (A  —1,  —2,  • • ■),  choosing  for  k the 
smallest  integer  such  that  a + k + 1 > 0.  Together  with  (24),  this  then  gives  a definition 
ofT(a)for  all  a not  equal  to  zero  or  a negative  integer  (Fig.  553). 

It  can  be  shown  that  the  gamma  function  may  also  be  represented  as  the  limit  of  a 
product,  namely,  by  the  formula 


(28) 


F(a)  = lim 


a(a  + l)(a  + 2)  • • • (a  + n) 


(a  A 0,  -1,  • • •)■ 


From  (27)  or  (28)  we  see  that,  for  complex  a,  the  gamma  function  T(a)  is  a meromorphic 
function  with  simple  poles  at  a = 0,  — 1,  — 2,  ■ • ■ . 

An  approximation  of  the  gamma  function  for  large  positive  a is  given  by  the  Stirling 
formula 


(29)  r(a  + 1)  » V2^a 

where  e is  the  base  of  the  natural  logarithm.  We  finally  mention  the  special  value 

(30)  F(i)  = Vn. 


Incomplete  gamma  functions 

(31)  P(a,  x)  = f e-f*-1  dt, 

Jo 


Q(a,  x)  = f e tta  1 dt 

Jx 


(a  > 0) 


(32) 

Beta  function 

(33) 


T(a)  = P(a,  x)  + Q(a,  x) 


B(x,y)  = [ tx_1(l  - t)y- 1 dt 
Jo 


Representation  in  terms  of  gamma  functions: 


(x  > 0,  y > 0) 


(34) 


B(x,  y)  = 


rwroo 


T(x  + y ) 

Error  function  (Fig.  554  and  Table  A4  in  App.  5) 

2 


(35) 


(36) 


erf  x = 


77 


I 


e dt 


2 x3  xs 

erf  x = — 1=  I x — + — + — 

V^r  \ 1!3  2!5  3!7 


3!7  / 
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erf  (°°)  = 1,  complementary  error  function 


(37) 


erfc  x = 1 — erf  a = 


dt 


Fresnel  integrals1  (Fig.  555) 

(38)  C(x)  = f cos  (t2)  dt,  S(x)  = f sin  (t2)  dt 

Jo 

C(»)  = V 7t/8,  S(°°)  = V 7t/8,  complementary  functions 

Hn  r°° 

c(x)  = / — — C(x)  = J cos  (t2)  dt 


(39) 


s(x)  = / y - S(x)  = J sin  ( 1 2) 


dt 


Sine  integral  (Fig.  556  and  Table  A4  in  App.  5) 

~ sin  t 


(40) 


Si(x)  = f 


dt 


Fig.  555.  Fresnel  integrals 


^AUGUSTIN  FRESNEL  (1788-1827),  French  physicist  and  mathematician.  For  tables  see  Ref.  [GenRefl]. 
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Site) 

2 - 

1 - / 

I I I I I I I I I I 

0 5 10 

Fig.  556.  Sine  integral 


x 


Si(oo)  = 77/2,  complementary  function 


(41) 


si(x)  = — - Si(x)  = f 

Z 


sin  t 


dt 


Cosine  integral  (Table  A4  in  App.  5) 


(42) 


C 

ci(x)  = J 


cos  t 


dt 


(x  > 0) 


Exponential  integral 


(43) 


Logarithmic  integral 


(44) 


li(x) 


fx  dt 

0 Inf 


(x  > 0) 


A3. 1 Partial  Derivatives 

For  differentiation  formulas,  see  inside  of  front  cover. 

Let  z = fix,  y ) be  a real  function  of  two  independent  real  variables,  x and  y.  If  we  keep 
y constant,  say,  y = Vj,  and  think  of  x as  a variable,  then  fix,  y,)  depends  on  x alone.  If 
the  derivative  of  fix,  yj)  with  respect  to  x for  a value  x = x1  exists,  then  the  value  of  this 
derivative  is  called  the  partial  derivative  of  fix,  y)  with  respect  to  x at  the  point  (xlt  yx) 
and  is  denoted  by 


Other  notations  are 


dx 

tei.yi) 

or  by 

fx(Xl,  >’l) 

and 

dz 


dx 


tei,  yO 


zxixx,  Ji); 


these  may  be  used  when  subscripts  are  not  used  for  another  purpose  and  there  is  no  danger 
of  confusion. 
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EXAMPLE  1 


We  thus  have,  by  the  definition  of  the  derivative. 


(1) 


dx 


(xi,yO 


= lim 


f(x1  + Ax,  yj  ^ fjx-L,  yj 

Ax 


The  partial  derivative  of  z = f(x,  y ) with  respect  to  y is  defined  similarly;  we  now  keep 
x constant,  say,  equal  to  xlt  and  differentiate  /'(x, , y)  with  respect  to  y.  Thus 


(2) 


df 

dy 


(xi,yO 


dz 

dy 


(xi,yD 


= lim 

Ay—*0 


f(xi,  yi  + Ay)  - /(x1;  yq) 

Ay 


Other  notations  are  fy (xq , y, ) and  zy (x, , yy). 

It  is  clear  that  the  values  of  those  two  partial  derivatives  will  in  general  depend  on  the 
point  (x1;  3^).  Hence  the  partial  derivatives  dz/dx  and  dz/dy  at  a variable  point  (x,  y)  are 
functions  of  x and  y.  The  function  dz/dx  is  obtained  as  in  ordinary  calculus  by 
differentiating  z = f(x,  y ) with  respect  to  x,  treating y as  a constant,  and  dz/dy  is  obtained 
by  differentiating  z with  respect  to  y,  treating  x as  a constant. 


Let  z — f(x,  ;y)  = xy  + x sin  y.  Then 


af 


dx 


y = 2xy  + sin  y, 


df  2 

~ — = x + x cos  y. 
dy 


The  partial  derivatives  dz/dx  and  dz/dy  of  a function  z = f(x,  y)  have  a very  simple 
geometric  interpretation.  The  function  z = f(x,  y)  can  be  represented  by  a surface  in 
space.  The  equation  y = y1  then  represents  a vertical  plane  intersecting  the  surface  in  a 
curve,  and  the  partial  derivative  dz/dx  at  a point  (x1;  y-,)  is  the  slope  of  the  tangent  (that 
is,  tan  a where  a is  the  angle  shown  in  Fig.  557)  to  the  curve.  Similarly,  the  partial 
derivative  dz/dy  at  (x-, , y, ) is  the  slope  of  the  tangent  to  the  curve  x = x1  on  the  surface 
2 = f(x,  y)  at  (xl5  yi). 


Fig.  557.  Geometrical  interpretation  of  first  partial  derivatives 
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EXAMPLE  2 


EXAMPLE  3 


The  partial  derivatives  dz/dx  and  dz/dy  are  called  first  partial  derivatives  or  partial 
derivatives  of  first  order.  By  differentiating  these  derivatives  once  more,  we  obtain  the 
four  second  partial  derivatives  (or  partial  derivatives  of  second  order)2 


(3) 


f XX 

~ fyx 
~ fxy 
~ f yy 


It  can  be  shown  that  if  all  the  derivatives  concerned  are  continuous,  then  the  two  mixed 
partial  derivatives  are  equal,  so  that  the  order  of  differentiation  does  not  matter  (see  Ref. 
[GenRef4]  in  App.  1),  that  is, 

d2z  d2z 

; dxdy  dy  dx 


For  the  function  in  Example  1 . 

fxx  = 2y>  fxy  = 2x  + cos  y = fyx,  fyy  = -x  sin  y. 

By  differentiating  the  second  partial  derivatives  again  with  respect  to  x and  y, 
respectively,  we  obtain  the  third  partial  derivatives  or  partial  derivatives  of  the  third 
order  of  /,  etc. 

If  we  consider  a function  f(x,  y,  z)  of  three  independent  variables,  then  we  have  the 
three  first  partial  derivatives  fx(x,  y,  z),  fy(x,  y,  z),  and  fz(x,  y,  z).  Here  fx  is  obtained  by 
differentiating  f with  respect  to  x,  treating  both  y and  z as  constants.  Thus,  analogous  to 
(1),  we  now  have 


df 

dx 


Oci,yi,zi> 


= lim 

Ax— >0 


f(x1  + A-Y,  yq,  zf)  - /fa,  ylt  Zy) 

Ax 


etc.  By  differentiating  fx,  fy,  fz  again  in  this  fashion  we  obtain  the  second  partial 
derivatives  of  /,  etc. 

Let  fix,  y,  z)  = x2  + y2  + z2  + xy  ez.  Then 


fx  = 2 x + y ez, 

fxx  = 2. 
fyy  = 2> 


fy  = 2y  + x ez, 

fxy  = fyx  — e ’ 
fyz  = fzy  — X e , 


fz  = 2z  + xy  ez, 

fxz  ~ f zx ' — y e * 
fzz  = 2 + xy  ez. 


2 CAUTION!  In  the  subscript  notation,  the  subscripts  are  written  in  the  order  in  which  we  differentiate, 
whereas  in  the  “3”  notation  the  order  is  opposite. 
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A3. 3 Sequences  and  Series 

See  also  Chap.  15. 

Monotone  Real  Sequences 

We  call  a real  sequence  xl7  x2,  ■ ■ ■ , xn,  ■ ■ • a monotone  sequence  if  it  is  either  monotone 
increasing,  that  is, 

V y ^ y ^ • • • 

A1  = x2  = x3  = 


or  monotone  decreasing,  that  is, 


y y ^ y ^ • • • 

xl  = x2  = x3  = 

We  call  a,,  a2,  ■ • • a bounded  sequence  if  there  is  a positive  constant  K such  that  \xn\  < K 
for  all  n. 


THEOREM  1 


If  a real  sequence  is  bounded  and  monotone,  it  converges. 


PROOF  Let  xl7  x2,  • • ■ be  a bounded  monotone  increasing  sequence.  Then  its  terms  are  smaller 
than  some  number  B and,  since  x1  = xn  for  all  n,  they  lie  in  the  interval  a-,  Si  xn  Si  B, 
which  will  be  denoted  by  I0.  We  bisect  70;  that  is,  we  subdivide  it  into  two  parts  of  equal 
length.  If  the  right  half  (together  with  its  endpoints)  contains  terms  of  the  sequence,  we 
denote  it  by  Ilm  If  it  does  not  contain  terms  of  the  sequence,  then  the  left  half  of  I0  (together 
with  its  endpoints)  is  called  f.  This  is  the  first  step. 

In  the  second  step  we  bisect  f,  select  one  half  by  the  same  rule,  and  call  it  /2,  and  so 
on  (see  Fig.  558). 

In  this  way  we  obtain  shorter  and  shorter  intervals  70,  /1;  72,  ■ • ■ with  the  following 
properties.  Each  Im  contains  all  /„  for  n > m.  No  term  of  the  sequence  lies  to  the  right 
of  7m,  and,  since  the  sequence  is  monotone  increasing,  all  xn  with  n greater  than  some 
number  N lie  in  7m;  of  course,  N will  depend  on  m,  in  general.  The  lengths  of  the  lm 
approach  zero  as  m approaches  infinity.  Hence  there  is  precisely  one  number,  call  it  L, 
that  lies  in  all  those  intervals,3  and  we  may  now  easily  prove  that  the  sequence  is 
convergent  with  the  limit  L. 

In  fact,  given  an  e > 0,  we  choose  an  m such  that  the  length  of  Im  is  less  than  e.  Then 
L and  all  the  xn  with  n > N(m)  lie  in  Im,  and,  therefore,  x,„  — L | < e for  all  those  n. 
This  completes  the  proof  for  an  increasing  sequence.  For  a decreasing  sequence  the  proof 
is  the  same,  except  for  a suitable  interchange  of  “left”  and  “right”  in  the  construction  of 
those  intervals. 


3This  statement  seems  to  be  obvious,  but  actually  it  is  not;  it  may  be  regarded  as  an  axiom  of  the  real  number 
system  in  the  following  form.  Let  J2,  • • • be  closed  intervals  such  that  each  Jm  contains  all  Jn  with  n > m, 
and  the  lengths  of  the  Jm  approach  zero  as  m approaches  infinity.  Then  there  is  precisely  one  real  number  that 
is  contained  in  all  those  intervals.  This  is  the  so-called  Cantor-Dedekind  axiom,  named  after  the  German 
mathematicians  GEORG  CANTOR  (1845-1918),  the  creator  of  set  theory,  and  RICHARD  DEDEKIND 
(1831-1916),  known  for  his  fundamental  work  in  number  theory.  For  further  details  see  Ref.  [GenRef2]  in  App.  1. 
(An  interval  / is  said  to  be  closed  if  its  two  endpoints  are  regarded  as  points  belonging  to  I.  It  is  said  to  be  open 
if  the  endpoints  are  not  regarded  as  points  of  I.) 
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THEOREM  2 


PROOF 


*1 

X2  *3 

1 1 

in  mill 

0 

B 

1 

T 

1 

k- 

^2 

Fig.  558.  Proof  of  Theorem  1 


Real  Series 

Leibniz  Test  for  Real  Series 

Let  x1,  x2,  ■ ' ■ be  real  and  monotone  decreasing  to  zero,  that  is, 

(1)  (a)  x1  §£  x2  = x3  §=  ■ • • , (b)  lim  xm  = 0. 

7n — >co 

Then  the  series  with  terms  of  alternating  signs 

x1  ~ x2  + x3  — x4  + — • ■ • 

converges,  and  for  the  remainder  Rn  after  the  nth  term  we  have  the  estimate 

(2)  |f^nl  — xn+ 1- 

Let  sn  be  the  nth  partial  sum  of  the  series.  Then,  because  of  (la), 

S 2 — Xl  x2  ' 

^3  = ^2  + *3  = 52>  $3  = S1  ~ (X2  ~ -*3)  — *1; 

so  that  s2  = s3  Si  .v-] . Proceeding  in  this  fashion,  we  conclude  that  (Fig.  559) 

(3)  h = h = h = 1 ’ ' § J6  = ?4  = ^2 

which  shows  that  the  odd  partial  sums  form  a bounded  monotone  sequence,  and  so  do  the 
even  partial  sums.  Hence,  by  Theorem  1,  both  sequences  converge,  say, 

lim  s2n+l  = s,  lim  s2n  = s*. 


Fig.  559.  Proof  of  the  Leibniz  test 
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Now,  since  s2n+1  — s2n  = x2n+1,  we  readily  see  that  (lb)  implies 

s - s*  = lim  s2n+1  - lim  s2n  = lim  (s2n+1  - s2n)  = lim  x2n+1  = 0. 

r&— » oo  n — >oo  n — >oo  n — >oo 

Hence  s*  = s,  and  the  series  converges  with  the  sum  .v. 

We  prove  the  estimate  (2)  for  the  remainder.  Since  sn  — » s,  it  follows  from  (3)  that 

■<*2n+ 1 ' ^ — ^2 n and  also  $2n—  1 = s = s2n. 

By  subtracting  s2n  and  s2n_1,  respectively,  we  obtain 

s2n+l  ~ s2n  = s ~ s2n  = 0,  0 = s ~ s2n- 1 = s2n  ~ s2n- 1- 

In  these  inequalities,  the  first  expression  is  equal  to  x2n+1 , the  last  is  equal  to  — x2n,  and 
the  expressions  between  the  inequality  signs  are  the  remainders  R2n  and  R2n-i-  Thus  the 
inequalities  may  be  written 

x2n+ 1 — ^2  n — 0>  0 = ^2n-l  — ~ x2  n 

and  we  see  that  they  imply  (2).  This  completes  the  proof. 


A3^  Grad,  Div,  Curl,  V2 

in  Curvilinear  Coordinates 


To  simplify  formulas,  we  write  Cartesian  coordinates  x = Xi,  y = x2,  z = x3.  We  denote 
curvilinear  coordinates  by  qlt  q2,  q3.  Through  each  point  P there  pass  three  coordinate 
surfaces  qx  = const,  q2  = const,  q3  = const.  They  intersect  along  coordinate  curves.  We 
assume  the  three  coordinate  curves  through  P to  be  orthogonal  (perpendicular  to  each 
other).  We  write  coordinate  transformations  as 


(1) 


*1  = *i(?i.  q2,  q3). 


x2  = x2(qi,  <?2 , q3). 


x3  = xs(qi,  q3,  qa)- 


Corresponding  transformations  of  grad,  div,  curl,  and  V2  can  all  be  written  by  using 

3 / \ 2 


(2) 


hf  = 


h \ 9 * i 


Next  to  Cartesian  coordinates,  most  important  are  cylindrical  coordinates  <7,  = r,  q2=  9 , 
q3  = z (Fig.  560a)  defined  by 


(3)  x1  = q1  cos  q2  = r cos  6,  x2  = qr  sin  q2  = r sin  9,  jc3  = q3  = z 
and  spherical  coordinates  qA  = r,  q2  = I).  qa  = <j)  (Fig.  560b)  defined  by4 


(4) 


x1  = qx  cos  q2  sin  q3  = r cos  0 sin  <fi, 

xa  = qi  cos  q3 


x2  = cli  sin  cl2  sin  q3  = r sin  9 sin  4> 
r cos  cj). 


4This  is  the  notation  used  in  calculus  and  in  many  other  books.  It  is  logical  since  in  it,  6 plays  the  same  role 
as  in  polar  coordinates.  CAUTION!  Some  books  interchange  the  roles  of  0 and  t) b. 
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(a)  Cylindrical  coordinates 

Fig.  560. 


(6)  Spherical  coordinates 

Special  curvilinear  coordinates 


In  addition  to  the  general  formulas  for  any  orthogonal  coordinates  qx,  q2,  q3,  we  shall  give 
additional  formulas  for  these  important  special  cases. 

Linear  Element  ds.  In  Cartesian  coordinates, 

els 2 = dx2  + dx2  + dx2  (Sec.  9.5). 

For  the  ^-coordinates, 

(5)  ds2  = hf  dq2  + h2  dq2  + h2  dq2. 

(5')  ds2  = dr2  + r2  dO2  + dz2  (Cylindrical  coordinates). 

For  polar  coordinates  set  dz2  = 0. 

(5")  ds2  = dr2  + r2  sin2  <f)  dO2  + r2  d<p2  (Spherical  coordinates). 

Gradient,  grad  / = V/  = [fx  , /.,2,  fx  J (partial  derivatives;  Sec.  9.7).  In  the 
t/-systcm,  with  u,  v,  w denoting  unit  vectors  in  the  positive  directions  of  the  qx,  q2,  q3 
coordinate  curves,  respectively, 


(6)  grad  / = V/ 
(6')  grad  / = V/ 

(6")  grad  / = V/ 

Divergence  div  F = 

(7)  div  F = V*F 

{!')  div  F = V • F 


1 df  1 df  13/ 

= — U + v + — w 

«i  dqi  h2  dq2  h3  dq3 


df  1 df  df 

Ui V + W 

dr  r dO  dz 


(Cylindrical  coordinates) 


df  13/1  df 

= u H . — — v H — w (Spherical  coordinates). 

dr  r sin  cp  dO  r dip 

V-F  = (Ff>Xi  + (F2)X2  + (F3)x,  (F  = [F,.  F2,  F3\,  Sec.  9.8); 


1 


h\h2h3 

1 3 


- — (h2h3Ff)  + - — (h3XF2)  + - — (MhzF-s) 

. dq i dq2  dq3 


1 dF2  d F-t 

(rF,)  + + — 

r dr  r 30  dz 


(Cylindrical  coordinates) 
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I d I dF2  Id 

(7")  divF  = V-F  = -5  — (r2F j)  + — — -f  + ——7  — (sin  cf>  F3) 
r dr  r sin  <p  r sin  tp  dtp 

(Spherical  coordinates). 

Laplacian  V2/  = V*V/  = div  (grad  /)  = fXiXi  + fx^  + (Sec.  9.8): 


1 

' 9 

( h2h3 

df  ] 

+ 5 

/ d3h\ 

df] 

+ 3 

^ h\h2 

df 

li1h2h3 

_ dqi 

\ K 

dqi  J 

dq2 

l h2 

dq2; 

dq3 

\ h3 

dq3  ) _ 

(8') 

(8") 


V2/  = 


d2f  i df  i d2f  _ a2/ 


dr2 


+ — 


+ -5  — 2 ^ 2 (Cylindrical  coordinates) 


r dr  r dO  dz 


d2f  _ 2 a/ 


V2/  - - — - 


a/" 


1 a2/  1 a2/  _ cot  4>  df 

d<[> 


r dr  r sin2  tf>  dOA  r2  dtf2 


+ 


Curl  (Sec.  9.9): 


(Spherical  coordinates). 


(9) 


curl  F = V X F = 


/?!  u 

h2\ 

h3  w 

1 

a 

a 

a 

h-Ji2h3 

dqi 

dq2 

dq3 

hi  F 1 

h2F2 

h3F3 

For  cylindrical  coordinates  we  have  in  (9)  (as  in  the  previous  formulas) 
/?!  = hr  = 1,  h2  = hg  = q1  = r,  h3  = hz  = 1 


and  for  spherical  coordinates  we  have 

/r1  = hr  = 1,  h2  = hs  = q 1 sin  q3  = r sin  <f>,  h3  = = q1  = r. 


P E N D I X 4 
Additional  Proofs 


Section  2.6,  page  74 

PROOF  OF  THEOREM  1 Uniqueness1 

Assuming  that  the  problem  consisting  of  the  ODE 

0)  y"  + p{x)y  + q(x)y  = 0 

and  the  two  initial  conditions 

(2)  y(x0)  = K0,  y'(x0)  = Kx 

has  two  solutions  yqix)  and  y2(x)  on  the  interval  I in  the  theorem,  we  show  that  their 
difference 

y(x)  = Xi(x)  - y2(x) 

is  identically  zero  on  I;  then  yi  = _y2  on  /,  which  implies  uniqueness. 

Since  (1)  is  homogeneous  and  linear,  y is  a solution  of  that  ODE  on  I,  and  since  y,  and 
y2  satisfy  the  same  initial  conditions,  y satisfies  the  conditions 

(11)  y{x0)  = o,  /(* 0)  = o. 

We  consider  the  function 

z(x)  = y(xf  + y\xf 

and  its  derivative 

z = 2 yy  +2  y y . 

From  the  ODE  we  have 

II  ! 

y = ~py  ~ qy- 

By  substituting  this  in  the  expression  for  z we  obtain 

(12)  z = 2 yy'  - 2 py'2  - 2qyy  . 

Now,  since  y and  y are  real, 

(y  ± y'f  = / ± 2yy'  + y'2  Z:  0. 


1This  proof  was  suggested  by  my  colleague.  Prof.  A.  D.  Ziebur.  In  this  proof,  we  use  some  formula  numbers 
that  have  not  yet  been  used  in  Sec.  2.6. 
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From  this  and  the  definition  of  z we  obtain  the  two  inequalities 

(13)  (a)  2 yy'  g/  + y'2  = z,  (b)  -2 yy'  + y'2  = z. 

From  (13b)  we  have  2 yy'  g — z.  Together,  \2yy'\  S z.  For  the  last  term  in  (12)  we  now 
obtain 

-2 qyy'  g \—2qyy'\  = \q\\2yy'\  g \q\z. 

Using  this  result  as  well  as  —p  g |^|  and  applying  (13a)  to  the  term  2 yy'  in  (12),  we  find 

Z g z + 2.\p\y'2  + \q\z. 

Since  y2  g/  + y'2  = z,  from  this  we  obtain 

z = (1  + 2\p\  + \q\)z 

or,  denoting  the  function  in  parentheses  by  h, 

(14a)  z = hz  for  all  x on  I. 


Similarly,  from  (12)  and  (13)  it  follows  that 


(14b) 


-Z  = ~2 yy'  + 2 py'2  + 2 qyy' 
g Z + 2\p\z  + \q\z  = hz. 


The  inequalities  (14a)  and  (14b)  are  equivalent  to  the  inequalities 
(15)  z ~ hz  ^0,  z!  + hz=  0. 


Integrating  factors  for  the  two  expressions  on  the  left  are 

= g-JWx)  dx  an[j  p _ ef Wx)  dx 

The  integrals  in  the  exponents  exist  because  h is  continuous.  Since  F1  and  F2  are  positive, 
we  thus  have  from  (15) 

F^(z  - hz)  = (F1z)'  g 0 and  F2(z'  + hz)  = ( F2z )'  § 0. 

This  means  that  Fpz  is  nonincreasing  and  F2z  is  nondecreasing  on  I.  Since  z(x())  = 0 by 
(11),  when  x g x0  we  thus  obtain 


FlZ  g (FlZ)Xo  = 0,  F2z  g (F2z)Xo  = 0 
and  similarly,  when  x x0, 

FlZ  g 0,  F2z  g 0. 

Dividing  by  F1  and  F2  and  noting  that  these  functions  are  positive,  we  altogether  have 

z g 0,  z = 0 

This  implies  that  z = y2  + y 2 = 0 on  I.  Hence  y = 0 or  y,  = y2  on  I. 


for  all  x on  I. 
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Section  5.3,  page  182 

PROOF  OF  THEOREM  2 Frobenius  Method.  Basis  of  Solutions.  Three  Cases 

The  formula  numbers  in  this  proof  are  the  same  as  in  the  text  of  Sec.  5.3.  An  additional 
formula  not  appearing  in  Sec.  5.3  will  be  called  (A)  (see  below). 

The  ODE  in  Theorem  2 is 


(1) 


b(x) 


c(x) 


y = o, 


where  b{x)  and  c(x)  are  analytic  functions.  We  can  write  it 
( 1 r)  x2y"  + xb{x)y'  + c(x)y  = 0. 

The  indicial  equation  of  (1)  is 

(4)  r(r  - 1)  + b0r  + c„  = 0. 

The  roots  rl5  r2  of  this  quadratic  equation  determine  the  general  form  of  a basis  of  solutions 
of  (1),  and  there  are  three  possible  cases  as  follows. 

Case  1.  Distinct  Roots  Not  Differing  by  an  Integer.  A first  solution  of  (1)  is  of  the  form 

(5)  JiW  ==  xTl{a0  + axx  + a2x2  + • ■ ■) 

and  can  be  determined  as  in  the  power  series  method.  For  a proof  that  in  this  case,  the 
ODE  (1)  has  a second  independent  solution  of  the  form 

(6)  y2(x)  = xr*{A0  + A±x  + A2x2  +■•■), 
see  Ref.  [All]  listed  in  App.  1. 

Case  2.  Double  Root.  The  indicial  equation  (4)  has  a double  root  r if  and  only  if 
(. b0  — l)2  — 4c0  = 0,  and  then  r = |(1  — b0).  A first  solution 

(7)  yi(x)  = xr  (a0  + a xx  + a2x2  +••■),  r = g(l  — b0), 

can  be  determined  as  in  Case  1.  We  show  that  a second  independent  solution  is  of  the 
form 

(8)  y2(x)  = yi(x)  lnx  + xr (Axx  + A2x2  + ■ ■ •)  (x  > 0). 

We  use  the  method  of  reduction  of  order  (see  Sec.  2.1),  that  is,  we  determine  u(x)  such 
that  y2(x)  = u(x)y1(x)  is  a solution  of  (1).  By  inserting  this  and  the  derivatives 

y2  = u y\  + uy i»  y^  = u yi  + yi  + uy1 

into  the  ODE  (lr)  we  obtain 

x2(u"y1  + 2 u'y[  + uy")  + xb(u'y1  + uy[)  + cuy1  = 0. 
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Since  Vi  is  a solution  of  (1 ;),  the  sum  of  the  terms  involving  u is  zero,  and  this  equation 
reduces  to 


x2yiu"  + 2x2y[u'  + xby^u  = 0. 


By  dividing  by  x2 y 1 and  inserting  the  power  series  for  b we  obtain 


" j.  n 3,1  4-  b 

u + | 2 — 4- 

yi  x 


0 4.  \ ’ 

C / 


= 0. 


Here,  and  in  the  following,  the  dots  designate  terms  that  are  constant  or  involve  positive 
powers  of  x.  Now,  from  (7),  it  follows  that 

y[  _ xr_1[ra0  + (r  + llcqx  + ■ ■ ■] 
y-i  xr[a0  + a±x  + • ■ ■] 


1 / ra0  + (r  + 1 )a1x  + ■ • • \ _ r 
x \ a0  + ape  + ■ • ■ f x 


Hence  the  previous  equation  can  be  written 


(A) 


" I 

u + 


2 r + b0 


= 0. 


Since  r = (1  — b0)l 2,  the  term  (2 r + b0)/x  equals  1/x,  and  by  dividing  by  u we  thus 
have 


u 


n 


U 


By  integration  we  obtain  In  u’  = — lnx  + ■ ■ ■ , hence  u'  = (l/x)e°  ' Expanding  the 
exponential  function  in  powers  of  x and  integrating  once  more,  we  see  that  u is  of  the  form 

u = In  x + k-^x  + k2x2  + • ■ ■ . 


Inserting  this  into  v2  = uy  1,  we  obtain  for  y2  a representation  of  the  form  (8). 


Case  3.  Roots  Differing  by  an  Integer.  We  write  r1  = r and  r2  = r — p where  p is  a 
positive  integer.  A first  solution 

(9)  JiW  = xTl(a0  + a±x  + a2x2  + ■ ■ ■) 

can  be  determined  as  in  Cases  1 and  2.  We  show  that  a second  independent  solution  is 
of  the  form 


(10)  y2(x)  = ky-iix)  lnx  + xr2(A0  + + A2x2  + • • ■) 


where  we  may  have  k A 0 or  k = 0.  As  in  Case  2 we  set  v2  = uyr.  The  first  steps  are 
literally  as  in  Case  2 and  give  Eq.  (A), 


" 1 

u + 


2 r + b0 


= 0. 
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Now  by  elementary  algebra,  the  coefficient  b0  — 1 of  r in  (4)  equals  minus  the  sum  of 
the  roots, 

K ~ 1 = “Oi  + r2)  = ~(r  + r - p)  = -2 r + p. 

Hence  2 r + b0  = p + 1,  and  division  by  u gives 


The  further  steps  are  as  in  Case  2.  Integrating,  we  find 
In  u = — (p  + 1)  In  x + ■ ■ ■ , thus 


, -Cp+l)  (•  • •) 

u = x e 


where  dots  stand  for  some  series  of  nonnegative  integer  powers  of  x.  By  expanding  the 
exponential  function  as  before  we  obtain  a series  of  the  form 

. 1 k-t  i k~ 

u = + — + •••  -f 2~  H 1-  kp+i  + kp+2x  + • • • . 

XP  Xy  X X 

We  integrate  once  more.  Writing  the  resulting  logarithmic  term  first,  we  get 

1 kp_1 


u = kv  In  x + . 

’ pxv 


Hence,  by  (9)  we  get  for  y2  = u\’\  the  formula 

1 


+ kp+1x  + • • ■ 


)■ 


y2  = In  x + xl  P 


1 ^ 
kP-  ix 


(fl0  + <*1*  + • ' •)■ 


But  this  is  of  the  form  (10)  with  k = kp  since  r1  — p = r2  and  the  product  of  the  two 
series  involves  nonnegative  integer  powers  of  x only. 


Section  7.7,  page  293 


THEOREM 


Determinants 

The  definition  of  a determinant 

flu 

a12 

Cl  In 

fl21 

a22 

a2n 

(7)  D = det  A = 

anl 

an2 

^nn 

as  given  in  Sec.  7.7  is  unambiguous,  that  is,  it  yields  the 
which  rows  or  columns  we  choose  in  the  development. 

same  value  of  D no  matter 
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PROOF 


In  this  proof  we  shall  use  formula  numbers  not  yet  used  in  Sec.  7.7. 

We  shall  prove  first  that  the  same  value  is  obtained  no  matter  which  row  is  chosen. 
The  proof  is  by  induction.  The  statement  is  true  for  a second-order  determinant,  for 
which  the  developments  by  the  first  row  flua2 2 + fli2(~ fl2i)  and  by  the  second  row 
a2i(— ^12)  + fl22fln  give  the  same  value  «na22  ~ «i2«2i-  Assuming  the  statement  to  be 
true  for  an  (n  — l)st-order  determinant,  we  prove  that  it  is  true  for  an  nth-order  determinant. 

For  this  purpose  we  expand  D in  terms  of  each  of  two  arbitrary  rows,  say,  the  /th  and 
the  /th,  and  compare  the  results.  Without  loss  of  generality  let  us  assume  i < j. 

First  Expansion.  We  expand  D by  the  /th  row.  A typical  term  in  this  expansion  is 

(19)  aikCik  = aik-(-l)i+kMlk. 

The  minor  Mik  of  aik  in  D is  an  (n  — 1 )st-order  determinant.  By  the  induction  hypothesis 
we  may  expand  it  by  any  row.  We  expand  it  by  the  row  corresponding  to  the  /th  row  of 
D.  This  row  contains  the  entries  (l  A k).  It  is  the  (j  — l)st  row  of  Mik,  because  Mik 
does  not  contain  entries  of  the  /th  row  of  D,  and  i < j.  We  have  to  distinguish  between 
two  cases  as  follows. 

Case  I.  If  / < k,  then  the  entry  a3i  belongs  to  the  /th  column  of  Mik  (see  Fig.  561).  Hence 
the  term  involving  c/)7  in  this  expansion  is 

(20)  afl  • (cofactor  of  afl  in  Mik)  = ajt  ■ (-l)(j~v+lMikjl 

where  Mlkjl  is  the  minor  of  c/j7  in  Mik.  Since  this  minor  is  obtained  from  Mik  by  deleting 
the  row  and  column  of  ajt.  it  is  obtained  from  D by  deleting  the  /'th  and  /th  rows  and  the 
kth  and  /th  columns  of  D.  We  insert  the  expansions  of  the  Mik  into  that  of  D.  Then  it  follows 
from  (19)  and  (20)  that  the  terms  of  the  resulting  representation  of  D are  of  the  form 

(21a)  alkan-  (-\  )bMmi  (l  < k) 

where 

b = i-hk-hj-hl  — 1. 

Case  II.  If  / > k,  the  only  difference  is  that  then  «.;7  belongs  to  the  (/  — l)st  column  of 
Mik,  because  Mik  does  not  contain  entries  of  the  kth  column  of  D,  and  k < /.  This  causes 
an  additional  minus  sign  in  (20),  and,  instead  of  (21a),  we  therefore  obtain 

(21b)  -aikafl  • (- 1 )bMikjl  (/  > k) 

where  b is  the  same  as  before. 


/th  kVn  kVn  /th 

col.  col.  col.  col. 


Case  I Case  II 

Fig.  561.  Cases  I and  II  of  the  two  expansions  of  D 
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Second  Expansion.  We  now  expand  D at  first  by  the  / 1 h row.  A typical  term  in  this 
expansion  is 

(22)  a3lC3l  = a3l-(-\y+lM3l. 

By  the  induction  hypothesis  we  may  expand  the  minor  M:]l  of  a:ll  in  D by  its  /th  row,  which 
corresponds  to  the  /th  row  of  D,  since  j > i. 

Case  I.  If  k > l,  the  entry  aik  in  that  row  belongs  to  the  ( k — l)st  column  of  Mj7,  because 
Mjt  does  not  contain  entries  of  the  / th  column  of  D,  and  l < k (see  Fig.  561).  Hence  the 
term  involving  aik  in  this  expansion  is 

(23)  aik  ■ (cofactor  of  aik  in  Mjt)  = aik  ■ (- 1 f+ck~vMikjl, 

where  the  minor  Mikjl  of  aik  in  M;jl  is  obtained  by  deleting  the  /th  and  /th  rows  and  the 
A:th  and  /th  columns  of  D [and  is,  therefore,  identical  with  Mlkjl  in  (20),  so  that  our  notation 
is  consistent].  We  insert  the  expansions  of  the  M}1  into  that  of  D.  It  follows  from  (22)  and 
(23)  that  this  yields  a representation  whose  terms  are  identical  with  those  given  by  (21a) 
when  / < k. 

Case  II.  If  k < /,  then  aik  belongs  to  the  A: th  column  of  ;V/;/,  we  obtain  an  additional  minus 
sign,  and  the  result  agrees  with  that  characterized  by  (21b). 

We  have  shown  that  the  two  expansions  of  D consist  of  the  same  terms,  and  this  proves 
our  statement  concerning  rows. 

The  proof  of  the  statement  concerning  columns  is  quite  similar;  if  we  expand  D in 
terms  of  two  arbitrary  columns,  say,  the  fcth  and  the  / th,  we  find  that  the  general  term 
involving  ajtaik  is  exactly  the  same  as  before.  This  proves  that  not  only  all  column 
expansions  of  D yield  the  same  value,  but  also  that  their  common  value  is  equal  to  the 
common  value  of  the  row  expansions  of  D. 

This  completes  the  proof  and  shows  that  our  definition  of  an  nth-order  determinant  is 
unambiguous. 


Section  9.3,  page  368 
PROOF  OF  FORMULA  (2) 

We  prove  that  in  right-handed  Cartesian  coordinates,  the  vector  product 
v = a X b = [al5  a2,  a3]  X [*?!,  b2,  b3] 


has  the  components 

(2)  iq  = a2b3  ~ a3b2,  v2  = a3b^  — a±b3,  v3  = a^b2  — a2b^. 

We  need  only  consider  the  case  v A 0.  Since  v is  perpendicular  to  both  a and  b.  Theorem 
1 in  Sec.  9.2  gives  a • v = 0 and  b • v = 0;  in  components  [see  (2),  Sec.  9.2], 


(3) 


nqiq  -I-  a2v2  T a3u3  — 0 
b1v1  + b2v  2 + b3v3  = 0. 


A84 


APP.  4 Additional  Proofs 


Multiplying  the  first  equation  by  b3,  the  last  by  a3,  and  subtracting,  we  obtain 
{a3b1  — a1b3)v1  = (a2b3  — a3b2)v2. 

Multiplying  the  first  equation  by  b\,  the  last  by  and  subtracting,  we  obtain 

(aib2  ~ a2bi)v2  = (fl3^i  — aib3)v3. 

We  can  easily  verify  that  these  two  equations  are  satisfied  by 

(4)  Vi  = c(a2b3  — a3b2),  v2  = c(a3b1  — a^b^,  v3  = c(ciib2  — fl2^i) 

where  c is  a constant.  The  reader  may  verify,  by  inserting,  that  (4)  also  satisfies  (3).  Now 
each  of  the  equations  in  (3)  represents  a plane  through  the  origin  in  iqi^fa-space.  The 
vectors  a and  b are  normal  vectors  of  these  planes  (see  Example  6 in  Sec.  9.2).  Since 
v A 0,  these  vectors  are  not  parallel  and  the  two  planes  do  not  coincide.  Hence  their 
intersection  is  a straight  line  L through  the  origin.  Since  (4)  is  a solution  of  (3)  and,  for 
varying  c,  represents  a straight  line,  we  conclude  that  (4)  represents  L,  and  every  solution 
of  (3)  must  be  of  the  form  (4).  In  particular,  the  components  of  v must  be  of  this  form, 
where  c is  to  be  determined.  From  (4)  we  obtain 

|v|2  = v\  + uf  + v3  = c2[(a2b3  - a3b2 f + (a3bx  - axb3f  + {axb2  - a2b i)2]. 

This  can  be  written 

|v|2  = c2[(af  + a2  + a3)(bf  + b%  + b3)  - (a1b1  + a2b2  + a3b3f\, 

as  can  be  verified  by  performing  the  indicated  multiplications  in  both  formulas  and 
comparing.  Using  (2)  in  Sec.  9.2,  we  thus  have 

|v|2  = c2[(a  • a)(b  • b)  - (a  • b)2]. 

By  comparing  this  with  formula  (12)  in  Prob.  4 of  Problem  Set  9.3  we  conclude  that 
c = ±1. 

We  show  that  c = +1.  This  can  be  done  as  follows. 

If  we  change  the  lengths  and  directions  of  a and  b continuously  and  so  that  at  the  end 
a = i and  b = j (Fig.  188a  in  Sec.  9.3),  then  v will  change  its  length  and  direction 
continuously,  and  at  the  end,  v = i X j = k.  Obviously  we  may  effect  the  change  so  that 
both  a and  b remain  different  from  the  zero  vector  and  are  not  parallel  at  any  instant. 
Then  v is  never  equal  to  the  zero  vector,  and  since  the  change  is  continuous  and  c can 
only  assume  the  values  +1  or  — 1,  it  follows  that  at  the  end  c must  have  the  same  value 
as  before.  Now  at  the  end  a = i,  b=j,  v = k and,  therefore,  = 1,  b2  = 1,  v3  = 1, 
and  the  other  components  in  (4)  are  zero.  Hence  from  (4)  we  see  that  v3  = c = +1.  This 
proves  Theorem  1. 

For  a left-handed  coordinate  system,  i X j = — k (see  Fig.  188b  in  Sec.  9.3),  resulting 
in  c = —1.  This  proves  the  statement  right  after  formula  (2). 
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Section  9.9,  page  408 

PROOF  OF  THE  INVARIANCE  OF  THE  CURL 

This  proof  will  follow  from  two  theorems  (A  and  B),  which  we  prove  first. 


THEOREM  A 


Transformation  Law  for  Vector  Components 

For  any  vector  v the  components  u1;  V2,  v3  and  v *,  v f , in  any  two  systems  of 

Cartesian  coordinates  a,  , x2,  x3  and  x*,  x2,  xf , respectively,  are  related  by 


(1) 


and  conversely 


v*  = c11v1  + c12v  2 + c13u3 

t>2  = C21V1  + C22V2  + C23V3 
t>3  = C31Vt  + C32V2  + C33V3, 

V1  = C11V1  C21V2  Csi^S! 


(2) 


V2  ~ C12V1  + C22V2  + C32V3 


with  coefficients 


V3  — C13U1  + C23V2  + C33U3 


(3) 


satisfying 


cn  = i**i  c12  = i**j  c13  = i**k 

c2r=j**i  c22=j**j  c23  = j**k 

c3i  = k**i  c32  = k**j  c33  = k*k 


3 

(4)  Ckj^mj  ^ 1?  2,  3), 

3= 1 

where  the  Kronecker  delta2  is  given  by 


o 

® km 


0 

-1 


(k  =£  m) 
(k  = m) 


and  i,  j,  k and  i*,  j*,  k*  denote  the  unit  vectors  in  the  positive  x±-,  x2-,  x3-  and 
a*-,  x|-,  x3-directions,  respectively. 


2LEOPOLD  KRONECKER  (1823-1891),  German  mathematician  at  Berlin,  who  made  important 
contributions  to  algebra,  group  theory,  and  number  theory. 

We  shall  keep  our  discussion  completely  independent  of  Chap.  7,  but  readers  familiar  with  matrices  should 
recognize  that  we  are  dealing  with  orthogonal  transformations  and  matrices  and  that  our  present  theorem 
follows  from  Theorem  2 in  Sec.  8.3. 
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PROOF 


THEOREM  B 


The  representation  of  v in  the  two  systems  are 

(5)  (a)  v = t^i  + u2j  + vak  (b)  v = u^i*  + v£j*  + u3k*. 

Since  i*  • i*  = 1,  i*  • j*  = 0,  i*  • k*  = 0,  we  get  from  (5b)  simply  i*  • v = u*  and 
from  this  and  (5  a) 

u*  = i*  • v = i*  • i^i  + i*  • v2 j + i*  • u3k  = t^i*  • i + u2i*  • j + u3i*  • k. 

Because  of  (3),  this  is  the  first  formula  in  (1),  and  the  other  two  formulas  are  obtained 
similarly,  by  considering  j*  • v and  then  k*  • v.  Formula  (2)  follows  by  the  same  idea, 
taking  i • v = v-^  from  (5a)  and  then  from  (5b)  and  (3) 

i>i  = i • v = t^i  • i*  + ufi  • j*  + v%i  • k*  = cnu%  + c21t;|  + c31u|, 


and  similarly  for  the  other  two  components. 

We  prove  (4).  We  can  write  (1)  and  (2)  briefly  as 


(6) 


(a)  Vj  = 2 


C V * 

'-mj  ^rrr> 


(b)  K = 2 ckjVj. 


j=  i 


Substituting  Vj  into  v%,  we  get 


Vfc  Cfcj  C7njVrn  Vm  ( ^kj^mj  I ’ 

j=  1 m=  1 m—  1 \j=l  J 


where  k = 1,  2,  3.  Taking  k = 1,  we  have 


v*  = i>*  2 ciicV  + vt  2 cuc2 A + u3  L2  ciicsj  • 


\j= 1 


b'=i 


b=i 


For  this  to  hold  for  every  vector  v,  the  first  sum  must  be  1 and  the  other  two  sums  0.  This 
proves  (4)  with  k = 1 for  m = 1,  2,  3.  Taking  k = 2 and  then  k = 3,  we  obtain  (4)  with 
k = 2 and  3,  for  m = 1,  2,  3. 


Transformation  Law  for  Cartesian  Coordinates 

The  transformation  of  any  Cartesian  x-lx2x:i-co o rd inate  system  into  any  other 
Cartesian  x*x2xa-coordinate  system  is  of  the  form 

3 

(7)  = 2 c-mjXj  + bm,  m = 1,  2,  3, 

3=1 

with  coefficients  (3)  and  constants  Z?l9  b2,  b3\  conversely, 

3 

(8)  Xfc  ^k>  k 1,  2,  3. 

n=  1 
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Theorem  B follows  from  Theorem  A by  noting  that  the  most  general  transformation  of  a 
Cartesian  coordinate  system  into  another  such  system  may  be  decomposed  into  a 
transformation  of  the  type  just  considered  and  a translation;  and  under  a translation, 
corresponding  coordinates  differ  merely  by  a constant. 


PROOF  OF  THE  INVARIANCE  OF  THE  CURL 

We  write  again  xls  x2,  x3  instead  of  x,  y,  z,  and  similarly  x*,  x2,  x|  for  other  Cartesian 
coordinates,  assuming  that  both  systems  are  right-handed.  Let  a1;  a2,  a3  denote  the 
components  of  curl  v in  the  x1x2x3-coordinates,  as  given  by  (1),  Sec.  9.9,  with 

x = X!,  y = x2,  z = x3. 

Similarly,  let  a2,  a3  denote  the  components  of  curl  v in  the  x*x2xf-coordinate  system. 
We  prove  that  the  length  and  direction  of  curl  v are  independent  of  the  particular  choice 
of  Cartesian  coordinates,  as  asserted.  We  do  this  by  showing  that  the  components  of  curl 
v satisfy  the  transformation  law  (2),  which  is  characteristic  of  vector  components.  We 
consider  ax  We  use  (6a),  and  then  the  chain  rule  for  functions  of  several  variables  (Sec. 
9.6).  This  gives 


From  this  and  (7)  we  obtain 


^ 1 'fxj 

m=  1 j= 1 

/ dvt  _ dvt  \ 

- (<::\3<:22  - C32C23)  y dx*  -faf  J 

= (c33c22  — c32c23)al  + (Cl3c32  — c12c33)fl2  + ((;23('l2  — C22C13)a3. 


Note  what  we  did.  The  double  sum  had  3X3  = 9 terms,  3 of  which  were  zero  (when 
m = j),  and  the  remaining  6 terms  we  combined  in  pairs  as  we  needed  them  in  getting 

rfc  rfc  rfc 

a 1,  a2,  a3. 

We  now  use  (3),  Lagrange’s  identity  (see  Formula  (15)  in  Team  Project  24  in  Problem 
Set  9.3)  and  k*  X j*  = — i*  and  k X j = — i.  Then 


C33C22  “ c32c23  = (k*  • k)(j*  • j)  - (k*  • j)(j*  • k) 
= (k*  x j*)  • (k  X j)  = i*  • i = cn, 


etc. 
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Hence  a1  = cufli  + c2i«2  + c3ifll-  This  is  of  the  form  of  the  first  formula  in  (2)  in 
Theorem  A,  and  the  other  two  formulas  of  the  form  (2)  are  obtained  similarly.  This  proves 
the  theorem  for  right-handed  systems.  If  the  x^xs-coordinates  are  left-handed,  then 
k X j = +i,  but  then  there  is  a minus  sign  in  front  of  the  determinant  in  (1),  Sec.  9.9. 

Section  10.2,  page  420 

PROOF  OF  THEOREM  1,  PART  (b)  We  prove  that  if 


(1) 


f iF1  dx  + F2  dy  + F3  dz ) 
Jc 


with  continuous  Fx,  F2,  F3  in  a domain  D is  independent  of  path  in  D,  then  F = grad  / 
in  D for  some  /;  in  components 


(2') 


_ _ df_  _df_ 

dx  ’ dy  ’ dz 


We  choose  any  fixed  A:  (x0,  >’o,  Zo)  in  D and  any  B:  (x,  y,  z)  in  D and  define  f by 
(3)  f{x , y,  z)  = f0  + f (Fi  dx * + F2  dy*  + F3  dz*) 

JA 

with  any  constant  f0  and  any  path  from  A to  B in  D.  Since  A is  fixed  and  we  have 
independence  of  path,  the  integral  depends  only  on  the  coordinates  x,  y,  z,  so  that  (3) 
defines  a function  f(x,  y,  z)  in  I).  We  show  that  F = grad  / with  this  /,  beginning  with 
the  first  of  the  three  relations  (2  ).  Because  of  independence  of  path  we  may  integrate 
from  A to  Bp.  (x1;  y,  z)  and  then  parallel  to  the  x-axis  along  the  segment  BrB  in  Fig.  562 
with  If  chosen  so  that  the  whole  segment  lies  in  D.  Then 

rBl  fB 

fix,  y,  z)  = fo  + I (F1  dx*  + F2  dy*  + F3  dz*)  + I {F1  dx*  + F2  dy*  + F3  dz*). 
ja  ' JB1 

We  now  take  the  partial  derivative  with  respect  to  x on  both  sides.  On  the  left  we  get 
df/dx.  We  show  that  on  the  right  we  get  Fv  The  derivative  of  the  first  integral  is  zero 
because  A:  (x0,  y0,  z(l)  and  If:  (xlt  y,  z)  do  not  depend  on  x.  We  consider  the  second 
integral.  Since  on  the  segment  B^B,  both  y and  z are  constant,  the  terms  F2  dy*  and 


Fig.  562.  Proof  of  Theorem  1 
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F3  dz*  do  not  contribute  to  the  derivative  of  the  integral.  The  remaining  part  can  be  written 
as  a definite  integral, 


J dx*  = J F:(x*,  y,  z ) dx*. 

Hence  its  partial  derivative  with  respect  to  x is  (x,  y,  z),  and  the  first  of  the  relations 
( 2 ')  is  proved.  The  other  two  formulas  in  (2')  follow  by  the  same  argument. 

Section  11.5,  page  500 


THEOREM 


Reality  of  Eigenvalues 

If  p,  q,  r,  and  p'  in  the  Sturm— Liouville  equation  (1)  of  Sec.  11.5  are  real-valued  and 
continuous  on  the  interval  a Si  x = b and  r(x)  > 0 throughout  that  interval  (or 
r(x)  < 0 throughout  that  interval ),  then  all  the  eigenvalues  of  the  Sturm-Liouville 
problem  (1),  (2),  Sec.  11.5,  are  real. 


PROOF  Let  A = a + i/3  be  an  eigenvalue  of  the  problem  and  let 

y(x)  = u(x)  + iv(x) 

be  a corresponding  eigenfunction;  here  a,  (3,  u,  and  v are  real.  Substituting  this  into  (1), 
Sec.  11.5,  we  have 

(pu'  + ipv')'  + (q  + ar  + i/3r)(u  + iv)  = 0. 

This  complex  equation  is  equivalent  to  the  following  pair  of  equations  for  the  real  and 
the  imaginary  parts: 

(pu')'  + (q  + ar)u  — (3rv  = 0 
(pv')'  + (q  + ar)v  + f3ru  = 0. 


Multiplying  the  first  equation  by  v,  the  second  by  — u and  adding,  we  get 

— (3(u2  + v2)r  = u(pv')'  ~ v(pu')' 

= [(pv')u  - (pu')vf. 


The  expression  in  brackets  is  continuous  on  a ^ x = b,  for  reasons  similar  to  those  in 
the  proof  of  Theorem  1,  Sec.  1 1.5.  Integrating  over  x from  a to  b,  we  thus  obtain 


(u2  + v2)r  dx  = 


p(uv'  — u'v) 


Because  of  the  boundary  conditions,  the  right  side  is  zero;  this  is  as  in  that  proof.  Since 
y is  an  eigenfunction,  u2  + v2  ^ 0.  Since  y and  r are  continuous  and  r > 0 (or  r < 0) 
on  the  interval  a Si  x ^ b,  the  integral  on  the  left  is  not  zero.  Hence,  (3  = 0,  which  means 
that  A = a is  real.  This  completes  the  proof. 
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Section  13.4,  page  627 

PROOF  OF  THEOREM  2 Cauchy-Riemann  Equations 

We  prove  that  Cauchy-Riemann  equations 

(1)  Ux  = Vy,  Uy  = ~VX 

are  sufficient  for  a complex  function  f(z)  = u(x,  y)  + iv(x,  y)  to  be  analytic;  precisely,  if 
the  real  part  u and  the  imaginary  part  v of  f(z)  satisfy  (1)  in  a domain  D in  the  complex 
plane  and  if  the  partial  derivatives  in  (1)  are  continuous  in  D,  then  f(z)  is  analytic  in  D. 

In  this  proof  we  write  A z = Ax  + /Ay  and  A / = f(z  + A z)  — f(z).  The  idea  of  proof 
is  as  follows. 

(a)  We  express  A / in  terms  of  first  partial  derivatives  of  u and  v,  by  applying  the  mean 
value  theorem  of  Sec.  9.6. 

(b)  We  get  rid  of  partial  derivatives  with  respect  to  y by  applying  the  Cauchy-Riemann 
equations. 

(c)  We  let  A z approach  zero  and  show  that  then  Af/Az,  as  obtained,  approaches  a limit, 
which  is  equal  to  ux  + ivx,  the  right  side  of  (4)  in  Sec.  13.4,  regardless  of  the  way  of 
approach  to  zero. 

(a)  Let  P:  (x,  y)  be  any  fixed  point  in  D.  Since  D i s a domain,  it  contains  a neighborhood 
of  P.  We  can  choose  a point  Q:  (x  + Ax,  y + Ay)  in  this  neighborhood  such  that  the 
straight-line  segment  PQ  is  in  D.  Because  of  our  continuity  assumptions  we  may  apply 
the  mean  value  theorem  in  Sec.  9.6.  This  yields 

n(x  + Ax,  y + Ay)  — w(x,  y)  = (Ax)ux(M1)  + (A y)uy(M1) 
v(x  + Ax,  y + Ay)  - v(x,  y)  = (A x)vx(M2)  + (A y)vy(M2) 

where  ;V/1  and  M2  (7=  M1  in  general!)  are  suitable  points  on  that  segment.  The  first  line 
is  Re  Af  and  the  second  is  Im  A/,  so  that 

A f = (A  x)ux(M1)  + (A  y)uy(M1)  + i[(Ax)vx(M2)  + (A y)vy{M2)\. 


(b)  uy  = — vx  and  vy  = ux  by  the  Cauchy-Riemann  equations,  so  that 

Af  = (A x)ux(M1)  - (Ay) vx(Mf)  + i[(Ax)vx(M2)  + (A y)ux(M2)]. 

Also  Az  = Ax  + iAy,  so  that  we  can  write  Ax  = Az  — iAy  in  the  first  term  and 
Ay  = (Az  — Ax)//  = — /(Az  — Ax)  in  the  second  term.  This  gives 

Af  = (Az  - /Ay) ux(Mx)  + /(Az  - A x)vx(M1)  + i[(Ax)vx(M2)  + (A y)ux(M2)]. 

By  performing  the  multiplications  and  reordering  we  obtain 

Af  = (A z)ux(M1)  - i Ay {ux(Mf  - ux(M2)} 

+ iiikzlv^Mf)  - A x{vx{Mf)  - vx(M2)}]. 
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Division  by  A z now  yields 

(A)  = ux(M\)  + ivx(Mi)  ~ [u^Mj)  - ux(M2)}  - ^ {v^MJ  - vx(M2)}. 

A z Az  Az 

(c)  We  finally  let  Az  approach  zero  and  note  that  |Ay/Az|  1 and  |Ax/Az|  =§  1 in  (A). 
Then  Q:  (x  + Ax,  y + Ay)  approaches  P:  (x,  y),  so  that  M1  and  M2  must  approach  P. 
Also,  since  the  partial  derivatives  in  (A)  are  assumed  to  be  continuous,  they  approach 
their  value  at  P.  In  particular,  the  differences  in  the  braces  { • ■ • ) in  (A)  approach  zero. 
Hence  the  limit  of  the  right  side  of  (A)  exists  and  is  independent  of  the  path  along  which 
Az  — » 0.  We  see  that  this  limit  equals  the  right  side  of  (4)  in  Sec.  13.4.  This  means  that 
f(z)  is  analytic  at  every  point  z in  D,  and  the  proof  is  complete. 


Section  14.2,  pages  653-654 

GOURSATS  PROOF  OF  CAUCHY’S  INTEGRAL  THEOREM  Goursat  proved  Cauchy’s 
integral  theorem  without  assuming  that  f'(z)  is  continuous,  as  follows. 

We  start  with  the  case  when  C is  the  boundary  of  a triangle.  We  orient  C 
counterclockwise.  By  joining  the  midpoints  of  the  sides  we  subdivide  the  triangle  into 
four  congruent  triangles  (Fig.  563).  Let  C:,  Cn,  Cm,  CIV  denote  their  boundaries.  We 
claim  that  (see  Fig.  563). 

(1)  f dz  = f dz  + f dz  + f dz  + f dz. 

o c-x  c-n  c-m  c^rv 

Indeed,  on  the  right  we  integrate  along  each  of  the  three  segments  of  subdivision  in  both 
possible  directions  (Fig.  563),  so  that  the  corresponding  integrals  cancel  out  in  pairs,  and 
the  sum  of  the  integrals  on  the  right  equals  the  integral  on  the  left.  We  now  pick  an  integral 
on  the  right  that  is  biggest  in  absolute  value  and  call  its  path  Cx-  Then,  by  the  triangle 
inequality  (Sec.  13.2), 


fdz 

< 

fdz 

+ 

f fdz 

+ 

f dz 

+ 

fdz 

A 4 

4 fdz 

Jc 

Jc1 

Jcn 

Cm 

CIV 

Jc1 

We  now  subdivide  the  triangle  bounded  by  C1  as  before  and  select  a triangle  of 
subdivision  with  boundary  C2  for  which 


Fig.  563.  Proof  of  Cauchy’s  integral  theorem 
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Continuing  in  this  fashion,  we  obtain  a sequence  of  triangles  Tx,  Tz,  ■ ■ ■ with  boundaries 
Ci,  C2,  • • • that  are  similar  and  such  that  Tn  lies  in  Tm  when  n > m,  and 


(2) 


n = 1,  2,  • • ■ . 


Let  Zo  be  the  point  that  belongs  to  all  these  triangles.  Since  f is  differentiable  at  z = Zo, 
the  derivative  f (z0)  exists.  Let 


(3) 


Hz) 


f(z)  - /(Zo) 
z - Zo 


f'(z0). 


Solving  this  algebraically  for  f(z)  we  have 

f(z)  = f(z0)  + (z  - z0)f'(z0)  + Hz)(z  ~ z0)- 


Integrating  this  over  the  boundary  Cn  of  the  triangle  T.„  gives 


<£  f(z)  dz  = f(zo)  dz  + <£  (z  ~ Zo)f'(zo)  dz  + <£  h(z)(z  ~ z0)dz. 

cn  c„  c„  c„ 

Since  /(z0)  and  f\zo)  are  constants  and  Cn  is  a closed  path,  the  first  two  integrals  on  the 
right  are  zero,  as  follows  from  Cauchy’s  proof,  which  is  applicable  because  the  integrands 
do  have  continuous  derivatives  (0  and  const,  respectively).  We  thus  have 


f f(z)  dz  = h(z)(z  - Zo)  dz. 
cn  Jcn 

Since  f' (z0)  is  the  limit  of  the  difference  quotient  in  (3),  for  given  e > 0 we  can  find  a 
S > 0 such  that 


(4)  \Hz)\  < € when  |z  - z0|  < 8. 

We  may  now  take  n so  large  that  the  triangle  Tn  lies  in  the  disk  |z  — z0|  < 8.  Let  Ln  be 
the  length  of  Cn.  Then  |z  — z0|  < Ln  for  all  z on  C.„  and  z0  in  Tn.  From  this  and  (4)  we 
have  | h(z)(z  — z0)|  < eL„.  The  ML-inequality  in  Sec.  14.1  now  gives 


(5) 


f(z)  dz 


h(z)(z  ~ z0)  dz 


cn 


< c]  . J = d z 

C L/,,  C L;,,,, . 


Now  denote  the  length  of  C by  L.  Then  the  path  C1  has  the  length  Lx  = LI 2,  the  path  C2 
has  the  length  L2  = L,/2  = LI 4,  etc.,  and  Cn  has  the  length  Ln  = LIT1.  Hence 
L^  = L2/4n.  From  (2)  and  (5)  we  thus  obtain 


Lr 

g 4 neLl  = 4ne  — = eL2. 

n 4« 


By  choosing  e (>  0)  sufficiently  small  we  can  make  the  expression  on  the  right  as  small 
as  we  please,  while  the  expression  on  the  left  is  the  definite  value  of  an  integral. 
Consequently,  this  value  must  be  zero,  and  the  proof  is  complete. 
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The  proof  for  the  case  in  which  C is  the  boundary  of  a polygon  follows  from  the  previous 
proof  by  subdividing  the  polygon  into  triangles  (Fig.  564).  The  integral  corresponding  to 
each  such  triangle  is  zero.  The  sum  of  these  integrals  is  equal  to  the  integral  over  C, 
because  we  integrate  along  each  segment  of  subdivision  in  both  directions,  the 
corresponding  integrals  cancel  out  in  pairs,  and  we  are  left  with  the  integral  over  C. 

The  case  of  a general  simple  closed  path  C can  be  reduced  to  the  preceding  one  by 
inscribing  in  C a closed  polygon  P of  chords,  which  approximates  C “sufficiently 
accurately,”  and  it  can  be  shown  that  there  is  a polygon  P such  that  the  integral  over  P 
differs  from  that  over  C by  less  than  any  preassigned  positive  real  number  e,  no  matter 
how  small.  The  details  of  this  proof  are  somewhat  involved  and  can  be  found  in  Ref.  [D6] 
listed  in  App.  1. 


Fig.  564 


Proof  of  Cauchy’s  integral  theorem  for  a polygon 


Section  15.1,  page  674 

PROOF  OF  THEOREM  4 Cauchy’s  Convergence  Principle  for  Series 

(a)  In  this  proof  we  need  two  concepts  and  a theorem,  which  we  list  first. 

1.  A bounded  sequence  .sy,  ,v2,  ■ • • is  a sequence  whose  terms  all  lie  in  a disk  of 
(sufficiently  large,  finite)  radius  K with  center  at  the  origin;  thus  j.vj  < K for  all  n. 

2.  A limit  point  a of  a sequence  s1,  s2,  • • • is  a point  such  that,  given  an  e > 0,  there 
are  infinitely  many  terms  satisfying  \sn  — a\  < e.  (Note  that  this  does  not  imply 
convergence,  since  there  may  still  be  infinitely  many  terms  that  do  not  lie  within  that 
circle  of  radius  e and  center  a.) 

Example:  g,  |,  m,  if,  ' ' ' ^as  limit  points  0 and  1 and  diverges. 

3.  A bounded  sequence  in  the  complex  plane  has  at  least  one  limit  point. 
(Bolzano-Weierstrass  theorem;  proof  below.  Recall  that  “sequence”  always  means  infinite 
sequence.) 

(b)  We  now  turn  to  the  actual  proof  that  Zi  + Z2  + ' ' ' converges  if  and  only  if,  for 
every  e > 0,  we  can  find  an  N such  that 

(1)  \zn+i  + ■ ■ • + zn+p | < e for  every  n > N and  p = 1,  2,  • • • . 

Here,  by  the  definition  of  partial  sums, 

S n+p  Sn  Zn+1  T T Zn+p- 
Writing  n + p = r,  we  see  from  this  that  (1)  is  equivalent  to 


(1*) 


for  all  r > N and  n > N. 
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THEOREM 


PROOF 


Suppose  that  s1?  s2,  ' ' ' converges.  Denote  its  limit  by  s.  Then  for  a given  e > 0 we  can 
find  an  N such  that 

km  — s|  < — for  every  n > N. 

Hence,  if  r > N and  n > N,  then  by  the  triangle  inequality  (Sec.  13.2), 


— sn|  = |(sr  — s)  — (sn  — i)|  Si  kr  s|  + |sn  — s|  < — + — = e, 


that  is,  (1*)  holds. 

(c)  Conversely,  assume  that  Sy,  s2,  ■ • ■ satisfies  (1*).  We  first  prove  that  then  the 
sequence  must  be  bounded.  Indeed,  choose  a fixed  e and  a fixed  n = n0  > N in  (1*). 
Then  (1*)  implies  that  all  sr  with  r > A'  lie  in  the  disk  of  radius  e and  center  sng  and  only 
finitely  many  terms  slt  ■ ■ ■ , sN  may  not  lie  in  this  disk.  Clearly,  we  can  now  find  a circle 
so  large  that  this  disk  and  these  finitely  many  terms  all  lie  within  this  new  circle.  Hence 
the  sequence  is  bounded.  By  the  Bolzano-Weierstrass  theorem,  it  has  at  least  one  limit 
point,  call  it  s. 

We  now  show  that  the  sequence  is  convergent  with  the  limit  s.  Let  e > 0 be  given. 
Then  there  is  an  N*  such  that  \sr  — .vj  < e/2  for  all  r > N*  and  n > N*,  by  (1*).  Also, 
by  the  definition  of  a limit  point,  j,vn  — ,v|  < e/2  for  infinitely  many  n,  so  that  we  can  find 
and  fix  an  n > N*  such  that  \sn  — .vj  < e/2.  Together,  for  every  r > N*, 

iii  ii  | | | e e 

kr  “^1  | ('C  Syf  "t"  ( Sn  .y)|  = |sr  "t"  ^ 


that  is,  the  sequence  sq, 


is  convergent  with  the  limit  s. 


Bolzano-Weierstrass  Theorem3 

A bounded  infinite  sequence  Zi,  Z2,  Z3,  • 
limit  point. 

■ in  the  complex  plane  has  at  least  one 

It  is  obvious  that  we  need  both  conditions:  a finite  sequence  cannot  have  a limit  point, 
and  the  sequence  1,  2,  3,  ■ ■ ■ , which  is  infinite  but  not  bounded,  has  no  limit  point.  To 
prove  the  theorem,  consider  a bounded  infinite  sequence  Z\,  z2,  ' ' ' and  let  K be  such  that 
k„|  < K for  all  n.  If  only  finitely  many  values  of  the  z,n  are  different,  then,  since  the 
sequence  is  infinite,  some  number  z,  must  occur  infinitely  many  times  in  the  sequence, 
and,  by  definition,  this  number  is  a limit  point  of  the  sequence. 

We  may  now  turn  to  the  case  when  the  sequence  contains  infinitely  many  different 
terms.  We  draw  a large  square  Q0  that  contains  all  z,n.  We  subdivide  Q0  into  four  congruent 
squares,  which  we  number  1,  2,  3,  4.  Clearly,  at  least  one  of  these  squares  (each  taken 
with  its  complete  boundary)  must  contain  infinitely  many  terms  of  the  sequence.  The 
square  of  this  type  with  the  lowest  number  (1,  2,  3,  or  4)  will  be  denoted  by  Qv  This  is 


3BERNARD  BOLZANO  (1781-1848),  Austrian  mathematician  and  professor  of  religious  studies,  was  a 
pioneer  in  the  study  of  point  sets,  the  foundation  of  analysis,  and  mathematical  logic. 

For  Weierstrass,  see  Sec.  15.5. 
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the  first  step.  In  the  next  step  we  subdivide  <2i  into  four  congruent  squares  and  select  a 
square  Q2  by  the  same  rule,  and  so  on.  This  yields  an  infinite  sequence  of  squares  Q0, 
Q i,  Q2,  • ■ • , Qn,  ■ ■ • with  the  property  that  the  side  of  Qn  approaches  zero  as  n approaches 
infinity,  and  Qm  contains  all  Q„  with  n > m.  It  is  not  difficult  to  see  that  the  number 
which  belongs  to  all  these  squares,4  call  it  z = a,  is  a limit  point  of  the  sequence.  In  fact, 
given  an  e > 0,  we  can  choose  an  N so  large  that  the  side  of  the  square  QN  is  less  than 
e and,  since  QN  contains  infinitely  many  zn,  we  have  \zn  — a\  < e for  infinitely  many  n. 
This  completes  the  proof.  ■ 


Section  15.3,  pages  688-689 

PART  (b)  OF  THE  PROOF  OF  THEOREM  5 
We  have  to  show  that 


(z  + A z)w  ~ zT 

Az 


nz 


= 2 an  Az[(z  + Az)n  2 + 2z(z  + Az)”  3 + • ■ • + (n  - 1 )zn  2], 


thus, 


(z  + Az)"  - zr 

Az 


— nz 


= Az[(z  + Az)”-2  + 2z(z  + Az)”-3  + ■•■  + (*-  l)z”-2]. 


If  we  set  z + Az  = b and  z = a,  thus  A z = b — a,  this  becomes  simply 

bn  - an 

(7a)  — - no™-1  = (b~  d)An  (n  = 2,  3,  ■ ■ ■), 


where  An  is  the  expression  in  the  brackets  on  the  right, 

(7b)  An  = bn~2  + 2 abn~3  + 3 a26”-4  +■••+(«-  l)a”-2; 

thus,  A2  = 1,  A3  = b + 2a,  etc.  We  prove  (7)  by  induction.  When  n = 2,  then  (7)  holds, 
since  then 

b2  — a 2 (b  + a)(b  — a) 

— 2 a = 2a  = b - a = (b  - a)A2. 

b — a b — a 


Assuming  that  (7)  holds  for  n = k,  we  show  that  it  holds  for  n = k + 1.  By  adding  and 
subtracting  a term  in  the  numerator  and  then  dividing  we  first  obtain 


bk+1  - a 1 


k+1 


(Jc+l 


bak  + bak 
b — a 


fc  + 1 


= b 


bk  - ak 


+ aK 


4 The  fact  that  such  a unique  number  z — a exists  seems  to  be  obvious,  but  it  actually  follows  from  an  axiom 
of  the  real  number  system,  the  so-called  Cantor-Dedekind  axiom:  see  footnote  3 in  App.  A3. 3. 
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By  the  induction  hypothesis,  the  right  side  equals  b[(b  — a)Ak  + kak  1]  + cik.  Direct 
calculation  shows  that  this  is  equal  to 

(b  — a){bAk  + kak~ 1}  4-  akc ik~1  + ak. 

From  (7b)  with  n = k we  see  that  the  expression  in  the  braces  { ■ ■ ■ } equals 

bk — 1 + 2 abk~2  + ■••  + (&-  1 )bak~2  + kak~1  = Ak+1. 

Hence  our  result  is 

bk+1  - ak+1 

= (b  ~ a)Ak+l  + (k  + 1 )ak. 

b — a 

Taking  the  last  term  to  the  left,  we  obtain  (7)  with  n = k + 1.  This  proves  (7)  for  any 
integer  n§2  and  completes  the  proof.  ■ 


Section  18.2,  page  763 

ANOTHER  PROOF  OF  THEOREM  1 without  the  me  of  a harmonic  conjugate 

We  show  that  if  w = u + iv  = f(z)  is  analytic  and  maps  a domain  I)  conformally  onto 
a domain  D*  and  <!>*(«,  u)  is  harmonic  in  D*,  then 

(1)  <J>(x,  y)  = y),  v(x,  yj) 

is  harmonic  in  D,  that  is,  V2d>  = 0 in  I).  We  make  no  use  of  a harmonic  conjugate  of 
$*,  but  use  straightforward  differentiation.  By  the  chain  rule, 

cj>  = cp*  u + cp*  v 

We  apply  the  chain  rule  again,  underscoring  the  terms  that  will  drop  out  when  we  form 
V2d>: 

®xx  = + (^Su*fx  + cKqVx)ux 

+ ^fjVxx  + + ®*uVx)Vx. 

<$yy  is  the  same  with  each  x replaced  by  y.  We  form  the  sum  V2(I>.  In  it,  (b*„  = is 
multiplied  by 

UXVX  + UyVy 

which  is  0 by  the  Cauchy-Riemann  equations.  Also  V2m  = 0 and  V2v  = 0.  There  remains 
V2d)  = + n2)  + fl>*„(t;2  + u2). 

By  the  Cauchy-Riemann  equations  this  becomes 

V20>  = (<!>*„  + <F5„)(w2  + v2) 


and  is  0 since  €>*  is  harmonic. 


APPENDIX  5 
Tables 


For  Tables  of  Laplace  Transforms  see  Secs.  6.8  and  6.9. 

For  Tables  of  Fourier  Transforms  see  Sec.  11.10. 

If  you  have  a Computer  Algebra  System  (CAS),  you  may  not  need  the  present  tables, 
but  you  may  still  find  them  convenient  from  time  to  time. 

Table  A1  Bessel  Functions 


For  more  extensive  tables  see  Ref.  [GenRefl]  in  App.  1. 


X 

Mx) 

Mx) 

X 

Mx) 

Mx) 

X 

Mx) 

Mx) 

0.0 

1.0000 

0.0000 

3.0 

-0.2601 

0.3391 

6.0 

0.1506 

-0.2767 

0.1 

0.9975 

0.0499 

3.1 

-0.2921 

0.3009 

6.1 

0.1773 

-0.2559 

0.2 

0.9900 

0.0995 

3.2 

-0.3202 

0.2613 

6.2 

0.2017 

-0.2329 

0.3 

0.9776 

0.1483 

3.3 

-0.3443 

0.2207 

6.3 

0.2238 

-0.2081 

0.4 

0.9604 

0.1960 

3.4 

-0.3643 

0.1792 

6.4 

0.2433 

-0.1816 

0.5 

0.9385 

0.2423 

3.5 

-0.3801 

0.1374 

6.5 

0.2601 

-0.1538 

0.6 

0.9120 

0.2867 

3.6 

-0.3918 

0.0955 

6.6 

0.2740 

-0.1250 

0.7 

0.8812 

0.3290 

3.7 

-0.3992 

0.0538 

6.7 

0.2851 

-0.0953 

0.8 

0.8463 

0.3688 

3.8 

-0.4026 

0.0128 

6.8 

0.2931 

-0.0652 

0.9 

0.8075 

0.4059 

3.9 

-0.4018 

-0.0272 

6.9 

0.2981 

-0.0349 

1.0 

0.7652 

0.4401 

4.0 

-0.3971 

-0.0660 

7.0 

0.3001 

-0.0047 

1.1 

0.7196 

0.4709 

4.1 

-0.3887 

-0.1033 

7.1 

0.2991 

0.0252 

1.2 

0.6711 

0.4983 

4.2 

-0.3766 

-0.1386 

7.2 

0.2951 

0.0543 

1.3 

0.6201 

0.5220 

4.3 

-0.3610 

-0.1719 

7.3 

0.2882 

0.0826 

1.4 

0.5669 

0.5419 

4.4 

-0.3423 

-0.2028 

7.4 

0.2786 

0.1096 

1.5 

0.5118 

0.5579 

4.5 

-0.3205 

-0.2311 

7.5 

0.2663 

0.1352 

1.6 

0.4554 

0.5699 

4.6 

-0.2961 

-0.2566 

7.6 

0.2516 

0.1592 

1.7 

0.3980 

0.5778 

4.7 

-0.2693 

-0.2791 

7.7 

0.2346 

0.1813 

1.8 

0.3400 

0.5815 

4.8 

-0.2404 

-0.2985 

7.8 

0.2154 

0.2014 

1.9 

0.2818 

0.5812 

4.9 

-0.2097 

-0.3147 

7.9 

0.1944 

0.2192 

2.0 

0.2239 

0.5767 

5.0 

-0.1776 

-0.3276 

8.0 

0.1717 

0.2346 

2.1 

0.1666 

0.5683 

5.1 

-0.1443 

-0.3371 

8.1 

0.1475 

0.2476 

2.2 

0.1104 

0.5560 

5.2 

-0.1103 

-0.3432 

8.2 

0.1222 

0.2580 

2.3 

0.0555 

0.5399 

5.3 

-0.0758 

-0.3460 

8.3 

0.0960 

0.2657 

2.4 

0.0025 

0.5202 

5.4 

-0.0412 

-0.3453 

8.4 

0.0692 

0.2708 

2.5 

-0.0484 

0.4971 

5.5 

-0.0068 

-0.3414 

8.5 

0.0419 

0.2731 

2.6 

-0.0968 

0.4708 

5.6 

0.0270 

-0.3343 

8.6 

0.0146 

0.2728 

2.7 

-0.1424 

0.4416 

5.7 

0.0599 

-0.3241 

8.7 

-0.0125 

0.2697 

2.8 

-0.1850 

0.4097 

5.8 

0.0917 

-0.3110 

8.8 

-0.0392 

0.2641 

2.9 

-0.2243 

0.3754 

5.9 

0.1220 

-0.2951 

8.9 

-0.0653 

0.2559 

J0(x)  = Oforx  = 2.40483,  5.52008,  8.65373,  11.7915,  14.9309,  18.0711,  21.2116,  24.3525,  27.4935,  30.6346 
Mx)  = 0 for  x = 3.83171,  7.01559,  10.1735,  13.3237,  16.4706,  19.6159,  22.7601,  25.9037,  29.0468,  32.1897 


A97 


A98 


APP.  5 Tables 


Table  A1  (continued) 


X 

ToW 

TiW 

X 

Y0(x) 

Tito 

X 

Y0(x) 

TiW 

0.0 

(-00) 

(-”) 

2.5 

0.498 

0.146 

5.0 

-0.309 

0.148 

0.5 

-0.445 

-1.471 

3.0 

0.377 

0.325 

5.5 

-0.339 

-0.024 

1.0 

0.088 

-0.781 

3.5 

0.189 

0.410 

6.0 

-0.288 

-0.175 

1.5 

0.382 

-0.412 

4.0 

-0.017 

0.398 

6.5 

-0.173 

-0.274 

2.0 

0.510 

-0.107 

4.5 

-0.195 

0.301 

7.0 

-0.026 

-0.303 

Table  A2  Gamma  Function  [see  (24)  in  App.  A3.1] 


a 

T(«) 

a 

T(a) 

a 

T(a) 

a 

T(a) 

a 

T(a) 

1.00 

1.000  000 

1.20 

0.918  169 

1.40 

0.887  264 

1.60 

0.893  515 

1.80 

0.931  384 

1.02 

0.988  844 

1.22 

0.913  106 

1.42 

0.886  356 

1.62 

0.895  924 

1.82 

0.936  845 

1.04 

0.978  438 

1.24 

0.908  521 

1.44 

0.885  805 

1.64 

0.898  642 

1.84 

0.942  612 

1.06 

0.968  744 

1.26 

0.904  397 

1.46 

0.885  604 

1.66 

0.901  668 

1.86 

0.948  687 

1.08 

0.959  725 

1.28 

0.900  718 

1.48 

0.885  747 

1.68 

0.905  001 

1.88 

0.955  071 

1.10 

0.951  351 

1.30 

0.897  471 

1.50 

0.886  227 

1.70 

0.908  639 

1.90 

0.961  766 

1.12 

0.943  590 

1.32 

0.894  640 

1.52 

0.887  039 

1.72 

0.912  581 

1.92 

0.968  774 

1.14 

0.936  416 

1.34 

0.892  216 

1.54 

0.888  178 

1.74 

0.916  826 

1.94 

0.976  099 

1.16 

0.929  803 

1.36 

0.890  185 

1.56 

0.889  639 

1.76 

0.921  375 

1.96 

0.983  743 

1.18 

0.923  728 

1.38 

0.888  537 

1.58 

0.891  420 

1.78 

0.926  227 

1.98 

0.991  708 

1.20 

0.918  169 

1.40 

0.887  264 

1.60 

0.893  515 

1.80 

0.931  384 

2.00 

1.000  000 

Table  A3  Factorial  Function  and  Its  Logarithm  with  Base  10 


n 

n\ 

log  (n!) 

n 

n\ 

log  («!) 

n 

n\ 

log  («!) 

i 

1 

0.000  000 

6 

720 

2.857  332 

ii 

39  916  800 

7.601  156 

2 

2 

0.301  030 

1 

5 040 

3.702  431 

12 

479  001  600 

8.680  337 

3 

6 

0.778  151 

8 

40  320 

4.605  521 

13 

6 227  020  800 

9.794  280 

4 

24 

1.380  211 

9 

362  880 

5.559  763 

14 

87  178  291  200 

10.940  408 

5 

120 

2.079  181 

10 

3 628  800 

6.559  763 

15 

1 307  674  368  000 

12.116  500 

Table  A4  Error  Function,  Sine  and  Cosine  Integrals  [see  (35),  (40),  (42)  in  App.  A3.1] 


X 

erf  x 

Si(x) 

ci(x) 

X 

erf  x 

Si(x) 

ci(x) 

0.0 

0.0000 

0.0000 

00 

2.0 

0.9953 

1.6054 

-0.4230 

0.2 

0.2227 

0.1996 

1.0422 

2.2 

0.9981 

1.6876 

-0.3751 

0.4 

0.4284 

0.3965 

0.3788 

2.4 

0.9993 

1.7525 

-0.3173 

0.6 

0.6039 

0.5881 

0.0223 

2.6 

0.9998 

1.8004 

-0.2533 

0.8 

0.7421 

0.7721 

-0.1983 

2.8 

0.9999 

1.8321 

-0.1865 

1.0 

0.8427 

0.9461 

-0.3374 

3.0 

1.0000 

1.8487 

-0.1196 

1.2 

0.9103 

1.1080 

-0.4205 

3.2 

1.0000 

1.8514 

-0.0553 

1.4 

0.9523 

1.2562 

-0.4620 

3.4 

1.0000 

1.8419 

0.0045 

1.6 

0.9763 

1.3892 

-0.4717 

3.6 

1.0000 

1.8219 

0.0580 

1.8 

0.9891 

1.5058 

-0.4568 

3.8 

1.0000 

1.7934 

0.1038 

2.0 

0.9953 

1.6054 

-0.4230 

4.0 

1.0000 

1.7582 

0.1410 
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Table  A5  Binomial  Distribution 


Probability  function  f(x ) [see  (2),  Sec.  24.7]  and  distribution  function  Fix) 


n 

X 

P 

/(*) 

= 0.1 

Fix) 

P 

m 

0.2 

Fix) 

P 

fix) 

= 0.3 
Fix) 

P 

fix) 

II 

o 

* 

P 

fix) 

= 0.5 
Fix) 

0. 

0. 

0. 

0. 

0. 

i 

0 

9000 

0.9000 

8000 

0.8000 

7000 

0.7000 

6000 

0.6000 

5000 

0.5000 

i 

1000 

1.0000 

2000 

1.0000 

3000 

1.0000 

4000 

1.0000 

5000 

1.0000 

0 

8100 

0.8100 

6400 

0.6400 

4900 

0.4900 

3600 

0.3600 

2500 

0.2500 

2 

1 

1800 

0.9900 

3200 

0.9600 

4200 

0.9100 

4800 

0.8400 

5000 

0.7500 

2 

0100 

1.0000 

0400 

1.0000 

0900 

1.0000 

1600 

1.0000 

2500 

1.0000 

0 

7290 

0.7290 

5120 

0.5120 

3430 

0.3430 

2160 

0.2160 

1250 

0.1250 

1 

2430 

0.9720 

3840 

0.8960 

4410 

0.7840 

4320 

0.6480 

3750 

0.5000 

J 

2 

0270 

0.9990 

0960 

0.9920 

1890 

0.9730 

2880 

0.9360 

3750 

0.8750 

3 

0010 

1.0000 

0080 

1.0000 

0270 

1.0000 

0640 

1.0000 

1250 

1.0000 

0 

6561 

0.6561 

4096 

0.4096 

2401 

0.2401 

1296 

0.1296 

0625 

0.0625 

1 

2916 

0.9477 

4096 

0.8192 

4116 

0.6517 

3456 

0.4752 

2500 

0.3125 

4 

2 

0486 

0.9963 

1536 

0.9728 

2646 

0.9163 

3456 

0.8208 

3750 

0.6875 

3 

0036 

0.9999 

0256 

0.9984 

0756 

0.9919 

1536 

0.9744 

2500 

0.9375 

4 

0001 

1.0000 

0016 

1.0000 

0081 

1.0000 

0256 

1.0000 

0625 

1.0000 

0 

5905 

0.5905 

3277 

0.3277 

1681 

0.1681 

0778 

0.0778 

0313 

0.0313 

1 

3281 

0.9185 

4096 

0.7373 

3602 

0.5282 

2592 

0.3370 

1563 

0.1875 

2 

0729 

0.9914 

2048 

0.9421 

3087 

0.8369 

3456 

0.6826 

3125 

0.5000 

3 

0081 

0.9995 

0512 

0.9933 

1323 

0.9692 

2304 

0.9130 

3125 

0.8125 

4 

0005 

1.0000 

0064 

0.9997 

0284 

0.9976 

0768 

0.9898 

1563 

0.9688 

5 

0000 

1.0000 

0003 

1.0000 

0024 

1.0000 

0102 

1.0000 

0313 

1.0000 

0 

5314 

0.5314 

2621 

0.2621 

1176 

0.1176 

0467 

0.0467 

0156 

0.0156 

1 

3543 

0.8857 

3932 

0.6554 

3025 

0.4202 

1866 

0.2333 

0938 

0.1094 

2 

0984 

0.9841 

2458 

0.9011 

3241 

0.7443 

3110 

0.5443 

2344 

0.3438 

6 

3 

0146 

0.9987 

0819 

0.9830 

1852 

0.9295 

2765 

0.8208 

3125 

0.6563 

4 

0012 

0.9999 

0154 

0.9984 

0595 

0.9891 

1382 

0.9590 

2344 

0.8906 

5 

0001 

1.0000 

0015 

0.9999 

0102 

0.9993 

0369 

0.9959 

0938 

0.9844 

6 

0000 

1.0000 

0001 

1.0000 

0007 

1.0000 

0041 

1.0000 

0156 

1.0000 

0 

4783 

0.4783 

2097 

0.2097 

0824 

0.0824 

0280 

0.0280 

0078 

0.0078 

1 

3720 

0.8503 

3670 

0.5767 

2471 

0.3294 

1306 

0.1586 

0547 

0.0625 

2 

1240 

0.9743 

2753 

0.8520 

3177 

0.6471 

2613 

0.4199 

1641 

0.2266 

3 

0230 

0.9973 

1147 

0.9667 

2269 

0.8740 

2903 

0.7102 

2734 

0.5000 

4 

0026 

0.9998 

0287 

0.9953 

0972 

0.9712 

1935 

0.9037 

2734 

0.7734 

5 

0002 

1.0000 

0043 

0.9996 

0250 

0.9962 

0774 

0.9812 

1641 

0.9375 

6 

0000 

1.0000 

0004 

1.0000 

0036 

0.9998 

0172 

0.9984 

0547 

0.9922 

7 

0000 

1.0000 

0000 

1.0000 

0002 

1.0000 

0016 

1.0000 

0078 

1.0000 

0 

4305 

0.4305 

1678 

0.1678 

0576 

0.0576 

0168 

0.0168 

0039 

0.0039 

1 

3826 

0.8131 

3355 

0.5033 

1977 

0.2553 

0896 

0.1064 

0313 

0.0352 

2 

1488 

0.9619 

2936 

0.7969 

2965 

0.5518 

2090 

0.3154 

1094 

0.1445 

3 

0331 

0.9950 

1468 

0.9437 

2541 

0.8059 

2787 

0.5941 

2188 

0.3633 

8 

4 

0046 

0.9996 

0459 

0.9896 

1361 

0.9420 

2322 

0.8263 

2734 

0.6367 

5 

0004 

1.0000 

0092 

0.9988 

0467 

0.9887 

1239 

0.9502 

2188 

0.8555 

6 

0000 

1.0000 

0011 

0.9999 

0100 

0.9987 

0413 

0.9915 

1094 

0.9648 

7 

0000 

1.0000 

0001 

1.0000 

0012 

0.9999 

0079 

0.9993 

0313 

0.9961 

8 

0000 

1.0000 

0000 

1.0000 

0001 

1.0000 

0007 

1.0000 

0039 

1.0000 
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Table  A6  Poisson  Distribution 


Probability  function  f(x ) [see  (5),  Sec.  24.7]  and  distribution  function  Fix) 


X 

A 

fix) 

= 0.1 
Fix) 

A 

m 

= 0.2 
F(x) 

A 

fix) 

= 0.3 
Fix) 

A 

fix) 

= 0.4 
Fix) 

A 

fix) 

= 0.5 
Fix) 

0. 

0. 

0. 

0. 

0. 

0 

9048 

0.9048 

8187 

0.8187 

7408 

0.7408 

6703 

0.6703 

6065 

0.6065 

t 

0905 

0.9953 

1637 

0.9825 

2222 

0.9631 

2681 

0.9384 

3033 

0.9098 

2 

0045 

0.9998 

0164 

0.9989 

0333 

0.9964 

0536 

0.9921 

0758 

0.9856 

3 

0002 

1.0000 

0011 

0.9999 

0033 

0.9997 

0072 

0.9992 

0126 

0.9982 

4 

0000 

1.0000 

0001 

1.0000 

0003 

1.0000 

0007 

0.9999 

0016 

0.9998 

5 

0001 

1.0000 

0002 

1.0000 

X 

A 

fix) 

= 0.6 
Fix) 

A 

fix) 

= 0.7 
Fix) 

A 

fix) 

= 0.8 
Fix) 

A 

fix) 

= 0.9 
Fix) 

A 

fix) 

= 1 
Fix) 

0. 

0. 

0. 

0. 

0. 

0 

5488 

0.5488 

4966 

0.4966 

4493 

0.4493 

4066 

0.4066 

3679 

0.3679 

l 

3293 

0.8781 

3476 

0.8442 

3595 

0.8088 

3659 

0.7725 

3679 

0.7358 

2 

0988 

0.9769 

1217 

0.9659 

1438 

0.9526 

1647 

0.9371 

1839 

0.9197 

3 

0198 

0.9966 

0284 

0.9942 

0383 

0.9909 

0494 

0.9865 

0613 

0.9810 

4 

0030 

0.9996 

0050 

0.9992 

0077 

0.9986 

0111 

0.9977 

0153 

0.9963 

5 

0004 

1.0000 

0007 

0.9999 

0012 

0.9998 

0020 

0.9997 

0031 

0.9994 

6 

0001 

1.0000 

0002 

1.0000 

0003 

1.0000 

0005 

0.9999 

7 

0001 

1.0000 

X 

A 

fix) 

= 1.5 
Fix) 

A 

fix) 

= 2 
Fix) 

A 

fix) 

II 

■s  “ 

>2, 

A 

fix) 

2 

ii 

A 

fix) 

3 

to  ^ 

II 

0. 

0. 

0. 

0. 

0. 

0 

2231 

0.2231 

1353 

0.1353 

0498 

0.0498 

0183 

0.0183 

0067 

0.0067 

l 

3347 

0.5578 

2707 

0.4060 

1494 

0.1991 

0733 

0.0916 

0337 

0.0404 

2 

2510 

0.8088 

2707 

0.6767 

2240 

0.4232 

1465 

0.2381 

0842 

0.1247 

3 

1255 

0.9344 

1804 

0.8571 

2240 

0.6472 

1954 

0.4335 

1404 

0.2650 

4 

0471 

0.9814 

0902 

0.9473 

1680 

0.8153 

1954 

0.6288 

1755 

0.4405 

5 

0141 

0.9955 

0361 

0.9834 

1008 

0.9161 

1563 

0.7851 

1755 

0.6160 

6 

0035 

0.9991 

0120 

0.9955 

0504 

0.9665 

1042 

0.8893 

1462 

0.7622 

7 

0008 

0.9998 

0034 

0.9989 

0216 

0.9881 

0595 

0.9489 

1044 

0.8666 

8 

0001 

1.0000 

0009 

0.9998 

0081 

0.9962 

0298 

0.9786 

0653 

0.9319 

9 

0002 

1.0000 

0027 

0.9989 

0132 

0.9919 

0363 

0.9682 

10 

0008 

0.9997 

0053 

0.9972 

0181 

0.9863 

11 

0002 

0.9999 

0019 

0.9991 

0082 

0.9945 

12 

0001 

1.0000 

0006 

0.9997 

0034 

0.9980 

13 

0002 

0.9999 

0013 

0.9993 

14 

0001 

1.0000 

0005 

0.9998 

15 

0002 

0.9999 

16 

0000 

1.0000 
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Table  A7  Normal  Distribution 

Values  of  the  distribution  function  <f>(z)  [see  (3),  Sec.  24.8].  (T>(— z)  = 1 — <f>(z) 


z 

®(z) 

z 

®(z) 

z 

®(z) 

z 

®(z) 

z 

®(z) 

z 

®(z) 

0. 

0. 

0. 

0. 

0. 

0. 

0.01 

5040 

0.51 

6950 

1.01 

8438 

1.51 

9345 

2.01 

9778 

2.51 

9940 

0.02 

5080 

0.52 

6985 

1.02 

8461 

1.52 

9357 

2.02 

9783 

2.52 

9941 

0.03 

5120 

0.53 

7019 

1.03 

8485 

1.53 

9370 

2.03 

9788 

2.53 

9943 

0.04 

5160 

0.54 

7054 

1.04 

8508 

1.54 

9382 

2.04 

9793 

2.54 

9945 

0.05 

5199 

0.55 

7088 

1.05 

8531 

1.55 

9394 

2.05 

9798 

2.55 

9946 

0.06 

5239 

0.56 

7123 

1.06 

8554 

1.56 

9406 

2.06 

9803 

2.56 

9948 

0.07 

5279 

0.57 

7157 

1.07 

8577 

1.57 

9418 

2.07 

9808 

2.57 

9949 

0.08 

5319 

0.58 

7190 

1.08 

8599 

1.58 

9429 

2.08 

9812 

2.58 

9951 

0.09 

5359 

0.59 

7224 

1.09 

8621 

1.59 

9441 

2.09 

9817 

2.59 

9952 

0.10 

5398 

0.60 

7257 

1.10 

8643 

1.60 

9452 

2.10 

9821 

2.60 

9953 

0.11 

5438 

0.61 

7291 

1.11 

8665 

1.61 

9463 

2.11 

9826 

2.61 

9955 

0.12 

5478 

0.62 

7324 

1.12 

8686 

1.62 

9474 

2.12 

9830 

2.62 

9956 

0.13 

5517 

0.63 

7357 

1.13 

8708 

1.63 

9484 

2.13 

9834 

2.63 

9957 

0.14 

5557 

0.64 

7389 

1.14 

8729 

1.64 

9495 

2.14 

9838 

2.64 

9959 

0.15 

5596 

0.65 

7422 

1.15 

8749 

1.65 

9505 

2.15 

9842 

2.65 

9960 

0.16 

5636 

0.66 

7454 

1.16 

8770 

1.66 

9515 

2.16 

9846 

2.66 

9961 

0.17 

5675 

0.67 

7486 

1.17 

8790 

1.67 

9525 

2.17 

9850 

2.67 

9962 

0.18 

5714 

0.68 

7517 

1.18 

8810 

1.68 

9535 

2.18 

9854 

2.68 

9963 

0.19 

5753 

0.69 

7549 

1.19 

8830 

1.69 

9545 

2.19 

9857 

2.69 

9964 

0.20 

5793 

0.70 

7580 

1.20 

8849 

1.70 

9554 

2.20 

9861 

2.70 

9965 

0.21 

5832 

0.71 

7611 

1.21 

8869 

1.71 

9564 

2.21 

9864 

2.71 

9966 

0.22 

5871 

0.72 

7642 

1.22 

8888 

1.72 

9573 

2.22 

9868 

2.72 

9967 

0.23 

5910 

0.73 

7673 

1.23 

8907 

1.73 

9582 

2.23 

9871 

2.73 

9968 

0.24 

5948 

0.74 

7704 

1.24 

8925 

1.74 

9591 

2.24 

9875 

2.74 

9969 

0.25 

5987 

0.75 

7734 

1.25 

8944 

1.75 

9599 

2.25 

9878 

2.75 

9970 

0.26 

6026 

0.76 

7764 

1.26 

8962 

1.76 

9608 

2.26 

9881 

2.76 

9971 

0.27 

6064 

0.77 

7794 

1.27 

8980 

1.77 

9616 

2.27 

9884 

2.77 

9972 

0.28 

6103 

0.78 

7823 

1.28 

8997 

1.78 

9625 

2.28 

9887 

2.78 

9973 

0.29 

6141 

0.79 

7852 

1.29 

9015 

1.79 

9633 

2.29 

9890 

2.79 

9974 

0.30 

6179 

0.80 

7881 

1.30 

9032 

1.80 

9641 

2.30 

9893 

2.80 

9974 

0.31 

6217 

0.81 

7910 

1.31 

9049 

1.81 

9649 

2.31 

9896 

2.81 

9975 

0.32 

6255 

0.82 

7939 

1.32 

9066 

1.82 

9656 

2.32 

9898 

2.82 

9976 

0.33 

6293 

0.83 

7967 

1.33 

9082 

1.83 

9664 

2.33 

9901 

2.83 

9977 

0.34 

6331 

0.84 

7995 

1.34 

9099 

1.84 

9671 

2.34 

9904 

2.84 

9977 

0.35 

6368 

0.85 

8023 

1.35 

9115 

1.85 

9678 

2.35 

9906 

2.85 

9978 

0.36 

6406 

0.86 

8051 

1.36 

9131 

1.86 

9686 

2.36 

9909 

2.86 

9979 

0.37 

6443 

0.87 

8078 

1.37 

9147 

1.87 

9693 

2.37 

9911 

2.87 

9979 

0.38 

6480 

0.88 

8106 

1.38 

9162 

1.88 

9699 

2.38 

9913 

2.88 

9980 

0.39 

6517 

0.89 

8133 

1.39 

9177 

1.89 

9706 

2.39 

9916 

2.89 

9981 

0.40 

6554 

0.90 

8159 

1.40 

9192 

1.90 

9713 

2.40 

9918 

2.90 

9981 

0.41 

6591 

0.91 

8186 

1.41 

9207 

1.91 

9719 

2.41 

9920 

2.91 

9982 

0.42 

6628 

0.92 

8212 

1.42 

9222 

1.92 

9726 

2.42 

9922 

2.92 

9982 

0.43 

6664 

0.93 

8238 

1.43 

9236 

1.93 

9732 

2.43 

9925 

2.93 

9983 

0.44 

6700 

0.94 

8264 

1.44 

9251 

1.94 

9738 

2.44 

9927 

2.94 

9984 

0.45 

6736 

0.95 

8289 

1.45 

9265 

1.95 

9744 

2.45 

9929 

2.95 

9984 

0.46 

6772 

0.96 

8315 

1.46 

9279 

1.96 

9750 

2.46 

9931 

2.96 

9985 

0.47 

6808 

0.97 

8340 

1.47 

9292 

1.97 

9756 

2.47 

9932 

2.97 

9985 

0.48 

6844 

0.98 

8365 

1.48 

9306 

1.98 

9761 

2.48 

9934 

2.98 

9986 

0.49 

6879 

0.99 

8389 

1.49 

9319 

1.99 

9767 

2.49 

9936 

2.99 

9986 

0.50 

6915 

1.00 

8413 

1.50 

9332 

2.00 

9772 

2.50 

9938 

3.00 

9987 
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Table  A8  Normal  Distribution 

Values  of  z for  given  values  of  <f>(z)  [see  (3),  Sec.  24.8]  and  D(z ) = <l>(z)  — <E>(— z) 
Example:  z = 0.279  if  <J>(z)  = 61%;  z = 0.860  if  D(z)  = 61%. 


% 

Z(D) 

% 

z(<J>) 

z(D) 

% 

z(<K) 

Z(D) 

1 

-2.326 

0.013 

41 

-0.228 

0.539 

81 

0.878 

1.311 

2 

-2.054 

0.025 

42 

-0.202 

0.553 

82 

0.915 

1.341 

3 

-1.881 

0.038 

43 

-0.176 

0.568 

83 

0.954 

1.372 

4 

-1.751 

0.050 

44 

-0.151 

0.583 

84 

0.994 

1.405 

5 

-1.645 

0.063 

45 

-0.126 

0.598 

85 

1.036 

1.440 

6 

-1.555 

0.075 

46 

-0.100 

0.613 

86 

1.080 

1.476 

7 

-1.476 

0.088 

47 

-0.075 

0.628 

87 

1.126 

1.514 

8 

-1.405 

0.100 

48 

-0.050 

0.643 

88 

1.175 

1.555 

9 

-1.341 

0.113 

49 

-0.025 

0.659 

89 

1.227 

1.598 

10 

-1.282 

0.126 

50 

0.000 

0.674 

90 

1.282 

1.645 

11 

-1.227 

0.138 

51 

0.025 

0.690 

91 

1.341 

1.695 

12 

-1.175 

0.151 

52 

0.050 

0.706 

92 

1.405 

1.751 

13 

-1.126 

0.164 

53 

0.075 

0.722 

93 

1.476 

1.812 

14 

-1.080 

0.176 

54 

0.100 

0.739 

94 

1.555 

1.881 

15 

-1.036 

0.189 

55 

0.126 

0.755 

95 

1.645 

1.960 

16 

-0.994 

0.202 

56 

0.151 

0.772 

96 

1.751 

2.054 

17 

-0.954 

0.215 

57 

0.176 

0.789 

97 

1.881 

2.170 

18 

-0.915 

0.228 

58 

0.202 

0.806 

97.5 

1.960 

2.241 

19 

-0.878 

0.240 

59 

0.228 

0.824 

98 

2.054 

2.326 

20 

-0.842 

0.253 

60 

0.253 

0.842 

99 

2.326 

2.576 

21 

-0.806 

0.266 

61 

0.279 

0.860 

99.1 

2.366 

2.612 

22 

-0.772 

0.279 

62 

0.305 

0.878 

99.2 

2.409 

2.652 

23 

-0.739 

0.292 

63 

0.332 

0.896 

99.3 

2.457 

2.697 

24 

-0.706 

0.305 

64 

0.358 

0.915 

99.4 

2.512 

2.748 

25 

-0.674 

0.319 

65 

0.385 

0.935 

99.5 

2.576 

2.807 

26 

-0.643 

0.332 

66 

0.412 

0.954 

99.6 

2.652 

2.878 

27 

-0.613 

0.345 

67 

0.440 

0.974 

99.7 

2.748 

2.968 

28 

-0.583 

0.358 

68 

0.468 

0.994 

99.8 

2.878 

3.090 

29 

-0.553 

0.372 

69 

0.496 

1.015 

99.9 

3.090 

3.291 

30 

-0.524 

0.385 

70 

0.524 

1.036 

31 

-0.496 

0.399 

71 

0.553 

1.058 

99.91 

3.121 

3.320 

32 

-0.468 

0.412 

72 

0.583 

1.080 

99.92 

3.156 

3.353 

33 

-0.440 

0.426 

73 

0.613 

1.103 

99.93 

3.195 

3.390 

34 

-0.412 

0.440 

74 

0.643 

1.126 

99.94 

3.239 

3.432 

35 

-0.385 

0.454 

75 

0.674 

1.150 

99.95 

3.291 

3.481 

36 

-0.358 

0.468 

76 

0.706 

1.175 

99.96 

3.353 

3.540 

37 

-0.332 

0.482 

77 

0.739 

1.200 

99.97 

3.432 

3.615 

38 

-0.305 

0.496 

78 

0.772 

1.227 

99.98 

3.540 

3.719 

39 

-0.279 

0.510 

79 

0.806 

1.254 

99.99 

3.719 

3.891 

40 

-0.253 

0.524 

80 

0.842 

1.282 
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Table  A9  t-Distribution 


Values  of  z for  given  values  of  the  distribution  function  F(z)  (see  (8)  in  Sec.  25.3) 
Example:  For  9 degrees  of  freedom,  z = 1.83  when  F(z)  = 0.95. 


Number  of  Degrees  of  Freedom 

m 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

0.5 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.6 

0.32 

0.29 

0.28 

0.27 

0.27 

0.26 

0.26 

0.26 

0.26 

0.26 

0.7 

0.73 

0.62 

0.58 

0.57 

0.56 

0.55 

0.55 

0.55 

0.54 

0.54 

0.8 

1.38 

1.06 

0.98 

0.94 

0.92 

0.91 

0.90 

0.89 

0.88 

0.88 

0.9 

3.08 

1.89 

1.64 

1.53 

1.48 

1.44 

1.41 

1.40 

1.38 

1.37 

0.95 

6.31 

2.92 

2.35 

2.13 

2.02 

1.94 

1.89 

1.86 

1.83 

1.81 

0.975 

12.7 

4.30 

3.18 

2.78 

2.57 

2.45 

2.36 

2.31 

2.26 

2.23 

0.99 

31.8 

6.96 

4.54 

3.75 

3.36 

3.14 

3.00 

2.90 

2.82 

2.76 

0.995 

63.7 

9.92 

5.84 

4.60 

4.03 

3.71 

3.50 

3.36 

3.25 

3.17 

0.999 

318.3 

22.3 

10.2 

7.17 

5.89 

5.21 

4.79 

4.50 

4.30 

4.14 

m 

11 

12 

13 

Number 

14 

of  Degree 
15 

s of  Free 
16 

lorn 

17 

18 

19 

20 

0.5 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.6 

0.26 

0.26 

0.26 

0.26 

0.26 

0.26 

0.26 

0.26 

0.26 

0.26 

0.7 

0.54 

0.54 

0.54 

0.54 

0.54 

0.54 

0.53 

0.53 

0.53 

0.53 

0.8 

0.88 

0.87 

0.87 

0.87 

0.87 

0.86 

0.86 

0.86 

0.86 

0.86 

0.9 

1.36 

1.36 

1.35 

1.35 

1.34 

1.34 

1.33 

1.33 

1.33 

1.33 

0.95 

1.80 

1.78 

1.77 

1.76 

1.75 

1.75 

1.74 

1.73 

1.73 

1.72 

0.975 

2.20 

2.18 

2.16 

2.14 

2.13 

2.12 

2.11 

2.10 

2.09 

2.09 

0.99 

2.72 

2.68 

2.65 

2.62 

2.60 

2.58 

2.57 

2.55 

2.54 

2.53 

0.995 

3.11 

3.05 

3.01 

2.98 

2.95 

2.92 

2.90 

2.88 

2.86 

2.85 

0.999 

4.02 

3.93 

3.85 

3.79 

3.73 

3.69 

3.65 

3.61 

3.58 

3.55 

Number  of  Degrees  of  Freedom 

m 

22 

24 

26 

28 

30 

40 

50 

100 

200 

00 

0.5 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0.6 

0.26 

0.26 

0.26 

0.26 

0.26 

0.26 

0.25 

0.25 

0.25 

0.25 

0.7 

0.53 

0.53 

0.53 

0.53 

0.53 

0.53 

0.53 

0.53 

0.53 

0.52 

0.8 

0.86 

0.86 

0.86 

0.85 

0.85 

0.85 

0.85 

0.85 

0.84 

0.84 

0.9 

1.32 

1.32 

1.31 

1.31 

1.31 

1.30 

1.30 

1.29 

1.29 

1.28 

0.95 

1.72 

1.71 

1.71 

1.70 

1.70 

1.68 

1.68 

1.66 

1.65 

1.65 

0.975 

2.07 

2.06 

2.06 

2.05 

2.04 

2.02 

2.01 

1.98 

1.97 

1.96 

0.99 

2.51 

2.49 

2.48 

2.47 

2.46 

2.42 

2.40 

2.36 

2.35 

2.33 

0.995 

2.82 

2.80 

2.78 

2.76 

2.75 

2.70 

2.68 

2.63 

2.60 

2.58 

0.999 

3.50 

3.47 

3.43 

3.41 

3.39 

3.31 

3.26 

3.17 

3.13 

3.09 
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Table  A1(  Chi-square  Distribution 

Values  of  x for  given  values  of  the  distribution  function  F(z)  (see  Sec.  25.3  before  (17)). 
Example:  For  3 degrees  of  freedom,  z = 11.34  when  F(z)  = 0.99. 


Number  of  Degrees  of  Freedom 

m 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

0.005 

0.00 

0.01 

0.07 

0.21 

0.41 

0.68 

0.99 

1.34 

1.73 

2.16 

0.01 

0.00 

0.02 

0.11 

0.30 

0.55 

0.87 

1.24 

1.65 

2.09 

2.56 

0.025 

0.00 

0.05 

0.22 

0.48 

0.83 

1.24 

1.69 

2.18 

2.70 

3.25 

0.05 

0.00 

0.10 

0.35 

0.71 

1.15 

1.64 

2.17 

2.73 

3.33 

3.94 

0.95 

3.84 

5.99 

7.81 

9.49 

11.07 

12.59 

14.07 

15.51 

16.92 

18.31 

0.975 

5.02 

7.38 

9.35 

11.14 

12.83 

14.45 

16.01 

17.53 

19.02 

20.48 

0.99 

6.63 

9.21 

11.34 

13.28 

15.09 

16.81 

18.48 

20.09 

21.67 

23.21 

0.995 

7.88 

10.60 

12.84 

14.86 

16.75 

18.55 

20.28 

21.95 

23.59 

25.19 

Number  of  Degrees  of  Freedom 

F(z) 

11 

12 

13 

14 

15 

16 

17 

18 

19 

20 

0.005 

2.60 

3.07 

3.57 

4.07 

4.60 

5.14 

5.70 

6.26 

6.84 

7.43 

0.01 

3.05 

3.57 

4.11 

4.66 

5.23 

5.81 

6.41 

7.01 

7.63 

8.26 

0.025 

3.82 

4.40 

5.01 

5.63 

6.26 

6.91 

7.56 

8.23 

8.91 

9.59 

0.05 

4.57 

5.23 

5.89 

6.57 

7.26 

7.96 

8.67 

9.39 

10.12 

10.85 

0.95 

19.68 

21.03 

22.36 

23.68 

25.00 

26.30 

27.59 

28.87 

30.14 

31.41 

0.975 

21.92 

23.34 

24.74 

26.12 

27.49 

28.85 

30.19 

31.53 

32.85 

34.17 

0.99 

24.72 

26.22 

27.69 

29.14 

30.58 

32.00 

33.41 

34.81 

36.19 

37.57 

0.995 

26.76 

28.30 

29.82 

31.32 

32.80 

34.27 

35.72 

37.16 

38.58 

40.00 

Number  of  Degrees  of  Freedom 

m 

21 

22 

23 

24 

25 

26 

27 

28 

29 

30 

0.005 

8.0 

8.6 

9.3 

9.9 

10.5 

11.2 

11.8 

12.5 

13.1 

13.8 

0.01 

8.9 

9.5 

10.2 

10.9 

11.5 

12.2 

12.9 

13.6 

14.3 

15.0 

0.025 

10.3 

11.0 

11.7 

12.4 

13.1 

13.8 

14.6 

15.3 

16.0 

16.8 

0.05 

11.6 

12.3 

13.1 

13.8 

14.6 

15.4 

16.2 

16.9 

17.7 

18.5 

0.95 

32.7 

33.9 

35.2 

36.4 

37.7 

38.9 

40.1 

41.3 

42.6 

43.8 

0.975 

35.5 

36.8 

38.1 

39.4 

40.6 

41.9 

43.2 

44.5 

45.7 

47.0 

0.99 

38.9 

40.3 

41.6 

43.0 

44.3 

45.6 

47.0 

48.3 

49.6 

50.9 

0.995 

41.4 

42.8 

44.2 

45.6 

46.9 

48.3 

49.6 

51.0 

52.3 

53.7 

Number  of  Degrees  of  Freedom 

m 

40 

50 

60 

70 

80 

90 

100 

>100  (Approximation) 

0.005 

20.7 

28.0 

35.5 

43.3 

51.2 

59.2 

67.3 

\(h  - 2.58)2 

0.01 

22.2 

29.7 

37.5 

45.4 

53.5 

61.8 

70.1 

\(h  - 2.33 f 

0.025 

24.4 

32.4 

40.5 

48.8 

57.2 

65.6 

74.2 

\{h  - 1.96)2 

0.05 

26.5 

34.8 

43.2 

51.7 

60.4 

69.1 

77.9 

5 (*  - 1 -64)2 

0.95 

55.8 

67.5 

79.1 

90.5 

101.9 

113.1 

124.3 

+ 1.64  f 

0.975 

59.3 

71.4 

83.3 

95.0 

106.6 

118.1 

129.6 

l(h  + 1.96  )2 

0.99 

63.7 

76.2 

88.4 

100.4 

112.3 

124.1 

135.8 

\(h  + 2.33)2 

0.995 

66.8 

79.5 

92.0 

104.2 

116.3 

128.3 

140.2 

\(h  + 2.58)2 

In  the  last  column,  h = V2 m — I,  where  m is  the  number  of  degrees  of  freedom. 
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Table  All  F-Distribution  with  ( m , n)  Degrees  of  Freedom 

Values  of  z for  which  the  distribution  function  F(z)  [see  (13),  Sec.  25.4]  has  the  value  0.95 
Example:  For  (7,  4)  d.f.,  z = 6.09  if  F(z)  = 0.95. 


n 

m = 1 

m = 2 

m = 3 

m = 4 

m = 5 

m = 6 

m = 7 

m = 8 

m = 9 

i 

161 

200 

216 

225 

230 

234 

237 

239 

241 

2 

18.5 

19.0 

19.2 

19.2 

19.3 

19.3 

19.4 

19.4 

19.4 

3 

10.1 

9.55 

9.28 

9.12 

9.01 

8.94 

8.89 

8.85 

8.81 

4 

7.71 

6.94 

6.59 

6.39 

6.26 

6.16 

6.09 

6.04 

6.00 

5 

6.61 

5.79 

5.41 

5.19 

5.05 

4.95 

4.88 

4.82 

4.77 

6 

5.99 

5.14 

4.76 

4.53 

4.39 

4.28 

4.21 

4.15 

4.10 

7 

5.59 

4.74 

4.35 

4.12 

3.97 

3.87 

3.79 

3.73 

3.68 

8 

5.32 

4.46 

4.07 

3.84 

3.69 

3.58 

3.50 

3.44 

3.39 

9 

5.12 

4.26 

3.86 

3.63 

3.48 

3.37 

3.29 

3.23 

3.18 

10 

4.96 

4.10 

3.71 

3.48 

3.33 

3.22 

3.14 

3.07 

3.02 

11 

4.84 

3.98 

3.59 

3.36 

3.20 

3.09 

3.01 

2.95 

2.90 

12 

4.75 

3.89 

3.49 

3.26 

3.11 

3.00 

2.91 

2.85 

2.80 

13 

4.67 

3.81 

3.41 

3.18 

3.03 

2.92 

2.83 

2.77 

2.71 

14 

4.60 

3.74 

3.34 

3.11 

2.96 

2.85 

2.76 

2.70 

2.65 

15 

4.54 

3.68 

3.29 

3.06 

2.90 

2.79 

2.71 

2.64 

2.59 

16 

4.49 

3.63 

3.24 

3.01 

2.85 

2.74 

2.66 

2.59 

2.54 

17 

4.45 

3.59 

3.20 

2.96 

2.81 

2.70 

2.61 

2.55 

2.49 

18 

4.41 

3.55 

3.16 

2.93 

2.77 

2.66 

2.58 

2.51 

2.46 

19 

4.38 

3.52 

3.13 

2.90 

2.74 

2.63 

2.54 

2.48 

2.42 

20 

4.35 

3.49 

3.10 

2.87 

2.71 

2.60 

2.51 

2.45 

2.39 

22 

4.30 

3.44 

3.05 

2.82 

2.66 

2.55 

2.46 

2.40 

2.34 

24 

4.26 

3.40 

3.01 

2.78 

2.62 

2.51 

2.42 

2.36 

2.30 

26 

4.23 

3.37 

2.98 

2.74 

2.59 

2.47 

2.39 

2.32 

2.27 

28 

4.20 

3.34 

2.95 

2.71 

2.56 

2.45 

2.36 

2.29 

2.24 

30 

4.17 

3.32 

2.92 

2.69 

2.53 

2.42 

2.33 

2.27 

2.21 

32 

4.15 

3.29 

2.90 

2.67 

2.51 

2.40 

2.31 

2.24 

2.19 

34 

4.13 

3.28 

2.88 

2.65 

2.49 

2.38 

2.29 

2.23 

2.17 

36 

4.11 

3.26 

2.87 

2.63 

2.48 

2.36 

2.28 

2.21 

2.15 

38 

4.10 

3.24 

2.85 

2.62 

2.46 

2.35 

2.26 

2.19 

2.14 

40 

4.08 

3.23 

2.84 

2.61 

2.45 

2.34 

2.25 

2.18 

2.12 

50 

4.03 

3.18 

2.79 

2.56 

2.40 

2.29 

2.20 

2.13 

2.07 

60 

4.00 

3.15 

2.76 

2.53 

2.37 

2.25 

2.17 

2.10 

2.04 

70 

3.98 

3.13 

2.74 

2.50 

2.35 

2.23 

2.14 

2.07 

2.02 

80 

3.96 

3.11 

2.72 

2.49 

2.33 

2.21 

2.13 

2.06 

2.00 

90 

3.95 

3.10 

2.71 

2.47 

2.32 

2.20 

2.11 

2.04 

1.99 

100 

3.94 

3.09 

2.70 

2.46 

2.31 

2.19 

2.10 

2.03 

1.97 

150 

3.90 

3.06 

2.66 

2.43 

2.27 

2.16 

2.07 

2.00 

1.94 

200 

3.89 

3.04 

2.65 

2.42 

2.26 

2.14 

2.06 

1.98 

1.93 

1000 

3.85 

3.00 

2.61 

2.38 

2.22 

2.11 

2.02 

1.95 

1.89 

00 

3.84 

3.00 

2.60 

2.37 

2.21 

2.10 

2.01 

1.94 

1.88 
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Table  All  F-Distribution  with  ( m , n)  Degrees  of  Freedom  (continued) 

Values  of  z for  which  the  distribution  function  F(z)  [see  (13),  Sec.  25.4]  has  the  value  0.95 


n 

m = 10 

m =15 

m = 20 

in  = 30 

O 

'Tf 

II 

£ 

m = 50 

O 

O 

II 

00 

i 

242 

246 

248 

250 

251 

252 

253 

254 

2 

19.4 

19.4 

19.4 

19.5 

19.5 

19.5 

19.5 

19.5 

3 

8.79 

8.70 

8.66 

8.62 

8.59 

8.58 

8.55 

8.53 

4 

5.96 

5.86 

5.80 

5.75 

5.72 

5.70 

5.66 

5.63 

5 

4.74 

4.62 

4.56 

4.50 

4.46 

4.44 

4.41 

4.37 

6 

4.06 

3.94 

3.87 

3.81 

3.77 

3.75 

3.71 

3.67 

7 

3.64 

3.51 

3.44 

3.38 

3.34 

3.32 

3.27 

3.23 

8 

3.35 

3.22 

3.15 

3.08 

3.04 

3.02 

2.97 

2.93 

9 

3.14 

3.01 

2.94 

2.86 

2.83 

2.80 

2.76 

2.71 

10 

2.98 

2.85 

2.77 

2.70 

2.66 

2.64 

2.59 

2.54 

11 

2.85 

2.72 

2.65 

2.57 

2.53 

2.51 

2.46 

2.40 

12 

2.75 

2.62 

2.54 

2.47 

2.43 

2.40 

2.35 

2.30 

13 

2.67 

2.53 

2.46 

2.38 

2.34 

2.31 

2.26 

2.21 

14 

2.60 

2.46 

2.39 

2.31 

2.27 

2.24 

2.19 

2.13 

15 

2.54 

2.40 

2.33 

2.25 

2.20 

2.18 

2.12 

2.07 

16 

2.49 

2.35 

2.28 

2.19 

2.15 

2.12 

2.07 

2.01 

17 

2.45 

2.31 

2.23 

2.15 

2.10 

2.08 

2.02 

1.96 

18 

2.41 

2.27 

2.19 

2.11 

2.06 

2.04 

1.98 

1.92 

19 

2.38 

2.23 

2.16 

2.07 

2.03 

2.00 

1.94 

1.88 

20 

2.35 

2.20 

2.12 

2.04 

1.99 

1.97 

1.91 

1.84 

22 

2.30 

2.15 

2.07 

1.98 

1.94 

1.91 

1.85 

1.78 

24 

2.25 

2.11 

2.03 

1.94 

1.89 

1.86 

1.80 

1.73 

26 

2.22 

2.07 

1.99 

1.90 

1.85 

1.82 

1.76 

1.69 

28 

2.19 

2.04 

1.96 

1.87 

1.82 

1.79 

1.73 

1.65 

30 

2.16 

2.01 

1.93 

1.84 

1.79 

1.76 

1.70 

1.62 

32 

2.14 

1.99 

1.91 

1.82 

1.77 

1.74 

1.67 

1.59 

34 

2.12 

1.97 

1.89 

1.80 

1.75 

1.71 

1.65 

1.57 

36 

2.11 

1.95 

1.87 

1.78 

1.73 

1.69 

1.62 

1.55 

38 

2.09 

1.94 

1.85 

1.76 

1.71 

1.68 

1.61 

1.53 

40 

2.08 

1.92 

1.84 

1.74 

1.69 

1.66 

1.59 

1.51 

50 

2.03 

1.87 

1.78 

1.69 

1.63 

1.60 

1.52 

1.44 

60 

1.99 

1.84 

1.75 

1.65 

1.59 

1.56 

1.48 

1.39 

70 

1.97 

1.81 

1.72 

1.62 

1.57 

1.53 

1.45 

1.35 

80 

1.95 

1.79 

1.70 

1.60 

1.54 

1.51 

1.43 

1.32 

90 

1.94 

1.78 

1.69 

1.59 

1.53 

1.49 

1.41 

1.30 

100 

1.93 

1.77 

1.68 

1.57 

1.52 

1.48 

1.39 

1.28 

150 

1.89 

1.73 

1.64 

1.54 

1.48 

1.44 

1.34 

1.22 

200 

1.88 

1.72 

1.62 

1.52 

1.46 

1.41 

1.32 

1.19 

1000 

1.84 

1.68 

1.58 

1.47 

1.41 

1.36 

1.26 

1.08 

00 

1.83 

1.67 

1.57 

1.46 

1.39 

1.35 

1.24 

1.00 
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Table  All  F-Distribution  with  ( m , n)  Degrees  of  Freedom  (continued) 

Values  of  z for  which  the  distribution  function  F(z)  [see  (13),  Sec.  25.4]  has  the  value  0.99 


n 

m = 1 

m = 2 

m = 3 

m = 4 

m = 5 

m = 6 

m = 7 

m = 8 

m~  9 

i 

4052 

4999 

5403 

5625 

5764 

5859 

5928 

5981 

6022 

2 

98.5 

99.0 

99.2 

99.2 

99.3 

99.3 

99.4 

99.4 

99.4 

3 

34.1 

30.8 

29.5 

28.7 

28.2 

27.9 

27.7 

27.5 

27.3 

4 

21.2 

18.0 

16.7 

16.0 

15.5 

15.2 

15.0 

14.8 

14.7 

5 

16.3 

13.3 

12.1 

11.4 

11.0 

10.7 

10.5 

10.3 

10.2 

6 

13.7 

10.9 

9.78 

9.15 

8.75 

8.47 

8.26 

8.10 

7.98 

7 

12.2 

9.55 

8.45 

7.85 

7.46 

7.19 

6.99 

6.84 

6.72 

8 

11.3 

8.65 

7.59 

7.01 

6.63 

6.37 

6.18 

6.03 

5.91 

9 

10.6 

8.02 

6.99 

6.42 

6.06 

5.80 

5.61 

5.47 

5.35 

10 

10.0 

7.56 

6.55 

5.99 

5.64 

5.39 

5.20 

5.06 

4.94 

11 

9.65 

7.21 

6.22 

5.67 

5.32 

5.07 

4.89 

4.74 

4.63 

12 

9.33 

6.93 

5.95 

5.41 

5.06 

4.82 

4.64 

4.50 

4.39 

13 

9.07 

6.70 

5.74 

5.21 

4.86 

4.62 

4.44 

4.30 

4.19 

14 

8.86 

6.51 

5.56 

5.04 

4.69 

4.46 

4.28 

4.14 

4.03 

15 

8.68 

6.36 

5.42 

4.89 

4.56 

4.32 

4.14 

4.00 

3.89 

16 

8.53 

6.23 

5.29 

4.77 

4.44 

4.20 

4.03 

3.89 

3.78 

17 

8.40 

6.11 

5.18 

4.67 

4.34 

4.10 

3.93 

3.79 

3.68 

18 

8.29 

6.01 

5.09 

4.58 

4.25 

4.01 

3.84 

3.71 

3.60 

19 

8.18 

5.93 

5.01 

4.50 

4.17 

3.94 

3.77 

3.63 

3.52 

20 

8.10 

5.85 

4.94 

4.43 

4.10 

3.87 

3.70 

3.56 

3.46 

22 

7.95 

5.72 

4.82 

4.31 

3.99 

3.76 

3.59 

3.45 

3.35 

24 

7.82 

5.61 

4.72 

4.22 

3.90 

3.67 

3.50 

3.36 

3.26 

26 

7.72 

5.53 

4.64 

4.14 

3.82 

3.59 

3.42 

3.29 

3.18 

28 

7.64 

5.45 

4.57 

4.07 

3.75 

3.53 

3.36 

3.23 

3.12 

30 

7.56 

5.39 

4.51 

4.02 

3.70 

3.47 

3.30 

3.17 

3.07 

32 

7.50 

5.34 

4.46 

3.97 

3.65 

3.43 

3.26 

3.13 

3.02 

34 

7.44 

5.29 

4.42 

3.93 

3.61 

3.39 

3.22 

3.09 

2.98 

36 

7.40 

5.25 

4.38 

3.89 

3.57 

3.35 

3.18 

3.05 

2.95 

38 

7.35 

5.21 

4.34 

3.86 

3.54 

3.32 

3.15 

3.02 

2.92 

40 

7.31 

5.18 

4.31 

3.83 

3.51 

3.29 

3.12 

2.99 

2.89 

50 

7.17 

5.06 

4.20 

3.72 

3.41 

3.19 

3.02 

2.89 

2.78 

60 

7.08 

4.98 

4.13 

3.65 

3.34 

3.12 

2.95 

2.82 

2.72 

70 

7.01 

4.92 

4.07 

3.60 

3.29 

3.07 

2.91 

2.78 

2.67 

80 

6.96 

4.88 

4.04 

3.56 

3.26 

3.04 

2.87 

2.74 

2.64 

90 

6.93 

4.85 

4.01 

3.54 

3.23 

3.01 

2.84 

2.72 

2.61 

100 

6.90 

4.82 

3.98 

3.51 

3.21 

2.99 

2.82 

2.69 

2.59 

150 

6.81 

4.75 

3.91 

3.45 

3.14 

2.92 

2.76 

2.63 

2.53 

200 

6.76 

4.71 

3.88 

3.41 

3.11 

2.89 

2.73 

2.60 

2.50 

1000 

6.66 

4.63 

3.80 

3.34 

3.04 

2.82 

2.66 

2.53 

2.43 

00 

6.63 

4.61 

3.78 

3.32 

3.02 

2.80 

2.64 

2.51 

2.41 
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Table  All  F-Distribution  with  ( m , n)  Degrees  of  Freedom  (continued) 

Values  of  z for  which  the  distribution  function  F(z)  [see  (13),  Sec.  25.4]  has  the  value  0.99 


n 

m = 10 

m =15 

m = 20 

in  = 30 

m = 40 

m = 50 

O 

O 

II 

00 

i 

6056 

6157 

6209 

6261 

6287 

6303 

6334 

6366 

2 

99.4 

99.4 

99.4 

99.5 

99.5 

99.5 

99.5 

99.5 

3 

27.2 

26.9 

26.7 

26.5 

26.4 

26.4 

26.2 

26.1 

4 

14.5 

14.2 

14.0 

13.8 

13.7 

13.7 

13.6 

13.5 

5 

10.1 

9.72 

9.55 

9.38 

9.29 

9.24 

9.13 

9.02 

6 

7.87 

7.56 

7.40 

7.23 

7.14 

7.09 

6.99 

6.88 

7 

6.62 

6.31 

6.16 

5.99 

5.91 

5.86 

5.75 

5.65 

8 

5.81 

5.52 

5.36 

5.20 

5.12 

5.07 

4.96 

4.86 

9 

5.26 

4.96 

4.81 

4.65 

4.57 

4.52 

4.42 

4.31 

10 

4.85 

4.56 

4.41 

4.25 

4.17 

4.12 

4.01 

3.91 

11 

4.54 

4.25 

4.10 

3.94 

3.86 

3.81 

3.71 

3.60 

12 

4.30 

4.01 

3.86 

3.70 

3.62 

3.57 

3.47 

3.36 

13 

4.10 

3.82 

3.66 

3.51 

3.43 

3.38 

3.27 

3.17 

14 

3.94 

3.66 

3.51 

3.35 

3.27 

3.22 

3.11 

3.00 

15 

3.80 

3.52 

3.37 

3.21 

3.13 

3.08 

2.98 

2.87 

16 

3.69 

3.41 

3.26 

3.10 

3.02 

2.97 

2.86 

2.75 

17 

3.59 

3.31 

3.16 

3.00 

2.92 

2.87 

2.76 

2.65 

18 

3.51 

3.23 

3.08 

2.92 

2.84 

2.78 

2.68 

2.57 

19 

3.43 

3.15 

3.00 

2.84 

2.76 

2.71 

2.60 

2.49 

20 

3.37 

3.09 

2.94 

2.78 

2.69 

2.64 

2.54 

2.42 

22 

3.26 

2.98 

2.83 

2.67 

2.58 

2.53 

2.42 

2.31 

24 

3.17 

2.89 

2.74 

2.58 

2.49 

2.44 

2.33 

2.21 

26 

3.09 

2.81 

2.66 

2.50 

2.42 

2.36 

2.25 

2.13 

28 

3.03 

2.75 

2.60 

2.44 

2.35 

2.30 

2.19 

2.06 

30 

2.98 

2.70 

2.55 

2.39 

2.30 

2.25 

2.13 

2.01 

32 

2.93 

2.65 

2.50 

2.34 

2.25 

2.20 

2.08 

1.96 

34 

2.89 

2.61 

2.46 

2.30 

2.21 

2.16 

2.04 

1.91 

36 

2.86 

2.58 

2.43 

2.26 

2.18 

2.12 

2.00 

1.87 

38 

2.83 

2.55 

2.40 

2.23 

2.14 

2.09 

1.97 

1.84 

40 

2.80 

2.52 

2.37 

2.20 

2.11 

2.06 

1.94 

1.80 

50 

2.70 

2.42 

2.27 

2.10 

2.01 

1.95 

1.82 

1.68 

60 

2.63 

2.35 

2.20 

2.03 

1.94 

1.88 

1.75 

1.60 

70 

2.59 

2.31 

2.15 

1.98 

1.89 

1.83 

1.70 

1.54 

80 

2.55 

2.27 

2.12 

1.94 

1.85 

1.79 

1.65 

1.49 

90 

2.52 

2.24 

2.09 

1.92 

1.82 

1.76 

1.62 

1.46 

100 

2.50 

2.22 

2.07 

1.89 

1.80 

1.74 

1.60 

1.43 

150 

2.44 

2.16 

2.00 

1.83 

1.73 

1.66 

1.52 

1.33 

200 

2.41 

2.13 

1.97 

1.79 

1.69 

1.63 

1.48 

1.28 

1000 

2.34 

2.06 

1.90 

1.72 

1.61 

1.54 

1.38 

fill 

00 

2.32 

2.04 

1.88 

1.70 

1.59 

1.52 

1.36 

1.00 
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Table  A12  Distribution  Function  F(x)  — P(T  ' x)  of  the  Random  Variable  T in 
Section  25.8 


X 

n 

= 8 

0. 

2 

001 

3 

003 

4 

007 

5 

016 

6 

031 

7 

054 

8 

089 

9 

138 

10 

199 

11 

274 

12 

360 

13 

452 

X 

n 

=6 

0. 

0 

001 

l 

008 

2 

028 

3 

068 

4 

136 

5 

235 

6 

360 

7 

500 

X 

n 

=5 

0. 

0 

008 

l 

042 

2 

117 

3 

242 

4 

408 

X 

n 

=7 

0. 

1 

001 

2 

005 

3 

015 

4 

035 

5 

068 

6 

119 

7 

191 

8 

281 

9 

386 

10 

500 

n 

X 

=4 

0. 

0 

042 

l 

167 

2 

375 

X 

n 

=9 

0. 

4 

001 

5 

003 

6 

006 

7 

012 

8 

022 

9 

038 

10 

060 

11 

090 

12 

130 

13 

179 

14 

238 

15 

306 

16 

381 

17 

460 

X 

n 

= 10 

0. 

6 

001 

7 

002 

8 

005 

9 

008 

10 

014 

11 

023 

12 

036 

13 
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14 

078 

15 

108 

16 

146 

17 

190 

18 

242 

19 

300 

20 

364 

21 

431 

22 

500 

X 

n 

= 11 

0. 

8 

001 

9 

002 

10 

003 

11 

005 

12 

008 

13 

013 

14 

020 

15 

030 

16 

043 

17 

060 

18 

082 

19 

109 

20 

141 

21 

179 

22 

223 

23 

271 

24 

324 

25 

381 

26 

440 

27 

500 

n 

X 

=3 

0. 

0 

167 

l 

500 

X 

O 

II 

0. 

50 

001 

51 

002 

52 

002 

53 

003 

54 

004 

55 

005 

56 

006 

57 

007 

58 

008 

59 

010 

60 

012 

61 

014 

62 

017 

63 

020 

64 

023 

65 

027 

66 

032 

67 

037 

68 
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69 

049 

70 

056 

71 

064 

72 

073 

73 

082 

74 

093 

75 

104 

76 

117 

77 

130 

78 

144 

79 

159 

80 

176 

81 

193 

82 

211 

83 

230 

84 

250 

85 

271 

86 

293 

87 

315 

88 

339 

89 

362 

90 

387 

91 

411 

92 

436 

93 

462 

94 

487 

X 

n 

= 19 

0. 

43 

001 

44 

002 

45 

002 

46 

003 

47 

003 

48 

004 

49 

005 

50 

006 

51 

008 

52 

010 

53 

012 

54 

014 

55 

017 

56 

021 

57 

025 

58 

029 

59 

034 

60 

040 

61 

047 

62 

054 

63 

062 

64 

072 

65 

082 

66 

093 

67 

105 

68 

119 

69 

133 

70 

149 

71 

166 

72 

184 

73 

203 

74 

223 

75 

245 

76 

267 

77 

290 

78 

314 

79 

339 

80 

365 

81 

391 

82 

418 

83 

445 

84 

473 

85 

500 

X 

71 

= 18 

0. 

38 

001 

39 

002 

40 

003 

41 

003 

42 

004 

43 

005 

44 

007 

45 

009 

46 

Oil 

47 

013 

48 

016 

49 

020 

50 

024 

51 

029 

52 

034 

53 

041 

54 

048 

55 

056 

56 

066 

57 

076 

58 

088 

59 

100 

60 

115 

61 

130 

62 

147 

63 

165 

64 
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65 

205 

66 

227 

67 

250 

68 

275 

69 

300 

70 

327 

71 

354 

72 

383 

73 

411 

74 

441 

75 

470 

76 

500 

X 

n 

= 17 

0. 

32 

001 

33 

002 

34 

002 

35 

003 

36 

004 

37 

005 

38 

007 

39 

009 

40 

011 

41 

014 

42 

017 

43 

021 

44 

026 

45 

032 

46 

038 

47 

046 

48 

054 

49 

064 

50 

076 

51 

088 

52 

102 

53 

118 

54 

135 

55 

154 

56 

174 

57 

196 

58 

220 

59 

245 

60 

271 

61 

299 

62 

328 

63 

358 

64 

388 

65 

420 

66 

452 

67 

484 

X 

n 

= 16 

0. 

27 

001 

28 

002 

29 

002 

30 

003 

31 

004 

32 

006 

33 

008 

34 

010 

35 

013 

36 

016 

37 

021 

38 

026 

39 

032 

40 

039 

41 

048 

42 

058 

43 

070 

44 

083 

45 

097 

46 

114 

47 

133 

48 

153 

49 

175 

50 

199 

51 

225 

52 

253 

53 

282 

54 

313 

55 

345 

56 

378 

57 

412 

58 

447 

59 

482 

X 

n 

= 15 

0. 

23 

001 

24 

002 

25 

003 

26 

004 

27 

006 

28 

008 

29 

010 

30 

014 

31 

018 

32 

023 

33 

029 

34 

037 

35 

046 

36 

057 

37 

070 

38 

084 

39 

101 

40 

120 

41 

141 

42 

164 

43 

190 

44 

218 

45 

248 

46 

279 

47 

313 

48 

349 

49 

385 

50 

423 

51 

461 

52 

500 

X 

n 

= 14 

0. 

18 

001 

19 

002 

20 

002 

21 

003 

22 

005 

23 

007 

24 

010 

25 

013 

26 

018 

27 

024 

28 

031 

29 

040 

30 

051 

31 

063 

32 

079 

33 

096 

34 

117 

35 

140 

36 

165 

37 

194 

38 

225 

39 

259 

40 

295 

41 

334 

42 

374 

43 

415 

44 

457 

45 

500 

X 

n 

= 13 

0. 

14 

001 

15 

001 

16 

002 

17 

003 

18 

005 

19 

007 

20 

Oil 

21 

015 

22 

021 

23 

029 

24 

038 

25 

050 

26 

064 

27 

082 

28 

102 

29 

126 

30 

153 

31 

184 

32 

218 

33 

255 

34 

295 

35 

338 

36 

383 

37 

429 

38 

476 

X 

n 

= 12 

0. 

11 

001 

12 

002 

13 

003 

14 

004 

15 

007 

16 

010 

17 

016 

18 

022 

19 

031 

20 

043 

21 

058 

22 

076 

23 

098 

24 

125 

25 

155 

26 

190 

27 

230 

28 

273 

29 

319 

30 

369 

31 

420 

32 

473 
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75,  104,  106,  113 
second-order  homogeneous 

linear  ODEs,  50-52,  75, 
104 

systems  of  ODEs,  139 
standard,  314 

vector  spaces,  286,  311,  314 
Beats  (oscillation),  89 


Bellman,  Richard,  981n.3 
Bellman  equations,  981 
Bellman’s  principle,  980-981 
Bell-shaped  curve,  13,  574 
BEM,  see  Backward  Euler  method 
Benoulli,  Niklaus,  31n.7 
Bernoulli,  Daniel,  3 In. 7 
Bernoulli,  Jakob,  3 In. 7 
Bernoulli,  Johann,  3 In. 7 
Bernoulli  distribution,  1040.  See  also 
Binomial  distributions 
Bernoulli  equation,  45 
defined,  31 
linear  ODEs,  31-33 
Bernoulli’s  law  of  large  numbers, 
1051 

Bessel,  Friedrich  Wilhelm,  187n.6 
Bessel  functions,  167,  187-191,  202 
of  the  first  kind,  189-190 
with  half-integer  v,  193-194 
of  order  1 , 1 89 
of  order  v,  191 
orthogonality  of,  506 
of  the  second  kind: 

general  solution,  196-200 
of  order  v,  198-200 
table,  A97-A98 
of  the  third  kind,  200 
Bessel’s  equation,  167,  187-196, 

202 

Bessel  functions,  167,  187-191, 
196-200 

circular  membrane,  587 
general  solution,  194-200 
Bessel’s  inequality: 

for  Fourier  coefficients,  497 
orthogonal  series,  508-509 
Beta  function,  formula  for,  A67 
Bezier  curve,  827 
BFS  algorithms,  see  Breadth  First 
search  algorithms 
Bijective  mapping,  737n.l 
Binomial  coefficients: 

Newton’s  forward  difference 
formula,  816 

probability  theory,  1027-1028 
Binomial  distributions,  1039-1041, 
1061 

normal  approximation  of, 
1049-1050 

sampling  with  replacement  for, 
1042 
table,  A99 
Binomial  series,  696 
Binomial  theorem,  1029 
Bipartite  graphs,  1001-1006,  1008 
BISECT,  ALGORITHM,  A46 
Bisection  method,  807-808 
Bolzano,  Bernard,  A94n.3 


Bolzano-Weierstrass  theorem, 
A94-A95 

Bonnet,  Ossian,  180n.3 
Bonnet’s  recursion,  180 
Borda,  J.  C.,  16n.4 
Boundaries: 

ODEs,  39 
of  regions,  426n.2 
sets  in  complex  plane,  620 
Boundary  conditions: 

one-dimensional  heat  equation, 
559 

PDEs,  541,  605 
periodic,  501 

two-dimensional  wave  equation, 
577 

vibrating  string,  545-547 
Boundary  points,  426n.2 
Boundary  value  problem  (BVP),  499 
conformal  mapping  for,  763-767, 
A96 

first,  see  Dirichlet  problem 
mixed,  see  Mixed  boundary  value 
problem 

second,  see  Neumann  problem 
third,  see  Mixed  boundary  value 
problem 

two-dimensional  heat  equation, 
564 

Bounded  domains,  652 
Bounded  regions,  426n.2 
Bounded  sequence,  A93-A95 
Boxplots,  1013 
Boyle,  Robert,  19n.5 
Boyle-Mariotte’s  law  for  idea  gases, 
19 

Bragg,  Sir  William  Henry,  938n.5 
Bragg,  Sir  William  Lawrence,  938n.5 
Branch,  of  logarithm,  639 
Branch  cut,  of  logarithm,  639 
Branch  point  (Riemann  surfaces),  755 
Breadth  First  search  (BFS) 
algorithms,  977 
defined,  977,  998 
Moore’s,  977-980 
BVP,  see  Boundary  value  problem 


CAD  (computer-aided  design),  820 
Cancellation  laws,  306-307 
Canonical  form,  344 
Cantor,  Georg,  A72n.3 
Cantor-Dedekind  axiom,  A72n.3, 
A95n,4 
Capacity: 

cut  sets,  994 
networks,  991 
Cardano,  Girolamo,  608n.l 
Cardioid,  391,  437 
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Cartesian  coordinates: 
linear  element  in,  A75 
transformation  law,  A86-A87 
vector  product  in,  A83-A84 
writing,  A74 

Cartesian  coordinate  systems: 
complex  plane,  611 
left-handed,  369,  370,  A84 
right-handed,  368-369,  A83-A84 
in  space,  315,  356 
transformation  law  for  vector 
components,  A85-A86 
Cartesius,  Renatus,  356n.l 
Cauchy,  Augustin-Louis,  71n.4, 
625n.4,  683n.l 
Cauchy  determinant,  113 
Cauchy-Goursat  theorem,  see 

Cauchy’s  integral  theorem 
Cauchy-Hadamard  formula,  683 
Cauchy  principal  value,  727,  730 
Cauchy-Riemann  equations,  38,  642 
complex  analysis,  623-629 
proof  of,  A90-A91 
Cauchy-Schwarz  inequality,  363, 
871-782 

Cauchy’s  convergence  principle, 
674-675,  A93-A94 
Cauchy’s  inequality,  666 
Cauchy’s  integral  formula,  660-663, 
670 

Cauchy’s  integral  theorem,  652-660, 
669 

existence  of  indefinite  integral, 
656-658 

Goursat’s  proof  of,  A91-A93 
independence  of  path,  655 
for  multiply  connected  domains, 
658-659 

principle  of  deformation  of  path, 
656 

Cayley,  Arthur,  748n.2 
c-charts,  1092 
Center: 

as  critical  point,  144,  165 
of  a graph,  991 
of  power  series,  680 
Center  control  line  (CL),  1088 
Center  of  gravity,  of  mass  in  a 
region,  429 

Central  difference  notation,  819 
Central  limit  theorem,  1076 
Central  vertex,  991 
Centrifugal  force,  388 
Centripetal  acceleration,  387-388 
Chain  rules,  392-394 
Characteristics,  555 
Characteristics,  method  of,  555 
Characteristic  determinant,  of  a 

matrix,  129,  325,  326,  353,  877 


Characteristic  equation: 

matrices,  129,  325,  326,  353,  877 
PDEs,  555 

second-order  homogeneous  linear 
ODEs,  54 

Characteristic  matrix,  326 
Characteristic  polynomial,  325,  353, 
877 

Characteristic  values,  87,  324,  353. 

See  also  Eigenvalues 
Characteristic  vectors,  324,  877.  See 
also  eigenvectors 
Chebyshev,  Pafnuti,  504n.6 
Chebyshev  equation,  504 
Chebyshev  polynomials,  504 
Checkerboard  pattern  (determinants), 
294 

Chi-square  (x2)  distribution, 
1074-1076,  A104 
Chi-square  (x2)  test,  1096-1097, 

1113 

Choice  of  numeric  method,  for  matrix 
eigenvalue  problems,  879 
Cholesky,  Andre-Louis,  855n.3 
Cholesky’s  method,  855-856,  898 
Chopping,  error  caused  by,  792 
Chromatic  number,  1006 
Circle,  386 

Circle  of  convergence  (power  series), 
682 

Circulation,  of  flow,  467,  774 
CL  (center  control  line),  1088 
Clairaut  equation,  35 
Clamped  condition  (spline 
interpolation),  823 
Class  intervals,  1012 
Class  marks,  1012 
Closed  annulus,  619 
Closed  circular  disk,  619 
Closed  integration  formulas,  833,  838 
Closed  intervals,  A72n.3 
Closed  Newton-Cotes  formulas,  833 
Closed  paths,  414,  645,  975-976 
Closed  regions,  426n.2 
Closed  sets,  620 
Closed  trails,  975-976 
Closed  walks,  975-976 
CN  (Crank-Nicolson)  method, 
938-941 
Coefficients: 
binomial: 

Newton’s  forward  difference 
formula,  816 

probability  theory,  1027-1028 
constant: 

higher-order  homogeneous 
linear  ODEs,  111-116 
second-order  homogeneous 
linear  ODEs,  53-60 


Coefficients:  ( Cont .) 

second-order  nonhomogeneous 
linear  ODEs,  81 
systems  of  ODEs,  140-151 
correlation,  1108-1111,  1113 
Fourier,  476,  484,  538,  582-583 
of  kinetic  friction,  19 
of  linear  systems,  272,  845 
of  ODEs,  47 

higher-order  homogeneous 
linear  ODEs,  105 
second-order  homogeneous 

linear  ODEs,  53-60,  73 
second-order  nonhomogeneous 
linear  ODEs,  81-85 
series  of  ODEs,  168,  174 
variable,  167,  240-241 
of  power  series,  680 
regression,  1105,  1107-1108 
variable: 

Frobenius  method,  180-187 
Laplace  transforms  ODEs 
with,  240-241 
of  ODEs,  167,  240-241 
power  series  method,  167-175 
second-order  homogeneous 
linear  ODEs,  73 
Coefficient  matrices,  257,  273 
Hermitian  or  skew-Hermitian 
forms,  351 
linear  systems,  845 
quadratic  form,  343 
Cofactor  (determinants),  294 
Collatz,  Lothar,  883n.9 
Collatz  inclusion  theorem,  883-884 
Columns: 

determinants,  294 
matrix,  125,  257,  320 
Column  “sum”  norm,  861 
Column  vectors,  126 

matrices,  257,  284-285,  320 
rank  in  terms  of,  284-285 
Combinations  (probability  theory), 
1024,  1026-1027 
of  n things  taken  k at  a time 

without  repetitions,  1026 
of  n things  taken  k at  a time  with 
repetitions,  1026 
Combinatorial  optimization,  970, 
975-1008 

assignment  problems,  1001-1006 
flow  problems  in  networks, 
991-997 

cut  sets,  994-996 
flow  augmenting  paths, 
992-993 
paths,  992 

Ford-Fulkerson  algorithm  for 

maximum  flow,  998-1001 
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Combinatorial  optimization  ( Cont .) 
shortest  path  problems,  975-980 
Bellman’s  principle,  980-981 
complexity  of  algorithms, 
978-980 

Dijkstra’s  algorithm,  981-983 
Moore’s  BFS  algorithm, 
977-980 

shortest  spanning  trees: 

Greedy  algorithm,  984-988 
Prim's  algorithm,  988-991 
Commutation  (matrices),  271 
Complements: 
of  events,  1016 
of  sets  in  complex  plane,  620 
Complementation  rule, 

1020-1021 

Complete  bipartite  graphs,  1005 
Complete  graphs,  974 
Complete  matching,  1002 
Completeness  (orthogonal  series), 
508-509 

Complete  orthonormal  set,  508 
Complex  analysis,  607 

analytic  functions,  623-624 
Cauchy-Riemann  equations, 
623-629 

circles  and  disks,  619 
complex  functions,  620-623 
exponential,  630-633 
general  powers,  639-640 
hyperbolic,  635 
logarithm,  636-639 
trigonometric,  633-635 
complex  integration,  643-670 
Cauchy’s  integral  formula, 
660-663,  670 
Cauchy’s  integral  theorem, 
652-660,  669 
derivatives  of  analytic 

functions,  664-668 
Laurent  series,  708-719 
line  integrals,  643-652,  669 
power  series,  671-707 
residue  integration,  719-733 
complex  numbers,  608-619 
addition  of,  609,  610 
conjugate,  612 
defined,  608 
division  of,  610 
multiplication  of,  609,  610 
polar  form  of,  613-618 
subtraction  of,  610 
complex  plane,  611 
conformal  mapping,  736-757 
geometry  of  analytic  functions, 
737-742 
linear  fractional 

transformations, 

742-750 


Complex  analysis  (Cont.) 

Riemann  surfaces,  754-756 
by  trigonometric  and 

hyperbolic  analytic 
functions,  750-754 
half-planes,  619-620 
harmonic  functions,  628-629 
Laplace’s  equation,  628-629 
Laurent  series,  708-719,  734 
analytic  or  singular  at  infinity, 
718-719 

point  at  infinity,  718 
Riemann  sphere,  718 
singularities,  715-717 
zeros  of  analytic  functions,  717 
power  series,  168,  671-707 
convergence  behavior  of, 
680-682 

convergence  tests,  674-676, 
A93-A94 

functions  given  by,  685-690 
Maclaurin  series,  690 
in  powers  of  x,  168 
radius  of  convergence, 

682-684 

ratio  test,  676-678 
root  test,  678-679 
sequences,  671-673 
series,  673-674 
Taylor  series,  690-697 
uniform  convergence, 

698-705 

residue  integration,  719-733 
formulas  for  residues,  721-722 
of  real  integrals,  725-733 
several  singularities  inside 
contour,  723-725 
Taylor  series,  690-697,  707 
Complex  conjugate  numbers,  612 
Complex  conjugate  roots,  72-73 
Complex  Fourier  integral,  523 
Complex  functions,  620-623 
exponential,  630-633 
general  powers,  639-640 
hyperbolic,  635 
logarithm,  636-639 
trigonometric,  633-635 
Complex  heat  potential,  767 
Complex  integration,  643-670 
Cauchy’s  integral  formula, 
660-663,  670 
Cauchy’s  integral  theorem, 
652-660,  669 

existence  of  indefinite  integral, 
656-658 

independence  of  path,  655 
for  multiply  connected 
domains,  658-659 
principle  of  deformation  of 
path,  656 


Complex  integration  (Cont.) 

derivatives  of  analytic  functions, 
664-668 

Laurent  series,  708-719 

analytic  or  singular  at  infinity, 
718-719 

point  at  infinity,  718 
Riemann  sphere,  718 
singularities,  715-717 
zeros  of  analytic  functions, 
717-718 

line  integrals,  643-652,  669 
basic  properties  of,  645 
bounds  for,  650-651 
definition  of,  643-645 
existence  of,  646 
indefinite  integration  and 

substitution  of  limits, 

646- 647 

representation  of  a path, 

647- 650 

power  series,  671-707 

convergence  behavior  of, 
680-682 

convergence  tests,  674-676 
functions  given  by,  685-690 
Maclaurin  series,  690 
radius  of  convergence  of, 
682-684 

ratio  test,  676-678 
root  test,  678-679 
sequences,  671-673 
series,  673-674 
Taylor  series,  690-697 
uniform  convergence, 

698-705 

residue  integration,  719-733 
formulas  for  residues,  721-722 
of  real  integrals,  725-733 
several  singularities  inside 
contour,  723-725 

Complexity,  of  algorithms,  978-979 

Complex  line  integrals,  see  Line 
integrals 

Complex  matrices  and  forms, 
346-352 

Complex  numbers,  608-619,  641 
addition  of,  609,  610 
conjugate,  612 
defined,  608 
division  of,  610 
multiplication  of,  609,  610 
polar  form  of,  613-618 
subtraction  of,  610 

Complex  plane,  611 

extended,  718,  744-745 
sets  in,  620 

Complex  potential,  786 

electrostatic  fields,  760-761 
of  fluid  flow,  771,  773-774 
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Complex  roots: 

higher-order  homogeneous  linear 
ODEs: 
multiple,  115 
simple,  113-114 

second-order  homogeneous  linear 
ODEs,  57-59 

Complex  trigonometric  polynomials, 
529 

Complex  variables,  620-621 
Complex  vector  space,  309,  310, 

349 

Components  (vectors),  126,  356,  365 
Composition,  of  linear 

transfonnations,  316-317 
Computer-aided  design  (CAD),  820 
Condition: 

of  incompressibility,  405 
spline  interpolation,  823 
Conditionally  convergent  series,  675 
Conditional  probability,  1022-1023, 
1061 

Condition  number,  868-870,  899 
Confidence  intervals,  1063, 
1068-1077,  1113 
interval  estimates,  1065 
for  mean  of  normal  distribution: 
with  known  variance, 
1069-1071 

with  unknown  variance, 
1071-1073 

for  parameters  of  distributions 
other  than  normal,  1076 
in  regression  analysis,  1107-1108 
for  variance  of  a normal 

distribution,  1073-1076 
Confidence  level,  1068 
Conformality,  738 
Conformal  mapping,  736-757 
boundary  value  problems, 
763-767,  A96 
defined,  738 

geometry  of  analytic  functions, 
737-742 

linear  fractional  transformations, 
742-750 

extended  complex  plane, 
744-745 

mapping  standard  domains, 
747-750 

Riemann  surfaces,  754—756 
by  trigonometric  and  hyperbolic 
analytic  functions, 

750-754 

Connected  graphs,  977,  981,  984 
Connected  set,  in  complex  plane, 

620 

Conservative  physical  systems,  422 
Conservative  vector  fields,  400,  408 
Consistent  linear  systems,  277 


Constant  coefficients: 

higher-order  homogeneous  linear 
ODEs,  111-116 
distinct  real  roots,  1 12-113 
multiple  real  roots,  114-115 
simple  complex  roots,  113-114 
second-order  homogeneous  linear 
ODEs,  53-60 
complex  roots,  57-59 
real  double  root,  55-56 
two  distinct  real  roots,  54-55 
second-order  nonhomogeneous 
linear  ODEs,  81 
systems  of  ODEs,  140-151 
critical  points,  142-146, 
148-151 

graphing  solutions  in  phase 
plane,  141-142 

Constant  of  gravity,  at  the  Earth’s 
surface,  63 

Constant  of  integration,  18 
Constant  revenue,  lines  of,  954 
Constrained  (linear)  optimization, 

951,  954-958,  969 
normal  form  of  problems,  955-957 
simplex  method,  958-968 
degenerate  feasible  solution, 
962-965 

difficulties  in  starting,  965-968 
Constraints,  951 
Consumers,  1092 
Consumer’s  risk,  1094 
Consumption  matrix,  334 
Continuity  equation  (compressible 
fluid  flow),  405 

Continuous  complex  functions,  621 
Continuous  distributions,  1029, 
1032-1034 

marginal  distribution  of,  1055 
two-dimensional,  1053 
Continuous  random  variables,  1029, 
1032-1034,  1061 

Continuous  vector  functions,  378-379 
Contour  integral,  653 
Contour  lines,  21,  36 
Control  charts,  1088 
for  mean,  1088-1089 
for  range,  1090-1091 
for  standard  deviation,  1090 
for  variance,  1089-1090 
Controlled  variables,  in  regression 
analysis,  1103 
Control  limits,  1088,  1089 
Control  variables,  951 
Convergence: 
absolute: 

defined,  674 

and  uniform  convergence,  704 
of  approximate  and  exact 
solutions,  936 


Convergence:  ( Cont .) 
circle  of,  682 
defined,  861 

Gauss-Seidel  iteration,  861-862 
mean  square  (orthogonal  series), 
507-508 
in  the  norm,  507 
power  series,  680-682 

convergence  tests,  674-676, 
A93-A94 

radius  of  convergence  of, 
682-684,  706 

uniform  convergence,  698-705 
radius  of,  172 
defined,  172 

power  series,  682-684,  706 
sequence  of  vectors,  378 
speed  of  (numeric  analysis), 
804-805 
superlinear,  806 
uniform: 

and  absolute  convergence,  704 
power  series,  698-705 
Convergence  interval,  171,  683 
Convergence  tests,  674-676 

power  series,  674-676,  A93-A94 
uniform  convergence,  698-705 
Convergent  iteration  processes, 

800 

Convergent  sequence  of  functions, 
507-508,  672 

Convergent  series,  171,  673 
Convolution: 
defined,  232 

Fourier  transforms,  527-528 
Laplace  transforms,  232-237 
Convolution  theorem,  232-233 
Coriolis,  Gustave  Gaspard,  389n.3 
Coriolis  acceleration,  388-389 
Corrector  (improved  Euler  method), 
903 

Correlation  analysis,  1063, 
1108-1111,  1113 
defined,  1103 

test  for  correlation  coefficient, 
1110-1111 

Correlation  coefficient,  1 108-1  111, 
1113 

Cosecant,  formula  for,  A65 
Cosine  function: 

conformal  mapping  by,  752 
formula  for,  A63-A65 
Cosine  integral: 
formula  for,  A69 
table,  A98 
Cosine  series,  781 
Cotangent,  formula  for,  A65 
Coulomb,  Charles  Augustin  de, 

19n.6,  93n.7,  401n.6 
Coulomb’s  law,  19,  401 
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Covariance: 

in  correlation  analysis,  1109 
defined,  1058 

Cramer,  Gabriel,  31n.7,  298n.2 
Cramer’s  rule,  292,  298-300,  321 
for  three  equations,  293 
for  two  equations,  292 
Cramer’s  Theorem,  298 
Crank,  John,  938n.5 
Crank-Nicolson  (CN)  method, 
938-941 

Critical  damping,  65,  66 
Critical  points,  33,  165 

asymptotically  stable,  149 
and  conformal  mapping,  738,  757 
constant-coefficient  systems  of 
ODEs,  142-146 
center,  144 
criteria  for,  148-151 
degenerate  node,  145-146 
improper  node,  142 
proper  node,  143 
saddle  point,  143 
spiral  point,  144-145 
stability  of,  149-151 
isolated,  152 
nonlinear  systems,  152 
stable,  140,  149 
stable  and  attractive,  140,  149 
unstable,  140,  149 
Critical  region,  1079 
Cross  product,  368,  410.  See  also 
Vector  product 

Crout,  Prescott  Durand,  853n.2 
Crout’s  method,  853,  898 
Cubic  spline,  821 

Cumulative  absolute  frequencies  (of 
values),  1012 

Cumulative  distribution  functions, 
1029 

Cumulative  relative  frequencies  (of 
values),  1012 
Curl,  A76 

invariance  of,  A85-A88 
of  vector  fields,  406-409,  412 
Curvature,  of  a curve,  389-390 
Curves: 

arc  of,  383 
bell-shaped,  13,  574 
Bezier,  827 
deflection,  120 
elastic,  120 

equipotential,  36,  759,  761 
one-parameter  family  of,  36-37 
operating  characteristic,  1081, 
1092,  1095 
oriented,  644 

orthogonal  coordinate,  A74 
parameter,  442 
plane,  383 


Curves:  ( Cont .) 
regression,  1103 
simple,  383 
simple  closed,  646 
smooth,  414,  644 
solution,  4—6 
twisted,  383 

vector  differential  calculus, 
381-392,  411 
arc  length  of,  385-386 
length  of,  385 
in  mechanics,  386-389 
tangents  to,  384-385 
and  torsion,  389-390 
Curve  fitting,  872-876 

method  of  least  squares,  872-874 
by  polynomials  of  degree  m, 
874-875 

Curvilinear  coordinates,  354,  412,  A74 
Cut  sets,  994-996,  1008 
Cycle  (paths),  976,  984 
Cylindrical  coordinates,  593-594, 
A74-A76 


D’Alembert,  Jean  le  Rond,  554n.l 
D’Alembert’s  solution,  553-556 
Damped  oscillations,  67 
Damping  constant,  65 
Dantzig,  George  Bernard,  959 
Data  processing: 

frequency  distributions, 

1011-1012 

and  randomness,  1064 
Data  representation: 

frequency  distributions, 

1011-1015 
Empirical  Rule,  1014 
graphic,  1012 
mean,  1013-1014 
standard  deviation,  1014 
variation,  1014 
and  randomness,  1064 
Decisions: 

false,  risks  of  making,  1080 
statistics  for,  1077-1078 
Dedekind,  Richard,  A72n.3 
Defect  (eigenvalue),  328 
Defectives,  1092 

Definite  integrals,  complex,  see  Line 
integrals 

Deflection  curve,  120 
Deformation  of  path,  principle  of, 

656 

Degenerate  feasible  solution  (simplex 
method),  962-965 
Degenerate  node,  145-146 
Degrees  of  freedom  (d.f.),  number  of, 
1071,  1074 

Degree  of  incidence,  97 1 


Degree  of  precision  (DP),  833 
Deleted  neighborhood,  720 
Demand  vector,  334 
De  Moivre,  Abraham,  616n.3 
De  Moivre-Laplace  limit  theorem, 
1050 

De  Moivre’s  formula,  616 
De  Morgan's  laws,  1018 
Density,  1061 

continuous  two-dimensional 
distributions,  1053 
of  a distribution,  1033 
Dependent  random  variables,  1055, 
1056 

Dependent  variables,  393,  1055,  1056 
Depth  First  Search  (DFS)  algorithms, 
977 

Derivatives: 

of  analytic  functions,  664-668, 
688-689,  A95-A96 
of  complex  functions,  622,  641 
Laplace  transforms  of,  211-212 
of  matrices  or  vectors,  127 
of  vector  functions,  379-380 
Derived  series,  687 
Descartes,  Rene,  356n.l,  391n.4 
Determinants,  293-301,  321 
Cauchy,  113 
Cramer’s  rule,  298-300 
defined,  A81 

general  properties  of,  295-298 
of  a matrix,  128 
of  matrix  products,  307-308 
of  order  n,  293 
proof  of,  A81-A83 
second-order,  291-292 
second-order  homogeneous  linear 
ODEs,  76 

third-order,  292-293 
Vandermonde,  113 
Wronski: 

second-order  homogeneous 
linear  ODEs,  75-78 
systems  of  ODEs,  139 
Developed,  in  a power  series,  683 
D.f.  (degrees  of  freedom),  number  of, 
1071,  1074 

DFS  (Depth  First  Search)  algorithms, 
977 

DFTs  (discrete  Fourier  transforms), 
528-531 

Diagonalization  of  matrices,  341-342 
Diagonally  dominant  matrices,  881 
Diagonal  matrices,  268 
inverse  of,  305-306 
scalar,  268 

Diameter  (graphs),  991 
Difference: 

complex  numbers,  610 
scalar  multiplication,  260 
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Difference  equations  (elliptic  PDEs), 
923-925 

Difference  quotients,  923 
Difference  table,  814 
Differentiable  complex  functions, 
622-623 

Differentiable  vector  functions,  379 
Differential  (total  differential), 

20,  45 

Differential  equations: 
applications  of,  3 
defined,  2 

Differential  form,  422 
exact,  21,  470 

first  fundamental  form,  of  S,  451 
floating-point,  of  numbers, 
791-792 

path  independence  and  exactness 
of,  422,  470 

Differential  geometry,  381 
Differential  operators: 
second-order,  60 
for  second-order  homogeneous 
linear  ODEs,  60-62 
Differentiation: 

of  Laplace  transforms,  238-240 
matrices  or  vectors,  127 
numeric,  838-839 
of  power  series,  687-688,  703 
termwise,  173,  687-688,  703 
Diffusion  equation,  459-460,  558. 

See  also  Heat  equation 
Digraphs  (directed  graphs),  97 1-972, 

1007 

computer  representation  of, 
972-974 
defined,  972 
incidence  matrix  of,  975 
subgraphs,  972 

Dijkstra,  Edsger  Wybe,  981n.4 
Dijkstra’s  algorithm,  981-983, 

1008 

DIJKSTRA,  ALGORITHM,  982 
Dimension  of  vector  spaces,  286, 

311,  359 
Diodes,  391n.4 
Dirac,  Paul,  226n.2 
Dirac  delta  function,  226-228,  237 
Directed  graphs,  see  Digraphs 
(directed  graphs) 

Directed  path,  1000 
Directional  derivatives  (scalar 
functions),  396-397,  411 
Direction  field  (slope  field),  9-10,  44 
Direct  methods  (linear  system 

solutions),  858,  898.  See  also 
iteration 

Dirichlet,  Peter  Gustav  LeJeune, 
462n.8 

Dirichlet  boundary  condition,  564 


Dirichlet  problem,  605,  923 
ADI  method,  929 
heat  equation,  564-566 
Laplace  equation,  593-596, 
925-928,  934-935 
Poisson  equation,  925-928 
two-dimensional  heat  equation, 
564-565 

uniqueness  theorem  for,  462,  784 
Dirichlet’ s discontinuous  factor,  514 
Discharge  (flow  modeling),  776 
Discrete  distributions,  1029-1032 
marginal  distributions  of, 
1053-1054 

two-dimensional,  1052-1053 
Discrete  Fourier  transforms  (DFTs), 
528-531 

Discrete  random  variables,  1029, 
1030-1032,  1061 
defined,  1030 

marginal  distributions  of,  1054 
Discrete  spectrum,  525 
Disjoint  events,  1016 
Disks: 

circular,  open  and  closed,  619 
mapping,  748-750 
Poisson’s  integral  formula,  779-780 
Dissipative  physical  systems,  422 
Distance: 
graphs,  991 
vector  norms,  866 
Distinct  real  roots: 

higher-order  homogeneous  linear 
ODEs,  112-113 

second-order  homogeneous  linear 
ODEs,  54-55 

Distinct  roots  (Frobenius  method), 

182 

Distributions,  226n.2.  See  also 
Frequency  distributions; 
Probability  distributions 
Distribution-free  tests,  1100 
Distribution  function,  1029-1032 
cumulative,  1029 
normal  distributions,  1046-1047 
of  random  variables,  1056,  A109 
sample,  1096 

two-dimensional  probability 

distributions,  1051-1052 
Distributive  laws,  264 
Distributivity,  363 
Divergence,  A75 
fluid  flow,  775 
of  vector  fields,  402-406 
of  vector  functions,  411,  453 
Divergence  theorem  of  Gauss,  405, 
470 

applications,  458^163 
vector  integral  calculus,  453-457 
Divergent  sequence,  672 


Divergent  series,  171,  673 
Division,  of  complex  numbers,  610, 
615-616 
Domain(s),  393 
bounded,  652 

doubly  connected,  658,  659 
of/,  620 
holes  of,  653 
mapping,  737,  747-750 
multiply  connected: 

Cauchy’s  integral  formula, 
662-663 

Cauchy’s  integral  theorem, 
658-659 

p-fold  connected,  652-653 
sets  in  complex  plane,  620 
simply  connected,  423,  646,  652, 
653 

triply  connected,  653,  658,  659 
Dominant  eigenvalue,  883 
Doolittle,  Myrick  H.,  853n.l 
Doolittle’s  method,  853-855,  898 
Dot  product,  312,  410.  See  also  Inner 
product 

Double  Fourier  series: 
defined,  582 

rectangular  membrane,  577-585 
Double  integrals  (vector  integral 
calculus),  426-432,  470 
applications  of,  428-429 
change  of  variables  in,  429-431 
evaluation  of,  by  two  successive 
integrations,  427-428 
Double  precision,  floating-point 
standard  for,  792 

Double  root  (Frobenius  method),  183 
Double  subscript  notation,  125 
Doubly  connected  domains,  658,  659 
DP  (degree  of  precision),  833 
Driving  force,  see  Input  (driving 
force) 

Duffing  equation,  160 
Duhamel,  Jean-Marie  Constant, 
603n.4 

Duhamel’ s formula,  603 


Eccentricity,  of  vertices,  991 
Edges: 

backward: 

cut  sets,  994 
initial  flow,  998 
of  a path,  992 
forward: 

cut  sets,  994 
initial  flow,  998 
of  a path,  992 
graphs,  971,  1007 
incident,  971 

Edge  chromatic  number,  1006 
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Edge  condition,  991 
Edge  incidence  list  (graphs),  973 
Efficient  algorithms,  979 
Eigenbases,  339-341 
Eigenfunctions,  605 

circular  membrane,  588 
one-dimensional  heat  equation, 
560 

Sturm-Liouville  Problems, 
499-500 

two-dimensional  heat  equation, 
565 

two-dimensional  wave  equation, 
578,  580 

vibrating  string,  547 
Eigenfunction  expansion,  504 
Eigenspaces,  326,  878 
Eigenvalues,  129-130,  166,  353,  605, 
877,  899.  See  also  Matrix 
eigenvalue  problems 
circular  membrane,  588 
complex  matrices,  347-351 
and  critical  points,  149 
defined,  324 
determining,  323-329 
dominant,  883 
finding,  324-328 
one-dimensional  heat  equation, 
560 

Sturm-Liouville  Problems, 
499-500,  A89 

two-dimensional  wave  equation, 
580 

vibrating  string,  547 
Eigenvalues  of  A,  322 
Eigenvalue  problem,  140 
Eigenvectors,  129-130,  166,  353, 

877,  899 

basis  of,  339-340 
convergent  sequence  of,  886 
defined,  324 
determining,  323-329 
finding,  324-328 
Eigenvectors  of  A,  322 
EISPACK,  789 
Elastic  curve,  120 
Electric  circuits: 

analogy  of  electrical  and 

mechanical  quantities, 
97-98 

second-order  nonhomogeneous 
linear  ODEs,  93-99 
Electrostatic  fields  (potential  theory), 
759-763 

complex  potential,  760-761 
superposition,  761-762 
Electrostatic  potential,  759 
Electrostatics  (Laplace’s  equation), 
593 

Elementary  matrix,  281 


Elementary  row  operations  (linear 
systems),  277 

Ellipses,  area  of  region  bounded  by, 
436 

Elliptic  PDEs: 
defined,  923 

numeric  analysis,  922-936 
ADI  method,  928-930 
difference  equations,  923-925 
Dirichlet  problem,  925-928 
irregular  boundary,  933-935 
mixed  boundary  value 

problems,  931-933 
Neumann  problem,  931 
Empirical  Rule,  1014 
Energies,  157 

Entire  function,  630,  642,  707,  718 
Entries: 

determinants,  294 
matrix,  125,  257 
Equal  complex  numbers,  609 
Equality: 

of  matrices,  126,  259 
of  vectors,  355 
Equally  likely  events,  1018 
Equal  spacing  (interpolation): 
Newton’s  backward  difference 
formula,  818-819 
Newton’s  forward  difference 
formula,  815-818 
Equilibrium  harvest,  36 
Equilibrium  solutions  (equilibrium 
points),  33-34 

Equipotential  curves,  36,  759,  761 
Equipotential  lines,  38 

electrostatic  fields,  759,  761 
fluid  flow,  77 1 
Equipotential  surfaces,  759 
Equivalent  vector  norms,  871 
Error(s): 

in  acceptance  sampling, 
1093-1094 

of  approximations,  495 
in  numeric  analysis,  842 
basic  error  principle,  796 
error  propagation,  795 
errors  of  numeric  results, 
794-795 
roundoff,  792 

in  statistical  tests,  1080-1081 
and  step  size  control,  906-907 
trapezoidal  rule,  830 
vector  norms,  866 
Error  bounds,  795 
Error  estimate,  908 
Error  function,  828,  A67-A68,  A98 
Essential  singularity,  715-716 
Estimation  of  parameters,  1063 
EULER,  ALGORITHM,  903 
Euler,  Leonhard,  3 In. 7,  71n,4 


Euler-Cauchy  equations,  71-74, 

104 

higher-order  nonhomogeneous 
linear  ODEs,  119-120 
Laplace’s  equation,  595 
third-order,  I VP  for,  108 
Euler-Cauchy  method,  901 
Euler  constant,  198 
Euler  formulas,  58 

complex  Fourier  integral,  523 
derivation  of,  479-480 
exponential  function,  631 
Fourier  coefficients  given  by,  476, 
484 

generalized,  582 
Taylor  series,  694 
trigonometric  function,  634 
Euler  graph,  980 
Euler’s  method: 
defined,  10 

error  of,  901-902,  906,  908 
first-order  ODEs,  10-11,  901-902 
backward  method,  909-910 
improved  method,  902-904 
higher  order  ODEs,  916-917 
Euler  trail,  980 
Even  functions,  486-488 
Even  periodic  extension,  488-490 
Events  (probability  theory), 
1016-1017,  1060 
addition  rule  for,  1021-1022 
arbitrary,  1021-1022 
complements  of,  1016 
defined,  1015 
disjoint,  1016 
equally  likely,  1018 
independent,  1022-1023 
intersection,  1016,  1017 
mutually  exclusive,  1016,  1021 
simple,  1015 
union,  1016-1017 
Exact  differential  equation,  21 
Exact  differential  form,  422,  470 
Exact  ODEs,  20-27,  45 
defined,  21 

integrating  factors,  23-26 
Existence,  problem  of,  39 
Existence  theorems: 
cubic  splines,  822 
first-order  ODEs,  39-42 
homogeneous  linear  ODEs: 
higher-order,  108 
second-order,  74 
of  the  inverse,  301-302 
Laplace  transforms,  209-210 
linear  systems,  138 
power  series  solutions,  172 
systems  of  ODEs,  137 
Expectation,  1035,  1037-1038, 

1057 
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Experiments: 

defined,  1015,  1060 
in  probability  theory,  1015-1016 
random,  1011,  1015-1016,  1060 
Experimental  error,  794 
Explicit  formulas,  913 
Explicit  method: 

heat  equation,  937,  940-941 
wave  equation,  943 
Explicit  solution,  21 
Exponential  decay,  5,  7 
Exponential  function,  630-633,  642 
formula  for,  A63 
Taylor  series,  694 
Exponential  growth,  5 
Exponential  integral,  formula  for,  A69 
Exposed  vertices,  1001,  1003 
Extended  complex  plane: 

conformal  mapping,  744—745 
defined,  718 

Extended  method  (separable  ODEs), 
17-18 

Extended  problems,  966 
Extrapolation,  808 
Extrema  (unconstrained 
optimization),  951 


Factorial  function,  1027,  A66,  A98. 

See  also  Gamma  functions 
Failing  to  reject  a hypothesis,  1081 
Fair  die,  1018,  1019 
False  decisions,  risks  of  making, 

1080 

False  position,  method  of,  807-808 
Family  of  curves,  one-parameter, 
36-37 

Family  of  solutions,  5 
Faraday,  Michael,  93n.7 
Fast  Fourier  transforms  (FFTs), 
531-532 

F-distribution,  1086,  A105-A108 
Feasibility  region,  954 
Feasible  solutions,  954-955 
basic,  957,  959 
degenerate,  962-965 
normal  form  of  linear  optimization 
problems,  957 
Fehlberg,  E.,  907 
Fehlberg’s  fifth-order  RK  method, 
907-908 

Fehlberg’s  fourth-order  RK  method, 
907-908 

FFTs  (fast  Fourier  transforms), 
531-532 

Fibonacci  (Leonardo  of  Pisa),  690n.2 
Fibonacci  numbers,  690 
Fibonacci’s  rabbit  problem,  690 
Finite  complex  plane,  718.  See  also 
Complex  plane 


Finite  jumps,  209 
First  boundary  value  problem,  see 
Dirichlet  problem 
First  fundamental  form,  of  S,  451 
First-order  method,  Euler  method  as, 
902 

First-order  ODEs,  2-45,  44 
defined,  4 

direction  fields,  9-10 
Euler’s  method,  10-11 
exact,  20-27,  45 
defined,  21 

integrating  factors,  23-26 
explicit  form,  4 
geometric  meanings  of,  9-12 
implicit  form,  4 
initial  value  problem,  38 — 43 
linear,  27-36 

Bernoulli  equation,  31-33 
homogeneous,  28 
nonhomogeneous,  28-29 
population  dynamics,  33-34 
modeling,  2-8 
numeric  analysis,  901-915 
Adams-Bashforth  methods, 
911-914 

Adams-Moulton  methods, 
913-914 

backward  Euler  method, 
909-910 

Euler’s  method,  901-902 
improved  Euler’s  method, 
902-904 

multistep  methods,  911-915 
Runge-Kutta-Fehlberg 
method,  906-908 
Runge-Kutta  methods, 
904-906 

orthogonal  trajectories,  36-38 
separable,  12-20,  44 

extended  method,  17-18 
modeling,  13-17 
systems  of,  165 
transformation  of  systems  to, 
157-159 

First  (first  order)  partial  derivatives, 
A71 

First  shifting  theorem  (i-shifting), 
208-209 

First  transmission  line  equation,  599 
Fisher,  Sir  Ronald  Aylmer,  1086 
Fixed  points: 
defined,  799 
of  a mapping,  745 
Fixed-point  iteration  (numeric 
analysis),  798-801,  842 
Fixed-point  systems,  numbers  in,  791 
Floating,  793 

Floating-point  form  of  numbers, 
791-792 


Flow  augmenting  paths,  992-993, 
998,  1008 

Flow  problems  in  networks 

(combinatorial  optimization), 
991-997 

cut  sets,  994-996 
flow  augmenting  paths,  992-993 
paths,  992 
Fluid  flow: 

Laplace’s  equation,  593 
potential  theory,  771-777 
Fluid  state,  404 
Flux  (motion  of  a fluid),  404 
Flux  integral,  444,  450 
Forced  motions,  68,  86 
Forced  oscillations: 

Fourier  analysis,  492^195 
second-order  nonhomogeneous 
linear  ODEs,  85-92 
damped,  89-90 
resonance,  88-91 
undamped,  87-89 
Forcing  function,  86 
Ford,  Lester  Randolph,  Jr.,  998n.7 
FORD-FULKERSON, 

ALGORITHM,  998 
Ford-Fulkerson  algorithm  for 

maximum  flow,  998-1001, 
1008 

Forest  (graph),  987 
Form(s): 

canonical,  344 
complex,  351 
differential,  422 
exact,  21,  470 
path  independence  and 
exactness  of,  422 
Hesse’s  normal,  366 
Lagrange’s,  812 
normal  (linear  optimization 

problems),  955-957,  959, 
969 

Pfaffian,  422 

polar,  of  complex  numbers, 
613-618,  631 
quadratic,  343-344,  346 
reduced  echelon,  279 
row  echelon,  279-280 
skew-Hermitian  and  Hermitian, 
351 

standard: 

first-order  ODEs,  27 
higher-order  homogeneous 
linear  ODEs,  105 
higher-order  linear  ODEs,  123 
power  series  method,  172 
second-order  linear  ODEs,  46, 
103 

triangular  (Gauss  elimination), 
846 


no 
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Forward  edge: 
cut  sets,  994 
initial  flow,  998 
of  a path,  992 
Four-color  theorem,  1006 
Fourier,  Jean-Baptiste  Joseph,  473n.  1 
Fourier  analysis,  473-539 

approximation  by  trigonometric 
polynomials,  495-498 
forced  oscillations,  492-495 
Fourier  integral,  510-517 
applications,  513-515 
complex  form  of,  522-523 
sine  and  cosine,  515-516 
Fourier  series,  474-483 
convergence  and  sum  of, 
480-481 

derivation  of  Euler  formulas, 
479-480 

even  and  odd  functions, 
486-488 

half-range  expansions,  488-490 
from  period  2tt  to  2 L, 

483-486 

Fourier  transforms,  522-536 
complex  form  of  Fourier 
integral,  522-523 
convolution,  527-528 
cosine,  518-522,  534 
discrete,  528-531 
fast,  531-532 
and  its  inverse,  523-524 
linearity,  526-527 
sine,  518-522,  535 
spectrum  representation,  525 
orthogonal  series  (generalized 
Fourier  series),  504—510 
completeness,  508-509 
mean  square  convergence, 
507-508 

Sturm-Liouville  Problems, 
498-504 

eigenvalues,  eigenfunctions, 
499-500" 

orthogonal  functions,  500-503 
Fourier-Bessel  series,  506-507,  589 
Fourier  coefficients,  476,  484,  538, 
582-583 

Fourier  constants,  504-505 
Fourier  cosine  integral,  515-516 
Fourier  cosine  series,  484,  486,  538 
Fourier  cosine  transforms,  518-522, 
534 

Fourier  cosine  transform  method,  518 
Fourier  integrals,  510-517,  539 
applications,  513-515 
complex  form  of,  522-523 
heat  equation,  568-57 1 
residue  integration,  729-730 
sine  and  cosine,  515-516 


Fourier-Legendre  series,  505-506, 
596-598 

Fourier  matrix,  530 
Fourier  series,  473-483,  538 

convergence  and  sum  or,  480-481 
derivation  of  Euler  formulas, 
479-480 
double,  577-585 

even  and  odd  functions,  486-488 
half-range  expansions,  488^190 
heat  equation,  558-563 
from  period  27 r to  2 L,  483-486 
Fourier  sine  integral,  515-516 
Fourier  sine  series,  477,  486,  538 
one-dimensional  heat  equation, 
561 

vibrating  string,  548 
Fourier  sine  transforms,  518-522, 

535 

Fourier  transforms,  522-536,  539 
complex  form  of  Fourier  integral, 
522-523 

convolution,  527-528 
cosine,  518-522,  534,  539 
defined,  522,  523 
discrete,  528-531 
fast,  531-532 
heat  equation,  571-574 
and  its  inverse,  523-524 
linearity  of,  526-527 
sine,  518-522,  535,  539 
spectrum  representation,  525 
Fourier  transform  method,  524 
Four-point  formulas,  841 
Fraction  defective  chars,  1091-1092 
Francis,  J.  G.  F.,  892 
Fredholm,  Erik  Ivar,  198n.7,  263n.3 
Free  condition  (spline  interpolation), 
823 

Free  oscillations  of  mass-spring 

system  (second-order  ODEs), 
62-70 

critical  damping,  65,  66 
damped  system,  64-65 
overdamping,  65-66 
undamped  system,  63-64 
underdamping,  65,  67 
Frenet,  Jean-Frederic,  392 
Frenet  formulas,  392 
Frequency  (in  statistics): 
absolute,  1012,  1019 
cumulative  absolute,  1012 
cumulative  relative,  1012 
relative  class,  1012 
Frequency  (of  vibrating  string),  547 
Frequency  distributions,  mean  and 
variance  of: 

expectation,  1037-1038 
moments,  1038 
transformation  of,  1036-1037 


Fresnel,  Augustin,  697n.4,  A68n.l 
Fresnel  integrals,  697,  A68 
Frobenius,  Georg,  180n.4 
Frobenius  method,  167,  180-187, 

201 

indicial  equation,  181-183 
proof  of,  A77-A81 
typical  applications,  183-185 
Frobenius  norm,  861 
Fulkerson.  Delbert  Ray,  998n.7 
Function,  of  complex  variable, 
620-621 

Function  spaces,  313 
Fundamental  matrix,  139 
Fundamental  period,  475 
Fundamental  region  (exponential 
function),  632 

Fundamental  system,  50,  104.  See 
also  Basis,  of  solutions 
Fundamental  Theorem: 

higher-order  homogeneous  linear 
ODEs,  106 

for  linear  systems,  288 
PDEs,  541-542 

second-order  homogeneous  linear 
ODEs,  48 


Galilei,  Galileo,  16n.4 
Gamma  functions,  190-191,  208 
formula  for,  A66-A67 
incomplete,  A67 
table,  A98 

GAMS  (Guide  to  Available 

Mathematical  Software),  789 
GAUSS,  ALGORITHM,  849 
Gauss,  Carl  Friedrich,  186n,5, 
608n.l,  1103 

Gauss  distribution,  1045.  See  also 
Normal  distributions 
Gauss  "Double  Ring,”  45 1 
Gauss  elimination,  320,  849 
linear  systems,  274-280, 
844-852,  898 

back  substitution,  274-276, 
846 

elementary  row  operations, 
277 

if  infinitely  many  solutions 
exist,  278 

if  no  solution  exists,  278-279 
operation  count,  850-851 
row  echelon  form,  279-280 
operation  count,  850-85 1 
Gauss  integration  formulas,  807, 
836-838,  843 

Gauss-Jordan  elimination,  302-304, 
856-857 

GAUSS-SEIDEL,  ALGORITHM, 
860 
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Gauss-Seidel  iteration,  858-863, 

898 

Gauss’s  hypergeometric  ODE,  186, 
202 

Geiger,  H„  1044,  1100 
Generalized  Euler  formula,  582 
Generalized  Fourier  series,  see 
Orthogonal  series 
Generalized  solution  (vibrating 
string),  550 

Generalized  triangle  inequality,  615 
General  powers,  639-640,  642 
General  solution: 

Bessel’s  equation,  194-200 
first-order  ODEs,  6,  44 
higher-order  linear  ODEs,  106, 
110-111,  123 

nonhomogeneous  linear  systems, 
160 

second-order  linear  ODEs: 
homogeneous,  49-51,  77-78, 
104 

nonhomogeneous,  80-81 
systems  of  ODEs,  131-132,  139 
Generating  functions,  179,  241 
Geometric  interpretation: 
partial  derivatives,  A70 
scalar  triple  product,  373,  374 
Geometric  multiplicity,  326,  878 
Geometric  series,  168,  675 
Taylor  series,  694 
uniformly  convergent,  698 
Gerschgorin,  Semyon  Aranovich, 
879n.6 

Gerschgorin’ s theorem,  879-881,  899 
Gibbs  phenomenon,  515 
Global  error,  902 
Golden  Rule,  15,  24 
Gompertz  model,  19 
Goodness  of  fit,  1096-1100 
Gosset,  William  Sealy,  1086n.4 
Goursat,  Edouard,  654n.l 
Goursat’s  proof,  654 
Gradient,  A75 
fluid  flow,  771 
of  a scalar  field,  395-402 
directional  derivatives, 

396-397 

maximum  increase,  398 
as  surface  normal  vector, 
398-399 

vector  fields  that  are,  400-401 
of  a scalar  function,  396,  411 
unconstrained  optimization,  952 
Gradient  method,  952.  See  also 
Method  of  steepest  descent 
Graphs,  970-971,  1007 

bipartite,  1001-1006,  1008 
center  of,  991 
complete,  974 


Graphs  ( Cont .) 

complete  bipartite,  1005 
computer  representation  of, 
972-974 

connected,  977,  981,  984 
diameter  of,  991 
digraphs  (directed  graphs), 
971-974,  1007 
computer  representation  of, 
972-974 
defined,  972 
incidence  matrix  of,  975 
subgraphs,  972 
Euler,  980 
forest,  987 

incidence  matrix  of,  975 
planar,  1005 
radius  of,  991 
sparse,  974 
subgraphs,  972 
trees,  984 

vertices,  971,  977,  1007 
adjacent,  971,  977 
central,  991 
coloring,  1005-1006 
double  labeling  of,  986 
eccentricity  of,  991 
exposed,  1001,  1003 
four-color  theorem,  1006 
scanning,  998 
weighted,  976 

Graphic  data  representation,  1012 
Gravitation  (Laplace’s  equation), 
593 

Gravity,  acceleration  of,  8 
Gravity  constant,  at  the  Earth's 
surface,  63 

Greedy  algorithm,  984-988 
Green,  George,  433n.4 
Green’s  first  formula,  461,  470 
Green’s  second  formula,  461,  470 
Green’s  theorem: 

first  and  second  forms  of, 

461 

in  the  plane,  433-438,  470 
Gregory,  James,  816n.2 
Gregory-Newton’s  (Newton’s) 
backward  difference 
interpolation  formula, 
818-819 

Gregory-Newton’s  (Newton’s) 
forward  difference 
interpolation  formula, 
815-818 

Growth  restriction,  209 
Guidepoints,  827 
Guide  to  Available  Mathematical 
Software  (GAMS),  789 
Guldin,  Habakuk,  452n.7 
Guidin’ s theorem,  452n.7 


Hadamard,  Jacques,  683n.l 
Half-planes: 

complex  analysis,  619-620 
mapping,  747-749 
Half-range  expansions  (Fourier 
series),  488^190,  538 
Hamilton,  William  Rowan,  976n.l 
Hamiltonian  cycle,  976 
Hankel,  Hermann,  200n.8 
Hankel  functions,  200 
Harmonic  conjugate  function 

(Laplace’s  equation),  629 
Harmonic  functions,  460,  462,  758 
complex  analysis,  628-629 
under  conformal  mapping,  763 
defined,  758 

Laplace’s  equation,  593,  628-629 
maximum  modulus  theorem, 
783-784 

potential  theory,  781-784,  786 
Harmonic  oscillation,  63-64 
Heat  equation,  459-460,  557-558 
Dirichlet  problem,  564-566 
Laplace’s  equation,  564 
numeric  analysis,  936-941,  948 
Crank-Nicolson  method, 
938-941 

explicit  method,  937,  940-941 
one-dimensional,  559 
solution: 

by  Fourier  integrals,  568-571 
by  Fourier  series,  558-563 
by  Fourier  transforms, 

571-574 

steady  two-dimensional  heat 
problems,  546-566 
two-dimensional,  564—566 
unifying  power  of  methods,  566 
Heat  flow: 

Laplace’s  equation,  593 
potential  theory,  767-770 
Heat  flow  lines,  767 
Heaviside,  Oliver,  204n.l 
Heaviside  calculus,  204n.l 
Heaviside  expansions,  228 
Heaviside  function,  217-219 
Helix,  386 

Henry,  Joseph,  93n.7 
Hermite,  Charles,  510n.8 
Hermite  interpolation,  826 
Hermitian  form,  351 
Hermitian  matrices,  347,  348,  350,  353 
Hertz,  Heinrich,  63n.3 
Hesse,  Ludwig  Otto,  366n.2 
Hesse’s  normal  form,  366 
Heun,  Karl,  905n.l 
Heun’s  method,  903.  See  also 
Improved  Euler’s  method 
Higher  functions,  167.  See  also 
Special  functions 
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Higher-order  linear  ODEs,  105-123 
homogeneous,  105-116,  123 
nonhomogeneous,  116-123 
systems  of,  see  Systems 
of  ODEs 

Higher  order  ODEs  (numeric 
analysis),  915-922 
Euler  method,  916-917 
Runge-Kutta  methods,  917-919 
Runge-Kutta-Ny strom  methods, 

“ 919-921 

Higher  transcendental  functions,  920 
High-frequency  line  equations,  600 
Hilbert,  David,  198n.7,  312n.4 
Hilbert  spaces,  363 
Histograms,  1012 
Holes,  of  domains,  653 
Homogeneous  first-order  linear 
ODEs,  28 

Homogeneous  higher-order  linear 
ODEs,  105-111 

Homogeneous  linear  systems,  138, 
165,  272,  290-291,  845 
constant-coefficient  systems, 
140-151 

matrices  and  vectors,  124—130,  321 
trivial  solution,  290 
Homogeneous  PDEs,  541 
Homogeneous  second-order  linear 
ODEs,  46-48 
basis,  50-52 

with  constant  coefficients,  53-60 
complex  roots,  57-59 
real  double  root,  55-56 
two  distinct  real-roots,  54-55 
differential  operators,  60-62 
Euler-Cauchy  equations,  71-74 
existence  and  uniqueness  of 
solutions,  74—79 
general  solution,  49-51,  77-78 
initial  value  problem,  49-50 
modeling  free  oscillations  of 

mass-spring  system,  62-70 
particular  solution,  49-51 
reduction  of  order,  51-52 
Wronskian,  75-78 
Hooke,  Robert,  62 
Hooke’s  law,  62 

Householder,  Alston  Scott,  888n.  1 1 
Householder’ s tridiagonalization 
method,  888-892 
Hyperbolic  analytic  functions 

(conformal  mapping),  750-754 
Hyperbolic  cosine,  635,  752 
Hyperbolic  functions,  635,  642 
formula  for,  A65-A66 
inverse,  640 
Taylor  series,  695 
Hyperbolic  PDEs: 
defined,  923 

numeric  analysis,  942-945 


Hyperbolic  sine,  635,  752 
Hypergeometric  distributions, 
1042-1044,  1061 
Hypergeometric  equations,  167, 
185-187 

Hypergeometric  functions,  167,  186 
Hypergeometric  series,  186 
Hypothesis,  1077 
Hypothesis  testing  (in  statistics), 
1063,  1077-1087 
comparison  of  means,  1084-1085 
comparison  of  variances,  1086 
errors  in  tests,  1080-1081 
for  mean  of  normal  distribution 
with  known  variance, 
1081-1083 

for  mean  of  normal  distribution 
with  unknown  variance, 
1083-1084 

one-  and  two-sided  alternatives, 
1079-1080 


Idempotent  matrices,  270 
Identity  mapping,  745 
Identity  matrices,  268 
Identity  operator  (second-order 

homogeneous  linear  ODEs),  60 
Ill-conditioned  equations,  805 
Ill-conditioned  problems,  864 
Ill-conditioned  systems,  864,  865, 

899 

Ill-conditioning  (linear  systems), 
864-872 

condition  number  of  a matrix, 
868-870 

matrix  norms,  866-868 
vector  norms,  866 
Image: 

conformal  mapping,  737 
linear  transformations,  313 
Imaginary  axis  (complex  plane),  611 
Imaginary  part  (complex  numbers), 
609 

Imaginary  unit,  609 
Impedance  (RLC  circuits),  95 
Implicit  formulas,  913 
Implicit  method: 

backward  Euler  scheme  as,  909 
for  hyperbolic  PDEs,  943 
Implicit  solution,  21 
Improper  integrals: 
defined,  205 

residue  integration,  726-732 
Improper  node,  142 
Improved  Euler’s  method: 
error  of,  904,  906,  908 
first-order  ODEs,  902-904 
Impulse,  of  a force,  225 
short  impulses,  225-226 
unit  impulse  function,  226 


Incidence  matrices  (graphs  and 
digraphs),  975 
Incident  edges,  971 
Inclusion  theorems: 
defined,  882 

matrix  eigenvalue  problems, 
879-884 

Incomplete  gamma  functions, 
formula  for,  A67 
Inconsistent  linear  systems,  277 
Indefinite  (quadratic  form),  346 
Indefinite  integrals: 
defined,  643 
existence  of,  656-658 
Indefinite  integration  (complex  line 
integral),  646-647 
Independence: 
of  path,  669 

of  path  in  domain  (integrals),  470, 
655 

of  random  variables,  1055-1056 
Independent  events,  1022-1023, 

1061 

Independent  sample  values,  1064 
Independent  variables: 
in  calculus,  393 
in  regression  analysis,  1 103 
Indicial  equation,  181-183,  188,  202 
Indirect  methods  (solving  linear 
systems),  858,  898 
Inference,  statistical,  1059,  1063 
Infinite  dimensional  vector  space, 

311 

Infinite  populations,  1044 
Infinite  sequences: 
bounded,  A93-A95 
monotone  real,  A72-A73 
power  series,  671-673 
Infinite  series,  673-674 
Infinity: 

analytic  of  singular  at,  718-719 
point  at,  718 
Initial  conditions: 

first-order  ODEs,  6,  7,  44 
heat  equation,  559,  568,  569 
higher-order  linear  ODEs: 
homogeneous,  107 
nonhomogeneous,  117 
one-dimensional  heat  equation, 
559 

PDEs,  541,  605 

second-order  homogeneous  linear 
ODEs,  49-50,  104 
systems  of  ODEs,  137 
two-dimensional  wave  equation, 
577 

vibrating  string,  545 
Initial  point  (vectors),  355 
Initial  value  problem  (IVP): 
defined,  6 

first-order  ODEs,  6,  39,  44,  901 
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Initial  value  problem  (IVP):  ( Cont .) 
bell-shaped  curve,  13 
existence  and  uniqueness  of 
solutions  for,  38-43 
higher-order  linear  ODEs,  123 
homogeneous,  107-108 
nonhomogeneous,  117 
Laplace  transforms,  213-216 
for  RLC  circuit,  99 
second-order  homogeneous  linear 
ODEs,  49,  74-75,  104 
systems  of  ODEs,  137 
Injective  mapping,  737n.l 
Inner  product  (dot  product),  312 
for  complex  vectors,  349 
invariance  of,  336 
vector  differential  calculus, 
361-367,  410 
applications,  364-366 
orthogonality,  361-363 
Inner  product  spaces,  311-313 
Input  (driving  force),  27,  86,  214 
Instability,  numeric  vs.  mathematical, 
796 

Integrals,  see  Line  integrals 
Integral  equations: 
defined,  236 

Laplace  transforms,  236-237 
Integral  of  a function,  Laplace 
transforms  of,  212-213 
Integral  transforms,  205,  518 
Integrand,  414,  644 
Integrating  factors,  23-26,  45 
defined,  24 
finding,  24-26 

Integration.  See  also  Complex 
integration 
constant  of,  18 

of  Laplace  transforms,  238-240 
numeric,  827-838 
adaptive,  835-836 
Gauss  integration  formulas, 
836-838 

rectangular  rule,  828 
Simpson's  rule,  831-835 
trapezoidal  rule,  828-83 1 
termwise,  of  power  series,  687, 
688 

Intermediate  value  theorem,  807-808 
Intermediate  variables,  393 
Intermittent  harvesting,  36 
INTERPOL,  ALGORITHM,  814 
Interpolation,  529 
defined,  808 

numeric  analysis,  808-820,  842 
equal  spacing,  815-819 
Lagrange,  809-812 
Newton’s  backward  difference 
formula,  818-819 
Newton’s  divided  difference, 
812-815 


Interpolation  (Cont.) 

Newton’s  forward  difference 
formula,  815-818 
spline,  820-827 

Interpolation  polynomial,  808,  842 
Interquartile  range,  1013 
Intersection,  of  events,  1016,  1017 
Intervals.  See  also  Confidence 
intervals 
class,  1012 
closed,  A72n.3 
convergence,  171,  683 
open,  4,  A72n.3 
Interval  estimates,  1065 
Invariance,  of  curl,  A85-A88 
Invariant  rank,  283 
Invariant  subspace,  878 
Inverse  cosine,  640 
Inverse  cotangent,  640 
Inverse  Fourier  cosine  transform,  518 
Inverse  Fourier  sine  transform,  519 
Inverse  Fourier  sine  transform 
method,  519 

Inverse  Fourier  transform,  524 
Inverse  hyperbolic  function,  640 
Inverse  hyperbolic  sine,  640 
Inverse  mapping,  741,  745 
Inverse  of  a matrix,  128,  301-309, 

321 

cancellation  laws,  306-307 
determinants  of  matrix 
products,  307-308 
formulas  for,  304-306 
Gauss-Jordan  method, 

302-304,  856-857 
Inverse  sine,  640 
Inverse  tangent,  640 
Inverse  transform,  205,  253 
Inverse  transformation,  315 
Inverse  trigonometric  function,  640 
Irreducible,  883 

Irregular  boundary  (elliptic  PDEs), 
933-935 

Irrotational  flow,  774 
Isocline,  10 

Isolated  critical  point,  152 
Isolated  essential  singularity,  715 
Isolated  singularity,  715 
Isotherms,  36,  38,  402,  767 
Iteration  (iterative)  methods: 
numeric  analysis,  798-808 
fixed-point  iteration,  798-801 
Newton’s  (Newton-Raphson) 
method,  801-805 
secant  method,  805-806 
speed  of  convergence,  804-805 
numeric  linear  algebra,  858-864, 
898 

Gauss-Seidel  iteration,  858-862 
Jacobi  iteration,  862-863 
IVP,  see  Initial  value  problem 


Jacobi,  Carl  Gustav  Jacob,  430n.3 
Jacobians,  430,  741 
Jacobi  iteration,  862-863 
Jordan,  Wilhelm,  302n.3 
Joukowski  airfoil,  739-740 


Kantorovich,  Leonid  Vitaliyevich, 
959n.l 

KCL  (Kirchhoff’s  Current  Law), 
93n.7,  274 
Kernel,  205 

Kinetic  friction,  coefficient  of,  19 
Kirchhoff,  Gustav  Robert,  93n.7 
Kirchhoff’s  Current  Law  (KCL), 
93n.7,  274 
Kirchhoff’s  law,  991 
Kirchhoff’s  Voltage  Law  (KVL),  29, 
93,  274 

Koopmans,  Tjalling  Charles,  959n.  1 
Kreyszig,  Erwin,  855n.3 
Rronecker,  Leopold,  500n,5 
Kronecker  delta,  A85 
Rronecker  symbol,  500 
Kruskal,  Joseph  Bernard,  985n.5 
KRUSKAL,  ALGORITHM,  985 
Kruskal’ s Greedy  algorithm, 
984-988,  1008 
kth  backward  difference,  818 
kth  central  moment,  1038 
kth  divided  difference,  813 
kth  forward  difference,  815-816 
kth  moment,  1038,  1065 
Kublanovskaya,  V.  N.,  892 
Kutta,  Wilhelm,  905n.l 
Kutta’s  third-order  method,  911 
KVL,  see  Kirchhoff’s  Voltage  Law 


Lagrange,  Joseph  Louis,  51n.l 
Lagrange  interpolation,  809-812 
Lagrange’s  form,  812,  842 
Laguerre,  Edmond,  504n.7 
Laguerre  polynomials,  241,  504 
Laguerre’s  equation,  240-241 
LAPACK,  789 

Laplace,  Pierre  Simon  Marquis  de, 
204n.  1 

Laplace  equation,  400,  564,  593-600, 
642,  923 

boundary  value  problem  in 
spherical  coordinates, 
594-596 

complex  analysis,  628-629 
in  cylindrical  coordinates, 

593-594 

Fourier-Legendre  series,  596-598 
heat  equation,  564 
numeric  analysis,  922-936,  948 
ADI  method,  928-930 
difference  equations,  923-925 
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Laplace  equation  (Cont.) 

Dirichlet  problem,  925-928, 
934-935 

Liebmann’s  method,  926-928 
in  spherical  coordinates,  594 
theory  of  solutions  of,  460,  786. 

See  also  Potential  theory 
two-dimensional  heat  equation, 
564 

two-dimensional  problems,  759 
uniqueness  theorem  for,  462 
Laplace  integrals,  516 
Laplace  operator,  401.  See  also 
Laplacian 

Laplace  transforms,  203-253 
convolution,  232-237 
defined,  204,  205 
of  derivatives,  211-212 
differentiation  of,  238-240 
Dirac  delta  function,  226-228 
existence,  209-210 
first  shifting  theorem  (s-shifting), 
208-209 

general  formulas,  248 
initial  value  problems,  213-216 
integral  equations,  236-237 
of  integral  of  a function,  212-213 
integration  of,  238-240 
linearity  of,  206-208 
notation,  205 

ODEs  with  variable  coefficients, 
240-241 

partial  differential  equations, 
600-603 

partial  fractions,  228-230 
second  shifting  theorem 

(f-shifting),  219-223 
short  impulses,  225-226 
systems  of  ODEs,  242-247 
table  of,  249-251 
uniqueness,  210 
unit  step  function  (Heaviside 
function),  217-219 
Laplacian,  400,  463,  605,  A76 
in  cylindrical  coordinates, 
593-594 

heat  equation,  557 
Laplace’s  equation,  593 
in  polar  coordinates,  585-592 
in  spherical  coordinates,  594 
of  u in  polar  coordinates,  586 
Lattice  points,  925-926 
Laurent,  Pierre  Alphonse,  708n.l 
Laurent  series,  708-719,  734 
analytic  or  singular  at  infinity, 
718-719 

point  at  infinity,  718 
Riemann  sphere,  718 
singularities,  715-717 
zeros  of  analytic  functions,  717 


Laurent’s  theorem,  709 
LCL  (lower  control  limit),  1088 
Least  squares  approximation,  of  a 
function,  875-876 

Least  squares  method,  872-876,  899 
Least  squares  principle,  1103 
Lebesgue,  Henri,  876n.5 
Left-handed  Cartesian  coordinate 
system,  369,  370,  A84 
Left-hand  limit  (Fourier  series),  480 
Left-sided  tests,  1079,  1082 
Legendre,  Adrien-Marie,  175n.l, 

1103 

Legendre  function,  175 
Legendre  polynomials,  167,  177-179, 
202 

Legendre’s  equation,  167,  175-  177, 

201,  202 

Laplace’s  equation,  595-596 
special,  169-170 

Leibniz,  Gottfried  Wilhelm,  15n.3 
Leibniz  test  for  real  series,  A73-A74 
Length: 

curves,  385 
vectors,  355,  356,  410 
Leonardo  of  Pisa,  690n.2 
Leontief,  Wassily,  334n.l 
Leontief  input-output  model,  334 
Leslie  model,  331 
Level  surfaces,  380,  398 
LFTs,  see  Linear  fractional 
transformations 
Libby,  Willard  Frank,  13n.2 
Liebmann’s  method,  926-928 
Likelihood  function,  1066 
Limit  (sequences),  672 
Limit  cycle,  158-159,  621 
Limit  Z,  378 
Limit  point,  A93 
Limit  vector,  378 
Linear  algebra,  255.  See  also 
Numeric  linear  algebra 
determinants,  293-301 
Cramer’s  rule,  298-300 
general  properties  of,  295-298 
of  matrix  products,  307-308 
second-order,  291-292 
third-order,  292-293 
inverse  of  a matrix,  301-309 
cancellation  laws,  306-307 
determinants  of  matrix 
products,  307-308 
formulas  for,  304-306 
Gauss-Jordan  method, 

302-304 

linear  systems,  272-274 
back  substitution,  274—276 
elementary  row  operations,  277 
Gauss  elimination,  274-280 
homogeneous,  290-291 


Linear  algebra  {Cont.) 

nonhomogeneous,  291 
solutions  of,  288-291 
matrices  and  vectors,  257-262 
addition  and  scalar 

multiplication  of, 
259-261 

diagonal  matrices,  268 
linear  independence  and 

dependence  of  vectors, 
282-283 

matrix  multiplication, 

263-266,  269-279 
notation,  258 
rank  of,  283-285 
symmetric  and  skew-symmetric 
matrices,  267-268 
transposition  of,  266-267 
triangular  matrices,  268 
matrix  eigenvalue  problems, 
322-353 

applications,  329-334 
complex  matrices  and  forms, 
346-352 

determining  eigenvalues  and 
eigenvectors,  323-329 
diagonalization  of  matrices, 
341-342 

eigenbases,  339-341 
orthogonal  matrices,  337-338 
orthogonal  transformations,  336 
quadratic  forms,  343-344 
symmetric  and  skew- 

symmetric  matrices, 
334-336 

transformation  to  principal 
axes,  344 
vector  spaces: 

inner  product  spaces,  311-313 
linear  transformations, 

313-317 
real,  309-311 
special,  285-287 
Linear  combination: 

homogeneous  linear  ODEs: 
higher-order,  107 
second-order,  48 
of  matrices,  129,  271 
of  vectors,  129,  282 
of  vectors  in  vector  space,  311 
Linear  dependence,  of  vectors, 
282-283 

Linear  element,  386 
Linear  equations,  systems  of,  see 
Linear  systems 

Linear  fractional  transformations 
(LFTs),  742-750,  757 
extended  complex  plane,  744-745 
mapping  standard  domains, 
747-750 
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Linear  independence: 

scalar  triple  product,  373 
of  vectors,  282-283 
Linear  inequalities,  954 
Linear  interpolation,  809-810 
Linearity: 

Fourier  transforms,  526-527 
Laplace  transforms,  206-208 
line  integrals,  645 

Linearity  principle,  see  Superposition 
principle 

Linearization,  152-155 
Linearized  system,  153 
Linearly  dependent  functions: 

higher-order  homogeneous  linear 
ODEs,  106,1 09 

second-order  homogeneous  linear 
ODEs,  50,  75 

Linearly  dependent  sets,  129,  311 
Linearly  dependent  vectors,  282-283, 
285 

Linearly  independent  functions: 
higher-order  homogeneous  linear 
ODEs,  106,  109,  113 
second-order  homogeneous  linear 
ODEs,  50,  75 

Linearly  independent  sets,  128-129, 
311 

Linearly  independent  vectors,  282-283 
Linearly  related  variables,  1109 
Linear  mapping,  314.  See  also  Linear 
transformations 
Linear  ODEs,  45,  46 
first  order,  27-36 

Bernoulli  equation,  31-33 
homogeneous,  28 
nonhomogeneous,  28-29 
population  dynamics,  33-34 
higher-order,  105-123 
homogeneous,  105-116 
nonhomogeneous,  116-122 
higher-order  homogeneous,  105 
second-order,  46-104 

homogeneous,  46-78,  103 
nonhomogeneous,  79-102,  103 
Linear  operations: 

Fourier  cosine  and  sine 
transforms  as,  520 
integration  as,  645 
Linear  operators  (second-order 

homogeneous  linear  ODEs),  61 
Linear  optimization,  see  Constrained 
(linear)  optimization 
Linear  PDEs,  541 

Linear  programming  problems,  954—958 
normal  form  of  problems,  955-957 
simplex  method,  958-968 
degenerate  feasible  solution, 
962-965 

difficulties  in  starting,  965-968 


Linear  systems,  138-139,  165, 
272-274,  320,  845 
back  substitution,  274—276 
defined,  267,  845 
elementary  row  operations,  277 
Gauss  elimination,  274-280, 
844-852 

applications,  277-180 
back  substitution,  274—276 
elementary  row  operations,  277 
operation  count,  850-851 
row  echelon  form,  279-280 
Gauss-Jordan  elimination, 

856-857 

homogeneous,  138,  165,  272, 
290-291 

constant-coefficient  systems, 
140-151 

matrices  and  vectors,  124-130 
ill-conditioning,  864-872 

condition  number  of  a matrix, 
868-870 

matrix  norms,  866-868 
vector  norms,  866 
iterative  methods,  858-864 
Gauss-Seidel  iteration, 

858-882 

Jacobi  iteration,  862-863 
LU-factorization,  852-855 
Cholesky’s  method,  855-856 
of  m equations  in  n unknowns,  272 
nonhomogeneous,  138,  160-163, 
272,  290,  291 
solutions  of,  288-291,  898 
Linear  transformations,  320 

motivation  of  multiplication  by, 
265-266 

vector  spaces,  313-317 
Line  integrals,  643-652,  669 
basic  properties  of,  645 
bounds  for,  650-651 
definition  of,  414,  643-645 
existence  of,  646 
indefinite  integration  and 

substitution  of  limits, 
646-647 

path  dependence  of,  and 

integration  around  closed 
curves,  421-425 

representation  of  a path,  647-650 
vector  integral  calculus,  413 — 419 
definition  and  evaluation  of, 
414-416 

path  dependence  of,  418 — 426 
work  done  by  a force,  416-417 
Lines  of  constant  revenue,  954 
Lines  of  force,  760-762 
LINPACK,  789 
Liouville,  Joseph,  499n.4 
Liouville’s  theorem,  666-667 


Lipschitz,  Rudolf,  42n.9 
Lipschitz  condition,  42 
Ljapunov,  Alexander  Michailovich, 
149n.2 

Local  error,  830 
Local  maximum  (unconstrained 
optimization),  952 
Local  minimum  (unconstrained 
optimization),  951 
Local  truncation  error,  902 
Logarithm,  636-639 

natural,  636-638,  642,  A63 
Taylor  series,  695 
Logarithmic  decrement,  70 
Logarithmic  integral,  formula  for,  A69 
Logarithm  of  base  ten,  formula  for, 
A63 

Logistic  equation,  32-33 
Longest  path,  976 
Loss  of  significant  digits  (numeric 
analysis),  793-794 
Lotka,  Alfred  J„  155n.3 
Lotka-Volterra  population  model, 
155-156 

Lot  tolerance  percent  defective 
(LTPD),  1094 

Lower  confidence  limits,  1068 
Lower  control  limit  (LCL),  1088 
Lower  triangular  matrices,  268 
LTPD  (lot  tolerance  percent 
defective),  1094 

LU-factorization  (linear  systems), 
852-855 


Machine  numbers,  792 
Maclaurin,  Colin,  690n.2,  712 
Maclaurin  series,  690,  694-696 
Main  diagonal: 

determinants,  294 
matrix,  125,  258 
Malthus,  Thomas  Robert,  5n.l 
Malthus’  law,  5,  33 
Maple,  789 

Maple  Computer  Guide,  789 
Mapping,  313,  736,  737,  757 
bijective,  737n.  1 
conformal,  736-757 

boundary  value  problems, 
763-767,  A96 
defined,  738 

geometry  of  analytic  functions, 
737-742 
linear  fractional 

transformations, 

742-750 

Riemann  surfaces,  754-756 
by  trigonometric  and 

hyperbolic  analytic 
functions,  750-754 
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Mapping  ( Cont .) 
of  disks,  748-750 
fixed  points  of,  745 
of  half-planes  onto  half-planes,  748 
identity,  745 
injective,  737n.l 
inverse,  741,  745 
linear,  314.  See  also  Linear 
transformations 
one-to-one,  737n.l 
spectral  mapping  theorem,  878 
surjective,  737n.l 
Marconi,  Guglielmo,  63n.3 
Marginal  distributions,  1053-1055, 
1062 

of  continuous  distributions,  1055 
of  discrete  distributions, 

1053-1054 

Mariotte,  Edrne,  19n.5 
Markov,  Andrei  Andrejevitch,  270n.l 
Markov  process,  270,  331 
MATCHING,  ALGORITHM.  1003 
Matching,  1008 

assignment  problems,  1001 
complete,  1002 

maximum  cardinality,  1001,  1008 
Mathcad,  789 
Mathematica,  789 
Mathematica  Computer  Guide,  789 
Mathematical  models,  see  Models 
Mathematical  modeling,  see 
Modeling 

Mathematical  statistics,  1009, 
1063-1113 

acceptance  sampling,  1092-1096 
errors  in,  1093-1094 
rectification,  1094-1095 
confidence  intervals,  1068-1077 
for  mean  of  normal  distribution 
with  known  variance, 
1069-1071 

for  mean  of  normal  distribution 
with  unknown  variance, 
1071-1073 

for  parameters  of  distributions 
other  than  normal,  1076 
for  variance  of  a normal 

distribution,  1073-1076 
correlation  analysis,  1108-1111 
defined,  1103 

test  for  correlation  coefficient, 
1110-1111 
defined,  1063 

goodness  of  fit,  1096-1100 
hypothesis  testing,  1077-1087 
comparison  of  means, 
1084-1085 

comparison  of  variances,  1086 
errors  in  tests,  1080-1081 


for  mean  of  normal  distribution 
with  known  variance, 
1081-1083 

for  mean  of  normal  distribution 
with  unknown  variance, 
1083-1084 
one-  and  two-sided 

alternatives,  1079-1080 
main  purpose  of,  1015 
nonparametric  tests,  1100-1102 
point  estimation  of  parameters, 
1065-1068 

quality  control,  1087-1092 
for  mean,  1088-1089 
for  range,  1090-1091 
for  standard  deviation,  1090 
for  variance,  1089-1090 
random  sampling,  1063-1065 
regression  analysis,  1103-1108 
confidence  intervals  in, 
1107-1108 
defined,  1103 
Matlab,  789 

Matrices,  124-130,  256-262,  320 
addition  and  scalar  multiplication 
of,  259-261 

calculations  with,  126-127 
condition  number  of,  868-870 
definitions  and  terms,  125-126, 
128,  257 
diagonal,  268 

diagonalization  of,  341-342 
eigenvalues,  129-130 
equality  of,  126,  259 
fundamental,  139 
inverse  of,  128,  301-309,  321 
cancellation  laws,  306-307 
determinants  of  matrix 
products,  307-308 
formulas  for,  304-306 
Gauss-Jordan  method, 

302-304,  856-857 
matrix  multiplication,  127, 
263-266,  269-279 
applications  of,  269-279 
cancellation  laws,  306-307 
determinants  of  matrix 
products,  307-308 
scalar,  259-261 
normal,  352,  882 
notation,  258 
orthogonal,  337-338 
rank  of,  283-285 
square,  126 

symmetric  and  skew-symmetric, 
267-268 

transposition  of,  266-267 
triangular,  268 
unitary,  347-350,  353 


Matrix  eigenvalue  problems, 

322-353,  876-896 
applications,  329-334 
choice  of  numeric  method  for, 

879 

complex  matrices  and  forms, 
346-352 

determining  eigenvalues  and 
eigenvectors,  323-329 
diagonalization  of  matrices, 
341-342 

eigenbases,  339-341 
inclusion  theorems,  879-884 
orthogonal  matrices,  337-338 
orthogonal  transformations,  336 
power  method,  885-888 
QR-factorization,  892-896 
quadratic  forms,  343-344 
symmetric  and  skew-symmetric 
matrices,  334—336 
transformation  to  principal  axes, 
344 

tridiagonalization,  888-892 
Matrix  multiplication,  127,  263-266, 
269-279 

applications  of,  269-279 
cancellation  laws,  306-307 
determinants  of  matrix  products, 
307-308 
scalar,  259-261 
Matrix  norms,  861,  866-868 
Maximum  cardinality  matching, 

1001,  1003-1004,  1008^ 
Maximum  flow: 

Ford-Fulkerson  algorithm, 
998-1000 

and  minimum  cut  set,  996 
Maximum  increase: 

gradient  of  a scalar  field,  398 
unconstrained  optimization,  95 1 
Maximum  likelihood  estimates 
(MLEs),  1066-1067 
Maximum  likelihood  method, 
1066-1067,  1113 

Maximum  modulus  theorem,  782-784 
Maximum  principle,  783 
Mean(s),  1013-1014,  1061 
comparison  of,  1084-1085 
control  chart  for,  1088-1089 
of  normal  distributions: 
confidence  intervals  for, 
1069-1073 

hypothesis  testing  for, 
1081-1084 

probability  distributions, 
1035-1039 

addition  of,  1057-1058 
transformation  of,  1036-1037 
sample,  1064 
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Mean  square  convergence  (orthogonal 
series),  507-508 
Mean  value  (fluid  flow),  774n.l 
Mean  value  property: 

analytic  functions,  781-782 
harmonic  functions,  782 
Mean  value  theorem,  395 
for  double  integrals,  427 
for  surface  integrals,  448 
for  triple  integrals,  456-457 
Median,  1013,  1100-1101 
Mendel,  Gregor,  1100 
Meromorphic  function,  719 
Mesh  incidence  matrix,  262 
Mesh  points  (lattice  points,  nodes), 
925-926 
Mesh  size,  924 

Method  of  characteristics  (PDEs),  555 
Method  of  least  squares,  872-876, 

899 

Method  of  moments,  1065 
Method  of  separating  variables, 

12-13 

circular  membrane,  587 
partial  differential  equations, 
545-553,  605 
Fourier  series,  548-551 
satisfying  boundary  conditions, 
546-548 

two  ODEs  from  wave 

equation,  545-546 
vibrating  string,  545-546 
Method  of  steepest  descent,  952-954 
Method  of  undetermined  coefficients: 
higher-order  homogeneous  linear 
ODEs,  115,  123 

nonhomogeneous  linear  systems 
of  ODEs,  161 

second-order  nonhomogeneous 
linear  ODEs,  81-85,  104 
Method  of  variation  of  parameters: 
higher-order  nonhomogeneous 

linear  ODEs,  118-120,  123 
nonhomogeneous  linear  systems 
of  ODEs,  162-163 
second-order  nonhomogeneous 
linear  ODEs,  99-102,  104 
Minimization  (normal  form  of  linear 
optimization  problems),  957 
Minimum  (unconstrained 
optimization),  951 
Minimum  cut  set,  996 
Minors,  of  determinants,  294 
Mixed  boundary  condition  (two- 
dimensional  heat  equation), 

564 

Mixed  boundary  value  problem,  605, 
923.  See  also  Robin  problem 
elliptic  PDEs,  931-933 
heat  conduction,  768-769 


Mixed  type  PDEs,  555 
Mixing  problems,  14 
MLEs  (maximum  likelihood 
estimates),  1066-1067 
ML-inequality,  650-651 
Mobius,  August  Ferdinand,  447n.5 
Mobius  strip,  447 
Mobius  transformations,  743.  See 
also  Linear  fractional 
transformations  (LFTs) 
Models,  2 

Modeling,  1,  2-8,  44 

and  concept  of  solution,  4—6 
defined,  2 

first-order  ODEs,  2-8 
initial  value  problem,  6 
separable  ODEs,  13-17 
typical  steps  of,  6-7 
and  unifying  power  of 
mathematics,  766 
Modification  Rule  (method  of 

undetermined  coefficients): 
higher-order  homogeneous  linear 
ODEs,  115-116 
second-order  nonhomogeneous 
linear  ODEs,  81,  83 
Modulus  (complex  numbers),  613 
Moments,  method  of,  1065 
Moments  of  inertia,  of  a region,  429 
Moment  vector  (vector  moment), 

371 

Monotone  real  sequences, 

A72-A73 

Moore,  Edward  Forrest,  977n.2 
MOORE,  ALGORITHM,  977 
Moore’s  BFS  algorithm,  977-980, 
1008 

Morera’s  theorem,  667 
Moulton,  Forest  Ray,  913n.3 
Multinomial  distribution,  1045 
Multiple  complex  roots,  1 15 
Multiple  points,  curves  with,  383 
Multiplication: 

of  complex  numbers,  609,  610, 
615 

in  conditional  probability, 
1022-1023 

matrix,  127,  263-266 
applications  of,  269-279 
cancellation  laws,  306-307 
determinants  of  matrix 
products,  307-308 
scalar,  259-261 
of  means,  1057-1058 
of  power  series,  687 
scalar,  126-127,  259-261,  310 
termwise,  173,  687 
of  transforms,  232.  See  also 
Convolution 

Multiplicity,  algebraic,  326,  878 


Multiply  connected  domains,  652, 
653 

Cauchy’s  integral  formula, 
662-663 

Cauchy’s  integral  theorem, 
658-659 

Multistep  methods,  911-915,  947 
Adams-Bashforth  methods, 
911-914 

Adams-Moulton  methods, 
913-914 
defined,  908 
first-order  ODEs,  911 
Mutually  exclusive  events,  1016, 
1021 

m X n matrix,  258 


Nabla,  396 

NAG  (Numerical  Algorithms  Group, 
Inc.),  789 

National  Institute  of  Standards  and 
Technology  (NIST),  789 
Natural  condition  (spline 
interpolation),  823 
Natural  frequency,  63 
Natural  logarithm,  636-638,  642, 

A63 

Natural  spline,  823 
u-dimensional  vector  spaces,  31 1 
Negative  (scalar  multiplication),  260 
Negative  definite  (quadratic  form), 
346 

Neighborhood,  619,  720 
Net  flow,  through  cut  set,  994-995 
NETLIB,  789 
Networks: 
defined,  991 

flow  problems  in,  991-997 
cut  sets,  994-996 
flow  augmenting  paths, 
992-993 
paths,  992 

Neumann,  Carl,  198n.7 
Neumann,  John  von,  959n.l 
Neumann  boundary  condition,  564 
Neumann  problem,  605,  923 
elliptic  PDEs,  93 1 
Laplace’s  equation,  593 
two-dimensional  heat  equation, 
564 

Neumann’s  function,  198 
NEWTON,  ALGORITHM,  802 
Newton,  Sir  Isaac,  15n.3 
Newton-Cotes  formulas,  833,  843 
Newton’s  (Gregory-Newton’s) 
backward  difference 
interpolation  formula,  818-819 
Newton’s  divided  difference 

interpolation,  812-815,  842 
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Newton’s  divided  difference 

interpolation  formula,  814—815 
Newton’s  (Gregory-Newton's) 
forward  difference 
interpolation  formula, 

815-818,  842 

Newton’s  law  of  cooling,  15-16 
Newton’s  law  of  gravitation,  377 
Newton’s  (Newton-Raphson) 
method,  801-805,  842 
Newton’s  second  law,  11,  63,  245, 
544,  576 

Neyman,  Jerzy,  1068n.l,  1077n.2 
Nicolson,  Phyllis,  938n.5 
Nicomedes,  391n.4 
Nilpotent  matrices,  270 
NIST  (National  Institute  of  Standards 
and  Technology),  789 
Nodal  incidence  matrix,  262 
Nodal  lines,  580-581,  588 
Nodes,  165,  925-926 
degenerate,  145-146 
improper,  142 
interpolation,  808 
proper,  143 

spline  interpolation,  820 
trapezoidal  rule,  829 
vibrating  string,  547 
Nonbasic  variables,  960 
Nonconservative  physical  systems, 
422 

Nonhomogeneous  linear  ODEs: 
convolution,  235-236 
first-order,  28-29 
higher-order,  106,  116-122 
second-order,  79-102 
defined,  47 

method  of  undetermined 
coefficients,  81-85 
modeling  electric  circuits, 
93-99 

modeling  forced  oscillations, 
85-92 

particular  solution,  80 
solution  by  variation  of 
parameters,  99-102 
Nonhomogeneous  linear  systems, 

138,  160-163,  166,  272,  290, 
291,  845 

method  of  undetermined 
coefficients,  161 

method  of  variation  of  parameters, 
162-163 

Nonhomogeneous  PDEs,  541 
Nonlinear  ODEs,  46 
first-order,  27 
higher-order  homogeneous, 

105 

second-order,  46 
Nonlinear  PDEs,  541 


Nonlinear  systems,  qualitative 
methods  for,  152-160 
linearization,  152-155 
Lotka-Volterra  population  model, 
155-156 

transformation  to  first-order 

equation  in  phase  plane, 
157-159 

Nonparametric  tests  (statistics), 
1100-1102,  1113 
Nonsingular  matrices,  128,  301 
Norm(s): 

matrix,  861,  866-868 
orthogonal  functions,  500 
vector,  312,  355,  410,  866 
Normal  accelerations,  391 
Normal  acceleration  vector,  387 
Normal  derivative,  437 
defined,  437 

mixed  problems,  768,  931 
Neumann  problems,  931 
solutions  of  Laplace’s  equation, 
460 

Normal  distributions,  1045-1051, 
1062 

as  approximation  of  binomial 
distribution,  1049-1050 
confidence  intervals: 

for  means  of,  1069-1073 
for  variances  of,  1073-1076 
distribution  function,  1046-1047 
means  of: 

confidence  intervals  for, 
1069-1073 

hypothesis  testing  for, 
1081-1084 

numeric  values,  1047-1048 
tables,  A101-A102 
two-dimensional,  1110 
working  with  normal  tables, 
1048-1049 

Normal  equations,  873,  1105-1106 
Normal  form  (linear  optimization 

problems),  955-957,  959,  969 
Normalizing,  eigenvectors,  326 
Normal  matrices,  352,  882 
Normal  mode: 

circular  membrane,  588 
vibrating  string,  547-548 
Normal  plane,  390 
Normal  random  variables,  1045 
Normal  vectors,  366,  441 
Not  rejecting  a hypothesis,  1081 
No  trend  hypothesis,  1101 
nth  order  linear  ODEs,  105,  123 
nth-order  ODEs,  134-135 
nth  partial  sum,  170 
Fourier  series,  495 
of  series,  673 
nth  roots,  616 


nth  roots  of  unity,  617 
Null  hypothesis,  1078 
Nullity,  287,  291 
Null  space,  287,  291 
Numbers: 

acceptance,  1092 
Bernoulli’s  law  of  large  numbers, 
1051 

chromatic,  1006 
complex,  608-619,  641 
addition  of,  609,  610 
conjugate,  612 
defined,  608 
division  of,  610 
multiplication  of,  609,  610 
polar  form  of,  613-618 
subtraction  of,  610 
condition,  868-870,  899 
Fibonacci,  690 

floating-point  form  of,  791-792 
machine,  792 
random,  1064 

Number  of  degrees  of  freedom,  1071, 
1074 

Numerics,  see  Numeric  analysis 
Numerical  Algorithms  Group,  Inc. 
(NAG),  789 

Numerically  stable  algorithms,  796, 
842 

Numerical  Recipes,  789 
Numeric  analysis  (numerics), 

787-843 
algorithms,  796 
basic  error  principle,  796 
error  propagation,  795 
errors  of  numeric  results,  794-795 
floating-point  form  of  numbers, 
791-792 

interpolation,  808-820 
equal  spacing,  815-819 
Lagrange,  809-812 
Newton’s  backward  difference 
formula,  818-819 
Newton’s  divided  difference, 
812-815 

Newton’s  forward  difference 
formula,  815-818 
spline,  820-827 

loss  of  significant  digits,  793-794 
numeric  differentiation,  838-839 
numeric  integration,  827-838 
adaptive,  835-836 
Gauss  integration  formulas, 
836-838 

rectangular  rule,  828 

Simpson’s  rule,  831-835 
trapezoidal  rule,  828-831 
for  ODEs,  901-922 
first-order,  901-915 
higher  order,  915-922 
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numeric  integration  ( Cont .) 
for  PDEs,  922-945 
elliptic,  922-936 
hyperbolic,  942-945 
parabolic,  936-942 
roundoff,  792-793 
software  for,  788-789 
solution  of  equations  by  iteration, 
798-808 

fixed-point  iteration,  798-801 
Newton’s  (Newton-Raphson) 
method,  801-805 
secant  method,  805-806 
speed  of  convergence,  804-805 
spline  interpolation,  820-827 
Numeric  differentiation,  838-839 
Numeric  integration,  827-838 
adaptive,  835-836 
Gauss  integration  formulas, 
836-838 

rectangular  rule,  828 
Simpson’s  rule,  831-835 
trapezoidal  rule,  828-83 1 
Numeric  linear  algebra,  844-899 
curve  fitting,  872-876 
least  squares  method,  872-876 
linear  systems,  845 

Gauss  elimination,  844-852 
Gauss-Jordan  elimination, 
856-857 

ill-conditioning  norms, 

864-872 

iterative  methods,  858-864 
LU-factorization,  852-855 
matrix  eigenvalue  problems, 
876-896 

inclusion  theorems,  879-884 
power  method,  885-888 
QR-factorization,  892-896 
tridiagonalization,  888-892 
Numeric  methods: 
choice  of,  791,  879 
defined,  791 
n X n matrix,  125 
Nystrom,  E.  J.,  919 

Objective  function,  951,  969 
OCs  (operating  characteristics),  1081 
OC  curve,  see  Operating 
characteristic  curve 
Odd  functions,  486-488 
Odd  periodic  extension,  488^190 
ODEs,  see  Ordinary  differential 
equations 

Ohm,  Georg  Simon,  93n.7 
Ohm’s  law,  29 

One-dimensional  heat  equation,  559 
One-dimensional  wave  equation, 
544-545 


One-parameter  family  of  curves,  36-37 
One-sided  alternative  (hypothesis 
testing),  1079-1080 
One-sided  tests,  1079 
One-step  methods,  908,  911,  947 
One-to-one  mapping,  737n.l 
Open  annulus,  619 
Open  circular  disk,  619 
Open  integration  formula,  838 
Open  intervals,  4,  A72n.3 
Open  Leontief  input-output  model, 
334 

Open  set,  in  complex  plane,  620 
Operating  characteristic  curve  (OC 
curve),  1081,  1092,  1095 
Operating  characteristics  (OCs),  1081 
Operational  calculus,  60,  203 
Operation  count  (Gauss  elimination), 
850 

Operators,  60-61,  313 
Optimal  solutions  (normal  form  of 
linear  optimization  problems), 
957 

Optimization: 

combinatorial,  970,  975-1008 
assignment  problems, 
1001-1006 

flow  problems  in  networks, 
991-997 

Ford-Fulkerson  algorithm  for 
maximum  flow, 
998-1001 

shortest  path  problems, 
975-980 

constrained  (linear),  951,  954-968 
normal  form  of  problems, 
955-957 

simplex  method,  958-968 
unconstrained: 

basic  concepts,  951-952 
method  of  steepest  descent, 
952-954 

Optimization  methods,  949 
Optimization  problems,  949, 

954-958 

normal  form  of  problems, 

955-957 
objective,  951 
simplex  method,  958-968 
degenerate  feasible  solution, 
962-965 

difficulties  in  starting,  965-968 

Order: 

and  complexity  of  algorithms,  978 
Gauss  elimination,  850 
of  iteration  process,  804 
of  PDE,  540 
singularities,  714 
Ordering  (Greedy  algorithm),  987 
Order  statistics,  1 100 


Ordinary  differential  equations 
(ODEs),  44 
autonomous,  11,  33 
defined,  1,  3^1 
first-order,  2-45 

direction  fields,  9-10 
Euler’s  method,  10-11 
exact,  20-27 

geometric  meanings  of,  9-12 
initial  value  problem,  38-43 
linear,  27-36 
modeling,  2-8 
numeric  analysis,  901-915 
orthogonal  trajectories,  36-38 
separable,  12-20 
higher-order  linear,  105-123 
homogeneous,  105-116,  123 
nonhomogeneous,  116-123 
systems  of,  see  Systems  of 
ODEs 

Laplace  transforms,  203-253 
convolution,  232-237 
defined,  204,  205 
of  derivatives,  211-212 
differentiation  of,  238-240 
Dirac  delta  function,  226-228 
existence,  209-210 
first  shifting  theorem 

(i-shifting),  208-209 
general  formulas,  248 
initial  value  problems, 
213-216 

integral  equations,  236-237 
of  integral  of  a function, 
212-213 

integration  of,  238-240 
linearity  of,  206-208 
notation,  205 
ODEs  with  variable 

coefficients,  240-241 
partial  differential  equations, 
600-603 

partial  fractions,  228-230 
second  shifting  theorem 

(f-shifting),  219-223 
short  impulses,  225-226 
systems  of  ODEs,  242-247 
table  of,  249-251 
uniqueness,  210 
unit  step  function  (Heaviside 
function),  217-219 

linear,  46 
nonlinear,  46 

numeric  analysis,  901-922 
first-order  ODEs,  901-915 
higher  order  ODEs,  915-922 
second-order  linear,  46-104 
homogeneous,  46-79 
nonhomogeneous,  79-102 
second-order  nonlinear,  46 
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Ordinary  differential  equations  ( Cont .) 
series  solutions  of  ODEs,  167-202 
Bessel  functions,  187-194, 
196-200 

Bessel’s  equation,  187-200 
Frobenius  method,  180-187 
Legendre  polynomials, 

177-179 

Legendre’s  equation,  175-  179 
power  series  method,  167-175 
systems  of,  124-166 
basic  theory,  137-139 
constant-coefficient,  140-151 
conversion  of  nth-order  ODEs 
to,  134-135 
homogeneous,  138 
Laplace  transforms,  242-247 
linear,  124-130,  138-151, 
160-163 

matrices  and  vectors,  124—130 
as  models  of  applications, 
130-134 

nonhomogeneous,  138,  160-163 
nonlinear,  152-160 
in  phase  plane,  124,  141-146, 
157-159 

qualitative  methods  for 
nonlinear  systems, 
152-160 

Orientable  surfaces,  446-447 
Oriented  curve,  644 
Oriented  surfaces,  integrals  over, 
446-447 

Origin  (vertex),  980 
Orthogonal,  to  a vector,  362 
Orthogonal  coordinate  curves,  A74 
Orthogonal  expansion,  504 
Orthogonal  functions: 
defined,  500 

Sturm-Liouville  Problems, 
500-503 
Orthogonality: 

trigonometric  system,  479-480,  538 
vector  differential  calculus, 
361-363 

Orthogonal  matrices,  335,  337-338, 
353,  A85n.2 

Orthogonal  polynomials,  179 
Orthogonal  series  (generalized 
Fourier  series),  504—510 
completeness,  508-509 
mean  square  convergence, 

507-508 

Orthogonal  trajectories: 
defined,  36 

first-order  ODEs,  36-38 
Orthogonal  transformations,  336, 
^A85n.2 

Orthogonal  vectors,  312,  362,  410 
Orthonormal  functions,  500,  501,  508 


Orthonormal  system,  337 
Oscillations: 
forced,  85-92 
free,  62-70 
harmonic,  63-64 
second-order  linear  ODEs: 
homogeneous,  62-70 
nonhomogeneous,  85-92 
Osculating  plane,  389,  390 
Outcomes: 

of  experiments,  1015,  1060 
probability  theory,  1015 
Outer  normal  derivative,  460,  931 
Outliers,  1013-1015 
Output  (response  to  input),  27,  86, 
214 

Overdamping,  65-66 
Overdetermined  linear  systems,  277 
Overflow  (floating-point  numbers), 
792 

Overrelaxation  factor,  863 


Paired  comparison,  1084,  1113 
Pappus,  theorem  of,  452 
Pappus  of  Alexandria,  452n.7 
Parabolic  PDEs: 
defined,  923 

numeric  analysis,  936-942 
Parallelogram  law,  357 
Parallel  processing  of  products  (on 
computer),  265 
Parameters,  175,  381,  1112 
estimation  of,  1063 
point  estimation  of,  1065-1068 
probability  distributions, 

1035 

of  a sample,  1065 
Parameter  curves,  442 
Parametric  representations,  381, 
439-441 

Parseval,  Marc  Antoine,  497n.3 
Parseval  equality,  509 
Parseval’ s identity,  497 
Parseval’ s theorem,  497 
Partial  derivatives,  A69-A71 
defined,  A69 
first  (first  order),  A71 
second  (second  order),  A71 
third  (third  order),  A71 
of  vector  functions,  380 
Partial  differential  equations  (PDEs), 
473,  540-605 
basic  concepts  of,  540-543 
d'Alembert’s  solution,  553-556 
defined,  540 

double  Fourier  series  solution, 
577-585 

heat  equation,  557-558 

Dirichlet  problem,  564-566 


Partial  differential  equations  (Cont.) 
Laplace’s  equation,  564 
solution  by  Fourier  integrals, 
568-571 

solution  by  Fourier  series, 
558-563 

solution  by  Fourier  transforms, 
571-574 

steady  two-dimensional  heat 
problems,  546-566 
unifying  power  of  methods, 

566 

homogeneous,  541 
Laplace’s  equation,  593-600 
boundary  value  problem  in 
spherical  coordinates, 
594-596 

in  cylindrical  coordinates, 
593-594 

Fourier-Legendre  series, 
596-598 

in  spherical  coordinates,  594 
Laplace  transforms,  solution  by, 
600-603 

Laplacian  in  polar  coordinates, 
585-592 
linear,  541 

method  of  separating  variables, 
545-553 

Fourier  series,  548-551 
satisfying  boundary  conditions, 
546-548 

two  ODEs  from  wave 

equation,  545-546 
nonhomogeneous,  541 
nonlinear,  541 
numeric  analysis,  922-945 
elliptic,  922-936 
hyperbolic,  942-945 
parabolic,  936-942 
ODEs  vs.,  4 

wave  equation,  544—545 
d’Alembert’s  solution, 

553-556 

solution  by  separating 

variables,  545-553 
two-dimensional,  575-584 
Partial  fractions  (Laplace  transforms), 
228-230 

Partial  pivoting,  276,  846-848,  898 
Partial  sums,  of  series,  477,  478,  495 
Particular  solution(s): 
first-order  ODEs,  6,  44 
higher-order  homogeneous  linear 
ODEs,  106 

nonhomogeneous  linear  systems, 
160 

second-order  linear  ODEs: 
homogeneous,  49-51,  104 
nonhomogeneous,  80 
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Partitioning,  of  a path,  645 
Pascal,  Blaise,  391n.4 
Pascal,  Etienne,  391n.4 
Paths: 

alternating,  1002 
augmenting,  1002-1003 
closed,  414,  645,  975-976 
deformation  of,  656 
directed,  1000 

flow  augmenting,  992-993,  998, 
1008 

flow  problems  in  networks,  992 
integration  by  use  of,  647-650 
longest,  976 
partitioning  of,  645 
principle  of  deformation  of,  656 
shortest,  976 

shortest  path  problems,  975-976 
simple  closed,  652 
Path  dependence  (line  integrals), 
418-426,  470,  649-650 
defined,  418 

and  integration  around  closed 
curves,  421-425 
Path  independence,  669 

Cauchy’s  integral  theorem,  655 
in  a domain  D in  space,  419 
proof  of,  A88-A89 
Stokes’s  Theorem  applied  to, 

468 

Path  of  integration,  414,  644 
Pauli  spin  matrices,  35 1 
p-charts,  1091-1092 
PDEs,  see  Partial  differential 
equations 

Pearson,  Egon  Sharpe,  1077n.2 
Pearson,  Karl,  1077,  1086n.4 
Period,  475 

Periodic  boundary  conditions,  501 
Periodic  extensions,  488-490 
Periodic  function,  474^175,  538 
Periodic  Sturm-Liouville  problem, 
501 

Permutations: 

of  n things  taken  k at  a time, 

1025 

of  n things  taken  k at  a time  with 
repetitions,  1025-1026 
probability  theory,  1024-1026 
Perron,  Oskar,  882n.8 
Perron-Frobenius  Theorem,  883 
Perron’s  theorem,  334,  882-883 
Pfaff,  Johann  Friedrich,  422n.  1 
Pfaffian  form,  422 
p-fold  connected  domains,  652-653 
Phase  angle,  90 
Phase  lag,  90 
Phase  plane,  134,  165 
linear  systems,  141,  148 
nonlinear  systems,  152 


Phase  plane  method,  124 
linear  systems: 

critical  points,  142-146 
graphing  solutions,  141-142 
nonlinear  systems,  152 
linearization,  152-155 
Lotka-Volterra  population 
model,  155-156 
transformation  to  first-order 
equation  in,  157-159 
Phase  plane  representations,  134 
Phase  portrait,  165 

linear  systems,  141-142,  148 
nonlinear  systems,  152 
Picard,  Emile,  42n.l0 
Picard’s  Iteration  Method,  42 
Picard’s  theorem,  716 
Piecewise  continuous  functions,  209 
Piecewise  smooth  path  of  integration, 
414,  645 

Piecewise  smooth  surfaces,  442,  447 

Pivot,  276,  898,  960 

Pivot  equation,  276,  846,  898,  960 

Planar  graphs,  1005 

Plane: 

complex,  611 

extended,  718,  744-745 
finite,  718 
sets  in,  620 
normal,  390 
osculating,  389,  390 
phase,  134,  165 

linear  systems,  141,  148 
nonlinear  systems,  152 
rectifying,  390 
tangent,  398,  441-442 
vectors  in,  309 
Plane  curves,  383 
Planimeters,  436 
Poincare,  Henri,  141n.l,  510n.8 
Points: 

boundary,  426n.2,  620 
branch,  755 
center,  144,  165 
critical,  33,  144,  165 

asymptotically  stable,  149 
and  conformal  mapping,  738, 
757 

constant-coefficient  systems  of 
ODEs,  142-151 
isolated,  152 
nonlinear  systems,  152 
stable,  140,  149 
stable  and  attractive,  140,  149 
unstable,  140,  149 
equilibrium,  33-34 
fixed,  745,  799 
guidepoints,  827 
at  infinity,  718 
initial  (vectors),  355 


Points:  ( Cont .) 
lattice,  925-926 
limit,  A93 
mesh,  925-926 
regular,  181 

regular  singular,  180n.4 
saddle,  143,  165 
sample,  1015 
singular,  181,  201 

analytic  functions,  693 
regular,  180n.4 
spiral"  144-145,  165 
stagnation,  773 
stationary,  952 
terminal  (vectors),  355 
Point  estimation  of  parameters 

(statistics),  1065-1068,  1113 
defined,  1065 

maximum  likelihood  method, 
1066-1067 

Point  set,  in  complex  plane,  620 
Point  source  (flow  modeling),  776 
Point  spectrum,  525 
Poisson,  Simeon  Denis,  779n.2 
Poisson  distributions,  1041-1042, 
1061,  A100 
Poisson  equation: 
defined,  923 

numeric  analysis,  922-936 
ADI  method,  928-930 
difference  equations,  923-925 
Dirichlet  problem,  925-928 
mixed  boundary  value 
problem,  931-933 
Poisson’s  integral  formula: 
derivation  of,  778-778 
potential  theory,  777-781 
series  for  potentials  in  disks, 
779-780 

Polar  coordinates,  431 
Laplacian  in,  585-592 
notation  for,  594 
two-dimensional  wave  equation 
in,  586 

Polar  form,  of  complex  numbers, 
613-618,  631 

Polar  moment  of  inertia,  of  a region, 
429 

Poles  (singularities),  714—715 
of  order  m,  735 
and  zeros,  717 
Polynomials,  624 

characteristic,  325,  353,  877 
Chebyshev,  504 
interpolation,  808,  842 
Laguerre,  241,  504 
Legendre,  167,  177-179,  202 
orthogonal,  179 
trigonometric: 

approximation  by,  495^198 
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Polynomials  ( Cont .) 
complex,  529 
of  the  same  degree  N,  495 
Polynomial  approximations,  808 
Polynomial  interpolation,  808,  842 
Polynomially  bounded,  979 
Polynomial  matrix,  334,  878-879 
Populations: 
infinite,  1044 

for  statistical  sampling,  1063 
Population  dynamics: 
defined,  33 

logistic  equation,  33-34 
Position  vector,  356 
Positive  correlation,  1111 
Positive  definite  (quadratic  form), 

346 

Positive  sense,  on  curve,  644 
Possible  values  (random  variables), 
1030 

Postman  problem,  980 
Potential  (potential  function),  400 
complex,  760-761 
Laplace’s  equation,  593 
Poisson’s  integral  formula  for, 
777-781 

Potential  theory,  179,  420,  460, 
758-786 

conformal  mapping  for  boundary 
value  problems,  763-767 
defined,  758 

electrostatic  fields,  759-763 
complex  potential,  760-761 
superposition,  761-762 
fluid  flow,  771-777 
harmonic  functions,  781-784 
heat  problems,  767-770 
Laplace’s  equation,  593,  628 
Poisson’s  integral  formula,  777-781 
Power  function,  of  a test,  1081,  1113 
Power  method  (matrix  eigenvalue 
problems),  885-888,  899 
Power  series,  168,  671-707 

convergence  behavior  of,  680-682 
convergence  tests,  674—676, 
A93-A94 

functions  given  by,  685-690 
Maclaurin  series,  690 
in  powers  of  x,  168 
radius  of  convergence,  682-684 
ratio  test,  676-678 
root  test,  678-679 
sequences,  671-673 
series,  673-674 
Taylor  series,  690-697 
uniform  convergence,  698-705 
and  absolute  convergence,  704 
properties  of,  700-701 
termwise  integration,  701-703 
test  for,  703-704 


Power  series  method,  167-175,  201 
extension  of,  see  Frobenius  method 
idea  and  technique  of,  168-170 
operations  on,  173-174 
theory  of,  170-174 
Practical  resonance,  90 
Predator-prey  population  model, 
155-156 

Predictor-corrector  method,  913 
PRIM,  ALGORITHM,  989 
Prim,  Robert  Clay,  988n.6 
Prim’s  algorithm,  988-991,  1008 
Principal  axes,  transformation  to,  344 
Principal  branch,  of  logarithm,  639 
Principal  directions,  330 
Principal  minors,  346 
Principal  part,  735 

of  isolated  singularities,  715 
of  singularities,  708,  709 
Principal  value  (complex  numbers), 
614,  617,  642 
complex  logarithm,  637 
general  powers,  639 
Principle  of  deformation  of  path,  656 
Prior  estimates,  805 
Probability,  1060 
axioms  of,  1020 
basic  theorems  of,  1020-1022 
conditional,  1022-1023 
definitions  of,  1018-1020 
independent  events,  1023 
Probability  distributions,  1029,  1061 
binomial,  1039-1042 
continuous,  1032-1034 
discrete,  1030-1032 
hypergeometric,  1042-1044 
mean  and  variance  of,  1035-1039 
multinomial,  1045 
normal,  1045-1051 
Poisson,  1041-1042 
of  several  random  variables, 
1051-1060 

addition  of  means,  1057-1058 
addition  of  variances, 
1058-1059 

continuous  two-dimensional 
distributions,  1053 
discrete  two-dimensional 

distributions,  1052-1053 
function  of  random  variables, 
1056 

independence  of  random 

variables,  1055-1056 
marginal  distributions, 
1053-1055 
symmetric,  1036 
two-dimensional,  1051 
continuous,  1053 
discrete,  1052-1053 
uniform,  1035-1036 


Probability  function,  1030-1032, 
1052,  1061 

Probability  theory,  1009,  1015-1062 
binomial  coefficients,  1027-1028 
combinations,  1024,  1026-1027 
distributions  (probability 
distributions),  1029 
binomial,  1039-1042 
continuous,  1032-1034 
discrete,  1030-1032 
hypergeometric,  1042-1044 
mean  and  variance  of, 
1035-1039 
normal,  1045-1051 
Poisson,  1041-1042 
of  several  random  variables, 
1051-1060 
events,  1016-1017 
experiments,  1015-1016 
factorial  function,  1027 
outcomes,  1015 
permutations,  1024-1026 
probability: 

basic  theorems  of,  1020-1022 
conditional,  1022-1023 
definition  of,  1018-1020 
independent  events,  1023 
random  variables,  1029-1030 
continuous,  1032-1034 
discrete,  1030-1032 
Problem  of  existence,  39 
Problem  of  uniqueness,  39 
Producers,  1092 
Producer’s  risk,  1094 
Product: 

inner  (dot),  312 

for  complex  vectors,  349 
invariance  of,  336 
vector  differential  calculus, 
361-367,  410 
of  matrix,  260 

determinants  of,  307-308 
inverting,  306 

matrix  multiplication,  263,  320 
parallel  processing  of  (on 
computer),  265 
scalar  multiplication,  260 
scalar  triple,  373-374,  411 
vector  (cross): 
in  Cartesian  coordinates, 
A83-A84 

vector  differential  calculus, 
368-375,  410 

Product  method,  605.  See  also 

Method  of  separating  variables 
Projection  (vectors),  365 
Proper  node,  143 
Pseudocode,  796 
Pure  imaginary  complex  numbers, 

609 
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QR-factorization,  892-896 
Quadrant,  of  a circle,  604 
Quadratic  forms  (matrix  eigenvalue 
problems),  343-344 
Quadratic  interpolation, 

810-811 

Qualitative  methods,  124, 

141n.l 
defined,  152 

for  nonlinear  systems,  152-160 
linearization,  152-155 
Lotka-Volterra  population 
model,  155-156 
transformation  to  first-order 

equation  in  phase  plane, 
157-159 

Quality  control  (statistics), 
1087-1092,  1113 
for  mean,  1088-1089 
for  range,  1090-1091 
for  standard  deviation,  1090 
for  variance,  1089-1090 
Quantitative  methods,  124 
Quasilinear  equations,  555,  923 
Quotient: 

complex  numbers,  610 
difference,  923 
Rayleigh,  885,  899 


Radius: 

of  convergence,  172 
defined,  172 

power  series,  682-684,  706 
of  a graph,  991 
Random  experiments,  1011, 
1015-1016,  1060 
Randomly  selected  samples,  1064 
Randomness,  1015,  1064.  See  also 
Random  variables 
Random  numbers,  1064 
Random  number  generators,  1064 
Random  sampling  (statistics), 
1063-1065 

Random  selections,  1064 
Random  variables,  1011,  1029-1030, 
1061 

continuous,  1029,  1032-1034, 
1055 

defined,  1030 
dependent,  1055 
discrete,  1029-1032,  1054 
function  of,  1056 
independence  of,  1055-1056 
marginal  distribution  of,  1054, 
1055 

normal,  1045 
occurrence  of,  1063 
probability  distributions  of, 
1051-1060 


addition  of  means,  1057-1058 

addition  of  variances,  1058-1059 
continuous  two-dimensional 
distributions,  1053 
discrete  two-dimensional 

distributions,  1052-1053 
function  of  random  variables, 
1056 

independence  of  random 

variables,  1055-1056 
marginal  distributions, 
1053-1055 
skewness  of,  1039 
standardized,  1037 
two-dimensional,  1051,  1062 
Random  variation,  1063 
Range,  1013 

control  chart  for,  1090-1091 
defined,  1090 
of/,  620 
Rank: 

of  A,  279 

of  a matrix,  279,  283,  321 
in  terms  of  column  vectors, 
284-285 

in  terms  of  determinants,  297 
of  R,  279 

Raphson,  Joseph,  801n.l 
Rational  functions,  624,  725-729 
Ratio  test  (power  series),  676-678 
Rayleigh,  Lord  (John  William  Strutt), 
l60n.5,  885n.lO 
Rayleigh  equation,  160 
Rayleigh  quotient,  885,  899 
Reactance  (RLC  circuits),  94 
Real  axis  (complex  plane),  611 
Real  different  roots,  7 1 
Real  double  root,  55-56,  72 
Real  functions,  complex  analytic 
functions  vs.,  694 
Real  inner  product  space,  312 
Real  integrals,  residue  integration  of, 
725-733 

Fourier  integrals,  729-730 
improper  integrals,  730-732 
of  rational  functions  of  cos  0 
sin  d , 725-729 

Real  part  (complex  numbers),  609 
Real  pre-Hilbert  space,  312 
Real  roots: 
different,  71 
double,  55-56 

higher-order  homogeneous  linear 
ODEs: 

distinct,  112-113 
multiple,  114-115 
second-order  homogeneous  linear 
ODEs: 

distinct,  54-55 
double,  55-56 


Real  sequence,  671 
Real  series,  A73-A74 
Real  vector  spaces,  309-311,  359, 
410 

Recording,  of  sample  values, 
1011-1012 

Rectangular  cross-section,  120 
Rectangular  matrix,  258 
Rectangular  membrane  R,  577-584 
Rectangular  rule  (numeric 
integration),  828 
Rectifiable  (curves),  385 
Rectification  (acceptance  sampling), 
1094-1095 

Rectifying  plane,  390 
Recurrence  formula,  201 
Recurrence  relation,  176 
Recursion  formula,  176 
Reduced  echelon  form,  279 
Reduction  of  order  (second-order 
homogeneous  linear  ODEs), 
51-52 

Regions,  426n.2 
bounded,  426n.2 
center  of  gravity  of  mass  in,  429 
closed,  426n.2 
critical,  1079 
feasibility,  954 
fundamental  (exponential 
function),  632 
moments  of  inertia  of,  429 
polar  moment  of  inertia  of,  429 
rejection,  1079 
sets  in  complex  plane,  620 
total  mass  of,  429 
volume  of,  428 
Regression  analysis,  1063, 
1103-1108,  1113 
confidence  intervals  in, 
1107-1108 
defined,  1103 

Regression  coefficient,  1105, 
1107-1108 

Regression  curve,  1103 
Regression  line,  1103,  1104,  1106 
Regular  point,  181 
Regular  singular  point,  180n.4 
Regular  Sturm-Liouville  problem, 
501 

Rejectable  quality  level  (RQL),  1094 
Rejection: 

of  a hypothesis,  1078 
of  products,  1092 
Rejection  region,  1079 
Relative  class  frequency,  1012 
Relative  error,  794 
Relative  frequency  (probability): 
of  an  event,  1019 
class,  1012 
cumulative,  1012 
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Relaxation  methods,  862 
Remainder,  170 
of  a series,  673 
of  Taylor  series,  691 
Remarkable  parallelogram,  375 
Removable  singularities,  717 
Repeated  factors,  220,  221 
Representation,  315 
by  Fourier  series,  476 
by  power  series,  683 
spectral,  525 
Residual,  805,  862,  899 
Residues,  708,  720,  735 
at  mth-order  pole,  722 
at  simple  poles,  721-722 
Residue  integration,  719-733 
formulas  for  residues,  721-722 
of  real  integrals,  725-733 
Fourier  integrals,  729-730 
improper  integrals,  730-732 
of  rational  functions  of  cos  6 
sin  d , 725-729 
several  singularities  inside 
contour,  723-725 
Residue  theorem,  723-724 
Resistance,  apparent,  95 
Resonance: 
practical,  90 

undamped  forced  oscillations, 
88-89 

Resonance  factor,  88 
Response  to  input,  see  Output 
(response  to  input) 

Resultant,  of  forces,  357 
Riccati  equation,  35 
Riemann,  Bernhard,  625n.4 
Riemannian  geometry,  625n.4 
Riemann  sphere,  718 
Riemann  surfaces  (conformal 
mapping),  754-757 
Right-hand  derivatives  (Fourier 
series),  480 

Right-handed  Cartesian  coordinate 
system,  368-369,  A83-A84 
Right-handed  triple,  369 
Right-hand  limit  (Fourier  series),  480 
Right-sided  tests,  1079,  1082 
Risks  of  making  false  decisions,  1080 
RKF  method,  see 

Runge-Kutta-Fehlberg  method 
RK  methods,  see  Runge-Kutta 
methods 

RKN  methods,  see 

Runge-Kutta-Nystrom  methods 
Robin  problem: 

Laplace’s  equation,  593 
two-dimensional  heat  equation,  564 
Rodrigues,  Olinde,  179n.2 
Rodrigues’s  formula,  179,  241 
Romberg  integration,  840,  843 


Roots: 

complex: 

higher-order  homogeneous 
linear  ODEs,  113-115 
second-order  homogeneous 
linear  ODEs,  57-59 
complex  conjugate,  72-73 
differing  by  an  integer,  183 
Frobenius  method,  183 
distinct  (Frobenius  method),  182 
double  (Frobenius  method),  183 
of  equations,  798 
multiple  complex,  115 
nth,  616 

nth  roots  of  unity,  617 
simple  complex,  113-114 
Root  test  (power  series),  678-679 
Rotation  (vorticity  of  flow),  774 
Rounding,  792 
Rounding  unit,  793 
Roundoff  (numeric  analysis),  792-793 
Roundoff  errors,  792,  794,  902 
Roundoff  rule,  793 
Rows: 

determinants,  294 
matrix,  125,  257,  320 
Row  echelon  form,  279-280 
Row-equivalent  matrices,  283-284 
Row-equivalent  systems,  277 
Row  operations  (linear  systems),  276, 
277 

Row  scaling  (Gauss  elimination),  850 
Row  “sum”  norm,  861 
Row  vectors,  126,  257,  320 
RQL  (rejectable  quality  level),  1094 
Runge,  Carl,  820n.3 
Runge,  Karl,  905n.l 
RUNGE-KUTTA,  ALGORITHM,  905 
Runge-Kutta-Fehlberg  (RKF) 
method,  947 
error  of,  908 

first-order  ODEs,  906-908 
Runge-Kutta  (RK)  methods,  915,  947 
error  of,  908 

first-order  ODEs,  904-906 
higher  order  ODEs,  917-919 
Runge-Kutta-Nystrom  (RKN ) 
methods,  919-921,  947 
Rutherford,  E.,  1044,  1100 
Rutherford-Geiger  experiments, 

1044,  1 100 

Rutishauser,  Heinz,  892n.  12 


Saddle  point,  143,  165 
Samples: 

for  experiments,  1015 
in  mathematical  statistics, 
1063-1064 

selection  of,  1063-1064 


Sample  covariance,  1105 
Sampled  function,  529 
Sample  distribution  function,  1096 
Sample  mean,  1064,  1113 
Sample  points,  1015 
Sample  regression  line,  1104 
Sample  size,  1015,  1064 
Sample  space,  1015,  1016,  1060 
Sample  standard  deviation,  1065 
Sample  variance,  1015,  1113 
Sampling: 

from  a population,  1023 
random,  1063-1065 
with  replacement,  1023 

binomial  distribution,  1042 
hypergeometric  distribution, 
1043-1044 
in  statistics,  1063 
without  replacement,  1018,  1023 
binomial  distribution, 

1042- 1043 

hypergeometric  distribution, 

1043- 1044 

Sampling  plan,  1092-1093 
Scalar(s),  260,  310,  354 
Scalar  fields,  vector  fields  that  are 
gradients  of,  400-401 
Scalar  functions: 
defined,  376 

vector  differential  calculus,  376 
Scalar  matrices,  268 
Scalar  multiplication,  126-127,  310 
of  matrices  and  vectors,  259-261 
vectors  in  2-space  and  3-space, 
358-359 

Scalar  triple  product,  373-374,  411 
Scale  (vectors),  886-887 
Scanning  labeled  vertices,  998 
Schrodinger,  Erwin.  226n.2 
Schur,  Issai,  882n.7 
Schur’s  inequality,  882 
Schur’ s theorem,  882 
Schwartz,  Laurent,  226n.2 
Secant,  formula  for,  A65 
Secant  method  (numeric  analysis), 
805-806,  842 

Second  boundary  value  problem,  see 
Neumann  problem 
Second-order  determinants,  291-292 
Second-order  differential  operator,  60 
Second-order  linear  ODEs,  46-104 
homogeneous,  46-79 
basis,  50-52 

with  constant  coefficients, 
53-60 

differential  operators,  60-62 
Euler-Cauchy  equations, 

71-74 

existence  and  uniqueness  of 
solutions,  74-79 
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Second-order  linear  ODEs  ( Cont .) 

general  solution,  49-51,  77-78 
initial  value  problem,  49-50 
modeling  free  oscillations  of 
mass-spring  system, 
62-70 

reduction  of  order,  51-52 
superposition  principle,  47-48 
Wronskian,  75-78 
nonhomogeneous,  79-102 
defined,  47 

general  solution,  80-81 
method  of  undetermined 
coefficients,  81-85 
modeling  electric  circuits,  93-99 
modeling  forced  oscillations, 
85-92 

solution  by  variation  of 
parameters,  99-102 
Second-order  method,  improved 
Euler  method  as,  904 
Second-order  nonlinear  ODEs,  46 
Second-order  PDEs,  540-541 
Second  (second  order)  partial 
derivatives,  A71 

Second  shifting  theorem  (f-shifting), 
219-223 

Second  transmission  line  equation, 
599 

Seidel,  Philipp  Ludwig  von,  858n.4 
Self-starting  methods,  911 
Sense  reversal  (complex  line 
integrals),  645 
Separable  equations,  12-13 
Separable  ODEs,  44 
first-order,  12-20 

extended  method,  17-18 
modeling,  13-17 
reduction  of  nonseparable  ODEs 
to,  17-18 

Separating  variables,  method  of, 

12-13 

circular  membrane,  587 
partial  differential  equations, 
545-553,  605 
Fourier  series,  548-551 
satisfying  boundary  conditions, 
546-548 

two  ODEs  from  wave 

equation,  545-546 
vibrating  string,  545-546 
Separation  constant,  546 
Sequences  (infinite  sequences): 
bounded,  A93-A95 
convergent,  507-508,  672 
divergent,  672 
limit  point  of,  A93 
monotone  real,  A72-A73 
power  series,  671-673 
real,  671 


Series,  A73-A74 
binomial,  696 

conditionally  convergent,  675 
convergent,  171,  673 
cosine,  781 
derived,  687 
divergent,  171,  673 
double  Fourier: 
defined,  582 
rectangular  membrane, 

577-585 

Fourier,  473-483,  538 
convergence  and  sum  or, 
480-481 

derivation  of  Euler  formulas, 
479-480 
double,  577-585 
even  and  odd  functions, 
486-488 

half-range  expansions,  488-490 
heat  equation,  558-563 
from  period  2tt  to  2 L, 

483-486 

Fourier-Bessel,  506-507,  589 
Fourier  cosine,  484,  486,  538 
Fourier-Legendre,  505-506, 
596-598 

Fourier  sine,  477,  486,  538 

one-dimensional  heat  equation, 
561 

vibrating  string,  548 
geometric,  168,  675 
Taylor  series,  694 
uniformly  convergent,  698 
hypergeometric,  186 
infinite,  673-674 
Laurent,  708-719,  734 

analytic  or  singular  at  infinity, 
718-719 

point  at  infinity,  718 
Riemann  sphere,  718 
singularities,  715-717 
zeros  of  analytic  functions,  717 
Maclaurin,  690,  694-696 
orthogonal,  504-510 
completeness,  508-509 
mean  square  convergence, 
507-508 

power,  168,  671-707 

convergence  behavior  of, 
680-682 

convergence  tests,  674-676, 
A93-A94 

functions  given  by,  685-690 
Maclaurin  series,  690 
in  powers  of  x,  168 
radius  of  convergence, 

682-684 

ratio  test,  676-678 
root  test,  678-679 


Series  (Cont.) 

sequences,  671-673 
series,  673-67 4 
Taylor  series,  690-697 
uniform  convergence,  698-705 
real,  A73-A74 
Taylor,  690-697,  707 
trigonometric,  476,  484 
value  (sum)  of,  171,  673 
Series  solutions  of  ODEs,  167-202 
Bessel  functions,  187-188 
of  the  first  kind,  189-194 
of  the  second  kind,  196-200 
Bessel’s  equation,  187-196 
Bessel  functions,  187-188, 
196-200 

general  solution,  194-200 
Frobenius  method,  180-187 
indicial  equation,  181-183 
typical  applications,  183-185 
Legendre  polynomials,  177-179 
Legendre’s  equation,  175-  179 
power  series  method,  167-175 
idea  and  technique  of, 

168-170 

operations  on,  173-174 
theory  of,  170-174 

Sets: 

complete  orthonormal,  508 
in  the  complex  plane,  620 
cut,  994-996,  1008 
linearly  dependent,  129,  311 
linearly  independent,  128-129, 

311 

Shewhart,  W.  A.,  1088 
Shifted  function,  219 
Shortest  path,  976 
Shortest  path  problems 

(combinatorial  optimization), 
975-980,  1008 
Bellman’s  principle,  980-981 
complexity  of  algorithms, 

978-980 

Dijkstra’s  algorithm,  981-983 
Moore’s  BFS  algorithm,  977-980 
Shortest  spanning  trees: 

combinatorial  optimization,  1008 
Greedy  algorithm,  984-988 
Prim’s  algorithm,  988-991 
defined,  984 

Short  impulses  (Laplace  transforms), 
225-226 

Sifting  property,  226 
Significance  (in  statistics),  1078 
Significance  level,  1078,  1080,  1113 
Significance  tests,  1078 
Significant  digits,  791-792 
Similarity  transformation,  340 
Similar  matrices,  340-341,  878 
Simple  closed  curves,  646 
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Simple  closed  path,  652 
Simple  complex  roots,  113-114 
Simple  curves,  383 
Simple  events,  1015 
Simple  general  properties  of  the  line 
integral,  415-416 
Simple  poles,  714 
Simplex  method,  958-968 
degenerate  feasible  solution, 
962-965 

difficulties  in  starting,  965-968 
Simplex  table,  960 
Simplex  tableau,  960 
Simple  zero,  717 

Simply  connected  domains,  423,  646, 
652,  653 

SIMPSON,  ALGORITHM,  832 
Simpson,  Thomas,  832n.4 
Simpson's  rule,  832,  843 

adaptive  integration  with,  835-836 
numeric  integration,  831-835 
Simultaneous  corrections,  862 
Sine  function: 

conformal  mapping  by,  750-751 
formula  for,  A63-A65 
Sine  integral,  514,  697,  A68-A69,  A98 
Single  precision,  floating-point 
standard  for,  792 
Singularities  (singular,  having  a 
singularity),  693,  707,  715 
analytic  functions,  693 
essential,  715-716 
inside  a contour,  723-725 
isolated,  715 
isolated  essential,  715 
Laurent  series,  715-719 
principal  part  of,  708 
removable,  717 
Singular  matrices,  301 
Singular  point,  181,  201 
analytic  functions,  693 
regular,  180n.4 
Singular  solutions: 

first-order  ODEs,  8,  35 
higher-order  homogeneous  linear 
ODEs,  110 

second-order  homogeneous  linear 
ODEs,  50,  78 

Singular  Sturm-Liouville  problem, 
501,  503 
Sink(s): 

motion  of  a fluid,  404,  458,  775, 
776 

networks,  991 
Size: 

of  matrices,  258 
sample,  1015,  1064 
Skew-Hermitian  form,  351 
Skew-Hermitian  matrices,  347,  348, 
350,  353 


Skewness,  of  a random  variables,  1039 
Skew-symmetric  matrices,  268,  320, 
334-336,  353 
Slack  variables,  956,  969 
Slope  field  (direction  field),  9-10 
Smooth  curves,  414,  644 
Smooth  surfaces,  442 
Sobolev,  Sergei  L’Vovich,  226n.2 
Software: 

for  data  representation  in  statistics, 
1011 

numeric  analysis,  788-789 
variable  step  size  selection  in,  902 
Solenoid,  405 

Solutions.  See  also  specific  methods 
defined,  4,  798 
first-order  ODEs: 
concept  of,  4-6 
equilibrium  solutions,  33-34 
explicit  solutions,  21 
family  of  solutions,  5 
general  solution,  6,  44 
implicit  solutions,  21 
particular  solution,  6,  44 
singular  solution,  8,  35 
solution  by  calculus,  5 
trivial  solution,  28,  35 
graphing  in  phase  plane,  141-142 
higher-order  homogeneous  linear 
ODEs,  106 

general  solution,  106,  110-111 
particular  solution,  106 
singular  solution,  110 
linear  systems,  273,  745 
nonhomogeneous  linear  systems: 
general  solution,  160 
particular  solution,  160 
PDEs,  541 

second-order  homogeneous  linear 
ODEs: 

general  solution,  49-51,  77-78 
linear  dependence  and 

independence  of,  75 
particular  solution,  49-51 
singular  solution,  50,  78 
second-order  linear  ODEs,  47 
second-order  nonhomogeneous 
linear  ODEs: 
general  solution,  80-81 
particular  solution,  80 
systems  of  ODEs,  137,  139 
Solution  curves,  4-6 
Solution  space,  290 
Solution  vector,  273,  745 
SOR  (successive  overrelaxation),  863 
SOR  formula  for  Gauss-Seidel,  863 
Sorting,  of  sample  values,  1011-1012 
Source(s): 

motion  of  a fluid,  404,  458,  775 
networks,  991 


Source  intensity,  458 
Source  line  (flow  modeling),  776 
Span,  of  vectors,  286 
Spanning  trees,  984,  988 
Sparse  graphs,  974 
Sparse  matrices,  823,  925 
Sparse  systems,  858 
Special  functions,  167,  202 
formulas  for,  A63-A69 
theory  of,  175 

Special  vector  spaces,  285-287 
Specific  circulation,  of  flow,  467 
Spectral  density,  525 
Spectral  mapping  theorem,  878 
Spectral  radius,  324,  861 
Spectral  representation,  525 
Spectral  shift,  896 
Spectrum,  877 
of  matrix,  324 
vibrating  string,  547 
Speed,  386,  391 

angular  (rotation),  372 
of  convergence,  804-805 
Spherical  coordinates,  A74-A76 
boundary  value  problem  in, 
594-596 
defined,  594 
Laplacian  in,  594 
Spiral  point,  144-145,  165 
Spline,  821,  843 
Spline  interpolation,  820-827 
Spring  constant,  62 
Square  error,  496^-97,  539 
Square  matrices,  126,  257,  258, 
301-309,  320 
^-shifting,  208-209 
Stability: 

of  critical  points,  165 
of  solutions,  33-34,  124,  936 
of  systems,  84,  124 
Stability  chart,  149 
Stable  algorithms,  796,  842 
Stable  and  attractive  critical  points, 
140,  149 

Stable  critical  points,  140,  149 
Stable  equilibrium  solution,  33-34 
Stable  systems,  84 
Stagnation  points,  773 
Standard  basis,  314,  359,  365 
Standard  deviation,  1014,  1035,  1090 
Standard  form: 

first-order  ODEs,  27 
higher-order  homogeneous  linear 
ODEs,  105 

higher-order  linear  ODEs,  123 
power  series  method,  172 
second-order  linear  ODEs,  46, 

103 

Standardized  normal  distribution, 
1046 
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Standardized  random  variables,  1037 
Standard  trick  (confidence  intervals), 
1068 

Stationary  point  (unconstrained 
optimization),  952 
Statistics,  1015,  1063.  See  also 
Mathematical  statistics 
Statistical  inference,  1059,  1063 
Steady  flow,  405,  458 
Steady  heat  flow,  767 
Steady-state  case  (heat  problems), 

591 

Steady-state  current,  98 
Steady-state  heat  flow,  460 
Steady-state  solution,  31,  84,  89-91 
Steady  two-dimensional  heat 
problems,  546-566,  605 
Steepest  descent,  method  of,  952-954 
Steiner,  Jacob,  451n.6 
Stem-and-leaf  plots,  1012 
Stencil  (pattern,  molecule,  star),  925 
Step-by-step  methods,  901 
Step  function,  828,  1031 
Step  size,  901,  902 
Stereographic  projection,  718 
Stiff  ODEs,  909-910 
Stiff  systems,  920-921 
Stirling,  James,  1027n.2 
Stirling  formula,  1027,  A67 
Stochastic  matrices,  270 
Stochastic  variables,  1029.  See  also 
Random  variables 
Stokes,  Sir  George  Gabriel,  464n.9, 
703n.5 

Stokes’s  Theorem,  463-470 
Stream  function,  771 
Streamline,  771 
Strength  (flow  modeling),  776 
Strictly  diagonally  dominant  matrices, 
881 

Sturm,  Jacques  Charles  Francois, 
499n.4 

Sturm-Liouville  equation,  499 
Sturm-Liouville  expansions,  474 
Sturm-Liouville  Problems,  498-504 
eigenvalues,  eigenfunctions, 
499-500 

orthogonal  functions,  500-503 
Subgraphs,  972 

Submarine  cable  equations,  599 
Submatrices,  288 
Subsidiary  equation,  203,  253 
Subspace,  of  vector  space,  286 
Subtraction: 

of  complex  numbers,  610 
termwise,  of  power  series,  687 
Success  corrections,  862 
Successive  overrelaxation  (SOR),  863 
Sufficient  convergence  condition, 

861 


Sum: 

of  matrices,  320 

partial,  of  series,  477,  478,  495 

of  a series,  171,  673 

of  vectors,  357 

Sum  Rule  (method  of  undetermined 
coefficients): 

higher-order  homogeneous  linear 
ODEs,  115 

second-order  nonhomogeneous 
linear  ODEs,  81,  83-84 
Superlinear  convergence,  806 
Superposition  (electrostatic  fields), 
761-762 

Superposition  (linearity)  principle: 
higher-order  homogeneous  linear 
ODEs,  106 

higher-order  linear  ODEs,  123 
homogeneous  linear  systems, 

138 

PDEs,  541-542 

second-order  homogeneous  linear 
ODEs,  47-48,  104 
undamped  forced  oscillations,  87 
Surfaces,  for  surface  integrals, 
439-443 

orientation  of,  446-447 
representation  of  surfaces, 
439-441 

tangent  plane  and  surface  normal, 
441-442 

Surface  integrals,  470 
defined,  443 
surfaces  for,  439-443 
orientation  of,  446-447 
representation  of  surfaces, 
439-441 

tangent  plane  and  surface 
normal,  441^-42 
vector  integral  calculus,  443^152 
orientation  of  surfaces, 
446^147 

without  regard  to  orientation, 
448-450 

Surface  normal,  398-399,  442 
Surface  normal  vector,  398-399 
Surjective  mapping,  737n.l 
Sustainable  yield,  36 
Symbol  O,  979 

Symmetric  coefficient  matrix,  343 
Symmetric  distributions,  1036 
Symmetric  matrices,  267-268,  320, 
334-336,  353 

Systems  of  ODEs,  124-166 
basic  theory  of,  137-139 
constant-coefficient,  140-151 
critical  points,  142-146, 
148-151 

graphing  solutions  in  phase 
plane,  141-142 


Systems  of  ODEs  ( Cont .) 

conversion  of  nth-order  ODEs  to, 
134-135 

homogeneous,  138 
Laplace  transforms,  242-247 
linear,  138-139.  See  also  Linear 
systems 

constant-coefficient  systems, 
140-151 

matrices  and  vectors,  124-130 
nonhomogeneous,  160-163 
matrices  and  vectors,  124-130 
calculations  with,  125-127 
definitions  and  terms, 

125-126,  128-129 
eigenvalues  and  eigenvectors, 
129-130 

systems  of  ODEs  as  vector 
equations,  127-128 
as  models  of  applications: 
electrical  network,  132-134 
mixing  problem  involving  two 
tanks,  130-132 

nonhomogeneous,  138,  160-163 
method  of  undetermined 
coefficients,  161 
method  of  variation  of 

parameters,  162-163 
nonlinear  systems: 

qualitative  methods  for, 
152-160 

transformation  to  first-order 

equation  in  phase  plane, 
157-159 
in  phase  plane,  124 

critical  points,  142-146 
graphing  solutions  in,  141-142 
transformation  to  first-order 
equation  in,  157-159 
qualitative  methods  for  nonlinear 
systems,  152-160 
linearization,  152-155 
Lotka-Volterra  population 
model,  155-156 


Tangent: 

to  a curve,  384 
formula  for,  A65 

Tangent  function,  conformal  mapping 
by,  752-753 

Tangential  accelerations,  391 
Tangential  acceleration  vector,  387 
Tangent  plane,  398,  441-442 
Tangent  vector,  384,  411 
Target  (networks),  991 
Taylor,  Brook,  690n.2 
Taylor  series,  690-697,  707 
Taylor’s  formula,  691 
Taylor’s  theorem,  691 
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Index 


^-distribution,  1071-1073,  1078, 

A103 

Telegraph  equations,  599 
Term(s): 

of  a sequence,  67 1 
of  a series,  673 
Terminal  point  (vectors),  355 
Termination  criterion,  802-803 
Termwise  addition,  173,  687 
Termwise  differentiation,  173, 
687-688,  703 

Termwise  integration,  687,  688, 
701-703 

Termwise  multiplication,  173,  687 
Termwise  subtraction,  687 
Tests,  statistical,  1077,  1113 
Theory  of  special  functions,  175 
Thermal  diffusivity,  460 
Third  boundary  value  problem,  see 
Robin  problem 

Third-order  determinants,  292-293 
Third  (third  order)  partial  derivatives, 
A71 

3-space,  vectors  in,  309,  354 

components  of  a vector,  356-357 
scalar  multiplication,  358-359 
vector  addition,  357-359 
Three-sigma  limits,  1047 
Time  (curves  in  mechanics),  386 
TI-Nspire,  789 
Todd,  John,  855n.3 
Tolerance  (adaptive  integration),  835 
Torricelli,  Evangelista,  16n.4 
Torricelli’s  law,  16-17 
Torsion,  curvature  and,  389-390 
Total  differential,  20,  45 
Total  energy,  of  physical  system,  525 
Total  error,  902 
Total  mass,  of  a region,  429 
Total  orthonormal  set,  508 
Total  pivoting,  846 
Trace,  345 

Trail  (shortest  path  problems),  975 
closed  trails,  975-976 
Euler  trail,  980 
Trajectories,  134,  165 

linear  systems,  141-142,  148 
nonlinear  systems,  152 
Transcendental  equations,  798 
Transducers,  98 
Transfer  function,  214 
Transformation(s),  313 
orthogonal,  336 
to  principal  axes,  344 
Transient  solution,  84,  89 
Transient-state  solution,  31 
Translation  (vectors),  355 
Transposition(s): 

of  matrices  or  vectors,  128,  320 
in  samples,  1101 


Trapezoidal  rule,  828,  843 

error  bounds  and  estimate  for, 
829-831 

numeric  integration,  828-831 
Trees  (graphs),  984,  988.  See  also 
Shortest  spanning  trees 
Trials  (experiments),  1011,  1015 
Triangle  inequality,  363,  614-615 
Triangular  form  (Gauss  elimination), 
^846 

Triangular  matrices,  268 
Tricomi,  Francesco,  556n.2 
Tricomi  equation,  555,  556 
Tridiagonalization  (matrix  eigenvalue 
problems),  888-892 
Tridiagonal  matrices,  823,  888,  928 
Trigonometric  analytic  functions 

(conformal  mapping),  750-754 
Trigonometric  function,  633-635, 

642 

inverse,  640 
Taylor  series,  695 
Trigonometric  polynomials: 
approximation  by,  495-498 
complex,  529 
of  the  same  degree  N,  495 
Trigonometric  series,  476,  484 
Trigonometric  system,  475,  479-480, 
538 

Trihedron,  390 
Triple  integrals,  470 
defined,  452 

mean  value  theorem  for,  456-457 
vector  integral  calculus,  452^158 
Triply  connected  domains,  653,  658, 
659 

Trivial  solution,  28,  35 

homogeneous  linear  systems, 

290 

linear  systems,  273 
Sturm-Liouville  problem,  499 
Truncating,  794 
f-shifting,  219-223 
Tuning  (vibrating  string),  548 
Twisted  curves,  383 
2-space  (plane),  vectors  in,  354 
components  of  a vector,  356-357 
scalar  multiplication,  358-359 
vector  addition,  357-359 
2X2  matrix,  125 
Two-dimensional  heat  equation, 
564-566 

Two-dimensional  normal  distribution, 
1110 

Two-dimensional  probability 
distributions: 
continuous,  1053 
discrete,  1052-1053 
Two-dimensional  problems  (potential 
theory),  759,  771 


Two-dimensional  random  variables, 
1051,  1062 

Two-dimensional  wave  equation, 
575-584,  586 

Two-sided  alternative  (hypothesis 
testing),  1079-1080 
Two-sided  tests,  1079,  1082-1083 
Type  I errors,  1080,  1081 
Type  II  errors,  1080-1081 


UCL  (upper  control  limit),  1088 
Unacceptable  lots,  1094 
Unconstrained  optimization,  969 
basic  concepts,  951-952 
method  of  steepest  descent, 
952-954 

Uncorrelated  related  variables,  1109 
Underdamping,  65,  67 
Underdetermined  linear  systems,  277 
Underflow  (floating-point  numbers), 
792 

Undetermined  coefficients,  method  of: 
higher-order  homogeneous  linear 
ODEs,  115  " 

higher-order  linear  ODEs,  123 
nonhomogeneous  linear  systems 
of  ODEs,  161 
second-order  linear  ODEs: 
homogeneous,  104 
nonhomogeneous,  81-85 
Uniform  convergence: 

and  absolute  convergence,  704 
power  series,  698-705 
properties  of  uniform 

convergence,  700-701 
termwise  integration,  701-703 
test  for,  703-704 

Uniform  distributions,  1035-1036, 
1053 

Unifying  power  of  mathematics,  97 
Union,  of  events,  1016-1017 
Uniqueness: 

of  Laplace  transforms,  210 
of  Laurent  series,  712 
of  power  series  representation, 
685-686 
problem  of,  39 
Uniqueness  theorems: 
cubic  splines,  822 
Dirichlet  problem,  462,  784 
first-order  ODEs,  39-42 
higher-order  homogeneous  linear 
ODEs,  108 

Laplace’s  equation,  462 
linear  systems,  138 
proof  of,  A77-A79 
second-order  homogeneous  linear 
ODEs,  74 

systems  of  ODEs,  137 


Index 
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Unitary  matrices,  347-350,  353 
Unitary  systems,  349 
Unitary  transformation,  349 
Unit  binormal  vector,  389 
Unit  circle,  617,  619 
Unit  impulse  function,  226.  See  also 
Dirac  delta  function 
Unit  matrices,  128,  268 
Unit  normal  vectors,  366,  441 
Unit  principal  normal  vector,  389 
Unit  step  function  (Heaviside 
function),  217-219 
Unit  tangent  vector,  384 
Unit  vectors,  312,  355 
Universal  gravitational  constant,  63 
Unknowns,  257 
Unrepeated  factors,  220-221 
Unstable  algorithms,  796 
Unstable  critical  points,  140,  149 
Unstable  equilibrium  solution, 

33-34 

Unstable  systems,  84 
Upper  bound,  for  flows,  995 
Upper  confidence  limits,  1068 
Upper  control  limit  (UCL),  1088 
Upper  triangular  matrices,  268 


Value  (sum)  of  series,  171,  673 
Vandermonde,  Alexandre  Theophile, 
113n.l 

Vandermonde  determinant,  113 
Van  der  Pol.  Balthasar,  158n.4 
Van  der  Pol  equation,  158-160 
Variables: 

artificial,  965-968 
basic,  960 
complex,  620-621 
control,  951 
controlled,  1 103 
dependent,  393,  1055,  1056 
independent,  393,  1103 
intermediate,  393 
linearly,  1 109 
nonbasic,  960 

random,  1011,  1029-1030,  1061 
continuous,  1029,  1032-1034, 
1055 

defined,  1030 
dependent,  1055 
discrete,  1029-1032,  1054 
function  of,  1056 
independence  of,  1055-1056 
marginal  distribution  of,  1054, 
1055 

normal,  1045 
occurrence  of,  1063 
probability  distributions  of. 

1051-1060 
skewness  of,  1039 


Variables:  (Cont.) 

standardized,  1037 
two-dimensional,  1051,  1062 
slack,  956,  969 
stochastic,  1029 
uncorrelated  related,  1109 
Variable  coefficients: 

Frobenius  method,  180-187 
indicial  equation,  181-183 
typical  applications, 

183-185 

Laplace  transforms  ODEs  with, 
240-241 

power  series  method,  167-175 
idea  and  technique  of, 

168-170 

operations  on,  173-174 
theory  of,  170-174 
second-order  homogeneous  linear 
ODEs,  73 

Variance(s),  1014,  1061 
comparison  of,  1086 
control  chart  for,  1089-1090 
equality  of,  1084n.3 
of  normal  distributions, 

confidence  intervals  for, 
1073-1076 

of  probability  distributions, 
1035-1039 

addition  of,  1058-1059 
transformation  of,  1036-1037 
sample,  1015 
Variation,  random,  1063 
Variation  of  parameters,  method  of: 
higher-order  linear  ODEs,  123 
high-order  nonhomogeneous  linear 
ODEs,  118-120 

nonhomogeneous  linear  systems 
of  ODEs,  162-163 
second-order  linear  ODEs: 
homogeneous,  104 
nonhomogeneous,  99-102 
Vectors,  256,  259 

addition  and  scalar  multiplication 
of,  259-261 

calculations  with,  126-127 
definitions  and  terms,  126, 

128-129,  257,  259,  309 
eigenvalues,  129-130 
eigenvectors,  129-130 
linear  independence  and 

dependence  of,  282-283 
multiplying  matrices  by, 

263-265 

in  the  plane,  309,  355 
systems  of  ODEs  as  vector 
equations,  127-128 
in  3-space,  309 
transposition  of,  266-267 
Vector  addition,  309,  357-359 


Vector  calculus,  354,  378-380 

differential,  see  Vector  differential 
calculus 

integral,  see  Vector  integral 
calculus 

Vector  differential  calculus,  354 — 412 
curves,  381-392 

arc  length  of,  385-386 
length  of,  385 
in  mechanics,  386-389 
tangents  to,  384-385 
and  torsion,  389-390 
gradient  of  a scalar  field,  395-402 
directional  derivatives, 

396-397 

maximum  increase,  398 
as  surface  normal  vector, 
398-399 

vector  fields  that  are, 

400-401 

inner  product  (dot  product), 
361-367 

applications,  364—366 
orthogonality,  361-363 
scalar  functions,  376 
and  vector  calculus,  378-380 
vector  fields,  377-378 
curl  of,  406-409 
divergence  of,  402-406 
that  are  gradients  of  scalar 
fields,  400-401 
vector  functions,  375-376 
partial  derivatives  of,  380 
of  several  variables,  392-395 
vector  product  (cross  product), 
368-375 

applications,  371-372 
scalar  triple  product,  373-374 
vectors  in  2-space  and  3-space: 
components  of  a vector, 
356-357 

scalar  multiplication,  358-359 
vector  addition,  357-359 
Vector  fields: 
defined,  376 

vector  differential  calculus, 
377-378 

curl  of,  406^109,  412 
divergence  of,  402-406 
that  are  gradients  of  scalar 
fields,  400^101 
Vector  functions: 

continuous,  378-379 
defined,  375-376 
differentiable,  379 
divergence  theorem  of  Gauss, 
453^157 

of  several  variables,  392-395 
chain  rules,  392-394 
mean  value  theorem,  395 
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Vector  functions:  ( Cont .) 
vector  differential  calculus, 
375-376,  411 
partial  derivatives  of,  380 
of  several  variables,  392-395 
Vectors  in  2-space  and  3-space: 
components  of  a vector, 

356-357 

scalar  multiplication,  358-359 
vector  addition,  357-359 
Vector  integral  calculus,  413-471 
divergence  theorem  of  Gauss, 

" 453-463 

double  integrals,  426-432 
applications  of,  428-429 
change  of  variables  in, 

" 429-431 

evaluation  of,  by  two 

successive  integrations, 
427-428 

Green’s  theorem  in  the  plane, 
433-438 

line  integrals,  413-419 

definition  and  evaluation  of, 
414-416 

path  dependence  of,  418-426 
work  done  by  a force,  416-417 
path  dependence  of  line  integrals, 
418-426 
defined,  418 

and  integration  around  closed 
curves,  421-425 
Stokes’s  Theorem,  463-469 
surface  integrals,  443-452 
orientation  of  surfaces, 

446-447 

without  regard  to  orientation, 
448-450 

surfaces  for  surface  integrals, 
439-443 

representation  of  surfaces, 
439-441 


Vector  integral  calculus  (Cont.) 
tangent  plane  and  surface 
normal,  441^142 
triple  integrals,  452-458 
Vector  moment,  371 
Vector  norms,  866 
Vector  product  (cross  product): 
in  Cartesian  coordinates, 

A83-A84 

vector  differential  calculus, 
368-375,  410 
applications,  371-372 
scalar  triple  product,  373-374 
Vector  spaces,  482 

complex,  309-310,  349 
inner  product  spaces,  311-313 
linear  transformations,  313-317 
real,  309-311 
special,  285-287 
Velocity,  391,  411,  771 
Velocity  potential,  771 
Velocity  vector,  386,  771 
Venn,  John,  1017n.l 
Venn  diagrams,  1017 
Verhulst,  Pierre-Frangois,  32n.8 
Verhulst  equation,  32-33 
Vertices  (graphs),  971,  977,  1007 
adjacent,  971,  977 
central,  991 
coloring,  1005-1006 
double  labeling  of,  986 
eccentricity  of,  991 
exposed,  1001,  1003 
four-color  theorem,  1006 
scanning,  998 
Vertex  condition,  991 
Vertex  incidence  list  (graphs),  973 
Volta,  Alessandro,  93n.7 
Voltage  drop,  29 

Volterra,  Vito,  155n,3,  198n,7,  236n,3 
Volterra  integral  equations,  of  the 
second  kind,  236-237 


Volume,  of  a region,  428 
Vortex  (fluid  flow),  777 
Vorticity,  774 


Walk  (shortest  path  problems),  975 
Wave  equation,  544-545,  942 
d’Alembert’s  solution,  553-556 
numeric  analysis,  942-944,  948 
one-dimensional,  544—545 
solution  by  separating  variables, 
545-553 

two-dimensional,  575-584 
Weber’s  equation,  510 
Weber’s  functions,  198n,7 
Weierstrass,  Karl,  625n.4,  703n,5 
Weierstrass  approximation  theorem, 
809 

Weierstrass  AJ-test  for  uniform 
convergence,  703-704 
Weighted  graphs,  976 
Weight  function,  500 
Well-conditioned  problems,  864 
Well-conditioning  (linear  systems), 
865 

Wessel,  Caspar,  611n,2 
Work  done  by  a force,  416-417 
Work  integral,  415 
Wronski,  Josef  Maria  Hone,  76n.5 
Wronskian  (Wronski  determinant): 
second-order  homogeneous  linear 
ODEs,  75-78 
systems  of  ODEs,  139 


Zeros,  of  analytic  functions,  717 
Zero  matrix,  260 
Zero  surfaces,  598 
Zero  vector,  129,  260,  357 
z-score,  1014 
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PI 


Some  Constants 


Polar  Coordinates 


e = 2.71828  18284  59045  23536 
Ve  = 1.64872  12707  00128  14685 
e2  = 7.38905  60989  30650  22723 


x = r cos  8 


y = r sin  6 


r = Vx2  4-  y2  tan  8 = — 

x 


it  = 3.14159  26535  89793  23846 
t r2  = 9.86960  44010  89358  61883 
Vtt7  = 1.77245  38509  05516  02730 

log10  tt  = 0.49714  98726  94133  85435 
In  7T  = 1.14472  98858  49400  17414 
log10  e = 0.43429  44819  03251  82765 
In  10  = 2.30258  50929  94045  68402 


dx  dy  = r dr  d8 


Series 


= 2 x 
m= 0 


(M  < i) 


V2  = 1.41421  35623  73095  04880 
^2  = 1.25992  10498  94873  16477 
V3  = 1.73205  08075  68877  29353 
■^3  = 1.44224  95703  07408  38232 
In  2 = 0.69314  71805  59945  30942 
In  3 = 1.09861  22886  68109  69140 

y = 0.57721  56649  01532  86061 

In  y = -0.54953  93129  81644  82234 
(see  Sec.  5.6) 


= 2 

m— 0 


m\ 


OO 

sin  x = 

m= 0 


(-l)mx2m+1 
(2m  + 1)! 


oo 

COS  X = 

m= 0 


(_j)mx  2m 

(2m)  \ 


oo 

In  (1  - x)  = - 2 (|x|  < 1) 

, m 


1°  = 0.01745  32925  19943  29577  rad 
1 rad  = 57.29577  95130  82320  87680° 

= 57°17'44.806" 


arctanx  = 


Greek  Alphabet 


2 

m= O 


(-l)mx2m+1 
2m  + 1 


Vectors 


(H  < i) 


a 

Alpha 

v 

Beta 

y.r 

Gamma 

S,  A 

Delta 

6,  £ 

Epsilon 

£ 

Zeta 

V 

Eta 

i i?,  e 

Theta 

i 

Iota 

K 

Kappa 

A,  A 

Lambda 

Mu 

V 

Nu 

a*b  = a1b1  + a2b2  + a3b3 

£ 

Xi 

i 

j k 

o 

Omicron 

a x b = 

a2  a3 

7 T 

Pi 

*1 

b2  b3 

P 

Rho 

<j,  s 

Sigma 

grad  f = V/  = 

3/  df 

— i + — j + 

dy 

df 

— k 

dz 

T 

Tau 

v,  Y 

Upsilon 

• 

> 

II 

>> 

> 

du,  dllo 

= — - H + 

dV3 

dx  dy 

dz 

>,  (p,  <h 

Phi 

X 

Chi 

i j 

k 

curl  v = V x v 

d d 

d 

ifi,  ^ 

Psi 

— 

dx  dy 

dz 

CO,  fl 

Omega 

V1  V2 

v3 

